Saves statistics and BY variables in an output data set.
OUTPUT <OUT=SAS-data-set> statistic-keyword-1=name(s)
<...statistic-keyword-n=name(s)> <percentiles-specification>;
- OUT=SAS-data-set
- identifies the output data set. If SAS-data-set does not exist, PROC UNIVARIATE creates it. If you omit OUT=, the data set is named DATAn, where n is the smallest integer that makes the name unique.
- statistic-keyword=name(s)
- specifies a statistic to store in the OUT= data set and names the new variable that will contain the statistic. The available statistical keywords are
Descriptive statistic keywords
| CSS | CV | KURTOSIS |
| MAX | MEAN | N |
| MIN | MODE | RANGE |
| NMISS | NOBS | STDMEAN |
| SKEWNESS | STD | USS |
| SUM | SUMWGT | VAR |
Quantile statistic keywords
| MEDIAN | P1 | P5 |
| P10 | P90 | P95 |
| P99 | Q1 | Q3 |
| QRANGE | | |
Robust statistic keywords
| GINI | MAD | QN |
| SN | STD_GINI | STD_MAD |
| STD_QN | STD_QRANGE | STD_SN |
Hypothesis testing keywords
| NORMAL | PROBN | MSIGN |
| PROBM | SIGNRANK | PROBS |
| T | PROBT | |
See SAS Elementary Statistics Procedures and Statistical Computations for the keyword definitions and statistical formulas. To store the same statistic for several analysis variables, specify a list of names. The order of the names corresponds to the order of the analysis variables in the VAR statement. PROC UNIVARIATE uses the first name to create a variable that contains the statistic for the first analysis variable, the next name to create a variable that contains the statistic for the second analysis variable, and so on. If you do not want to output statistics for all the analysis variables, specify fewer names than the number of analysis variables.
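As a minimal sketch of the name-list rule, the following step assumes a hypothetical data set named Score with analysis variables Test1, Test2, and Test3. The data values and the output variable names are illustrative only. MEAN= lists three names, one for each analysis variable. STD= lists only two, so no standard deviation is stored for Test3.

```sas
/* Hypothetical sample data: names and values are assumptions for illustration */
data score;
   input Test1 Test2 Test3;
   datalines;
78 85 90
82 79 88
91 94 76
67 72 81
;
run;

proc univariate data=score noprint;
   var Test1 Test2 Test3;
   /* MEAN= names map to Test1, Test2, Test3 in VAR statement order. */
   /* STD= supplies two names, so only Test1 and Test2 are covered.  */
   output out=stats mean=Mean1 Mean2 Mean3
                    std=Std1 Std2;
run;

proc print data=stats;
run;
```

Under these assumptions, STATS should contain a single observation with the variables Mean1, Mean2, Mean3, Std1, and Std2.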
- percentiles-specification
- specifies one or more percentiles to store in the OUT= data set and names the new variables that contain the percentiles. The form of percentiles-specification is
PCTLPTS=percentile(s) PCTLPRE=prefix-name(s) <PCTLNAME=suffix-name(s)>
- PCTLPTS=percentile(s)
- specifies one or more percentiles to compute. You can specify percentiles with the expression start TO stop BY increment where start is a starting number, stop is an ending number, and increment is a number to increment by.
Range: any decimal numbers between 0 and 100, inclusive
Example: To compute the 50th, 95th, 97.5th, and 100th percentiles, submit the statement
output pctlpre=P_ pctlpts=50,95 to 100 by 2.5;
- PCTLPRE=prefix-name(s)
- specifies one or more prefixes to create the variable names for the variables that contain the PCTLPTS= percentiles. To save the same percentiles for more than one analysis variable, specify a list of prefixes. The order of the prefixes corresponds to the order of the analysis variables in the VAR statement.
Interaction: PROC UNIVARIATE creates a variable name by combining the PCTLPRE= value and either suffix-name or (if you omit PCTLNAME= or if you specify too few suffix-name(s)) the PCTLPTS= value.
- PCTLNAME=suffix-name(s)
- specifies one or more suffixes to create the names for the variables that contain the PCTLPTS= percentiles. PROC UNIVARIATE creates a variable name by combining the PCTLPRE= value and suffix-name. Because the suffix names are associated with the percentiles that are requested, list the suffix names in the same order as the PCTLPTS= percentiles.
Requirement: You must specify PCTLPRE= to supply prefix names for the variables that contain the PCTLPTS= percentiles.
Interaction: If the number of PCTLNAME= values is fewer than the number of percentile(s) or if you omit PCTLNAME=, PROC UNIVARIATE uses percentile as the suffix to create the name of the variable that contains the percentile. For an integer percentile, PROC UNIVARIATE uses percentile. For a noninteger percentile, PROC UNIVARIATE truncates decimal values of percentile to two decimal places and replaces the decimal point with an underscore.
Interaction: If either the prefix and suffix name combination or the prefix and percentile name combination is longer than 32 characters, PROC UNIVARIATE truncates the prefix name so that the variable name is 32 characters.
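The default suffix rule can be sketched as follows. The data set Score, the variable Test1, and the data values are hypothetical and chosen only for illustration. PCTLNAME= is omitted, so each variable name is built from the PCTLPRE= prefix and the percentile value.

```sas
/* Hypothetical sample data: names and values are assumptions for illustration */
data score;
   input Test1 @@;
   datalines;
78 82 91 67 85 79 94 72 90 88
;
run;

proc univariate data=score noprint;
   var Test1;
   /* No PCTLNAME=, so each suffix is derived from the percentile value */
   output out=pctls pctlpre=P_ pctlpts=50, 95 to 100 by 2.5;
run;

/* List only the variable names in the output data set */
proc contents data=pctls short;
run;
```

Following the naming rules above, the variables in PCTLS should be P_50, P_95, P_97_5, and P_100: the integer percentiles are used as is, and the decimal point in 97.5 is replaced with an underscore.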
You can use PCTLPTS= to output percentiles that are not in the list of quantile statistics. PROC UNIVARIATE computes the requested percentiles based on the method that you specify with the PCTLDEF= option in the PROC UNIVARIATE statement. You must use PCTLPRE=, and optionally PCTLNAME=, to specify variable names for the percentiles. For example, the following statements create an output data set named PCTLS that contains the 20th and 40th percentiles of the analysis variables Test1 and Test2:
proc univariate data=score;
   var Test1 Test2;
   output out=pctls pctlpts=20 40 pctlpre=Test1_ Test2_
          pctlname=P20 P40;
run;
PROC UNIVARIATE saves the 20th and 40th percentiles for Test1 and Test2 in the variables Test1_P20, Test2_P20, Test1_P40, and Test2_P40.
When you use a BY statement, the number of observations in the OUT= data set corresponds to the number of BY groups. Otherwise, the OUT= data set contains only one observation.
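To illustrate the BY-group behavior, the following sketch assumes a hypothetical data set Score with a grouping variable Section that has two values. All names and data values are assumptions made for this example.

```sas
/* Hypothetical sample data: names and values are assumptions for illustration */
data score;
   input Section $ Test1;
   datalines;
A 78
A 82
A 91
B 67
B 85
B 79
;
run;

/* BY-group processing expects the input to be sorted by the BY variable */
proc sort data=score;
   by Section;
run;

proc univariate data=score noprint;
   by Section;
   var Test1;
   /* One observation is written to BYSTATS for each value of Section */
   output out=bystats n=N1 median=Median1;
run;

proc print data=bystats;
run;
```

With two BY groups, BYSTATS should contain two observations, each holding the BY variable Section together with N1 and Median1. Removing the BY statement from the same step would leave a single observation.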