The HPCDM Procedure (Experimental)

OUTSUM Statement

OUTSUM OUT=SAS-data-set statistic-keyword<=variable-name> <... statistic-keyword<=variable-name>> <outsum-options>;

The OUTSUM statement enables you to specify the data set in which PROC HPCDM writes the summary statistics of the compound distribution samples.

If you specify more than one OUTSUM statement, only the first one is used.

You must specify the output data set by using the following option:

OUT=SAS-data-set
OUTSUM=SAS-data-set

specifies the output data set that contains the summary statistics of each of the simulated compound distribution samples. You can control the summary statistics that appear in this data set by specifying different statistic-keywords and outsum-options.

If you execute the HPCDM procedure in distributed mode, only the client-data (local-data) and through-the-client data access modes are supported for this data set. In other words, the libref that you specify for this data set should not point to a distributed database appliance. For more information about data access modes, see the section Data Access Modes of Chapter 3: Shared Concepts and Topics.

You can request that one or more predefined statistics of the compound distribution sample be written to the OUTSUM= data set. For each specification of the form statistic-keyword<=variable-name>, the statistic that is specified by the statistic-keyword is written to a variable named variable-name. If you do not specify the variable-name, then the statistic is written to a variable named statistic-keyword. You can specify the following statistic-keywords:

KURTOSIS
KURT

specifies the kurtosis of the compound distribution sample.

MEAN

specifies the mean of the compound distribution sample.

MEDIAN
Q2
P50

specifies the median (the 50th percentile) of the compound distribution sample.

P01

specifies the 1st percentile of the compound distribution sample.

P05

specifies the 5th percentile of the compound distribution sample.

P95

specifies the 95th percentile of the compound distribution sample.

P99

specifies the 99th percentile of the compound distribution sample.

P99_5
P995

specifies the 99.5th percentile of the compound distribution sample.

Q1
P25

specifies the lower or 1st quartile (the 25th percentile) of the compound distribution sample.

Q3
P75

specifies the upper or 3rd quartile (the 75th percentile) of the compound distribution sample.

QRANGE

specifies the interquartile range (Q3–Q1) of the compound distribution sample.

SKEWNESS
SKEW

specifies the skewness of the compound distribution sample.

STDDEV
STD

specifies the standard deviation of the compound distribution sample.

All percentiles are computed by using the method that you specify for the PCTLDEF= option in the PROC HPCDM statement. You can also request additional percentiles to be reported in the OUTSUM= data set by specifying the following outsum-options:

PCTLPTS=percentile-list

specifies one or more percentiles that you want to be computed and written to the OUTSUM= data set. This option is useful if you need to request percentiles that are not available in the preceding list of statistic-keyword values. Each percentile value must belong to the (0,100) open interval. The percentile-list is a comma-separated list of numbers. You can also use a list notation of the form "<number1> to <number2> by <increment>". For example, the following two options are equivalent:

pctlpts=10, 20, 99.6, 99.7, 99.8, 99.9
pctlpts=10, 20, 99.6 to 99.9 by 0.1
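The equivalence of the two lists above can be checked with a short sketch. This is only a Python model of the documented list notation, not the parser that PROC HPCDM uses; the function name `expand_pctlpts` is hypothetical, and decimal arithmetic is an assumption chosen to avoid floating-point drift when stepping by 0.1.

```python
from decimal import Decimal


def expand_pctlpts(spec: str) -> list[Decimal]:
    """Expand a PCTLPTS=-style list such as '10, 20, 99.6 to 99.9 by 0.1'.

    Hypothetical helper: it models the notation described in the text only.
    """
    values: list[Decimal] = []
    for item in spec.split(","):
        tokens = item.split()
        if len(tokens) == 1:
            # A single number.
            values.append(Decimal(tokens[0]))
        elif (len(tokens) == 5 and tokens[1].lower() == "to"
              and tokens[3].lower() == "by"):
            # "<number1> to <number2> by <increment>"
            start, stop, step = (Decimal(tokens[i]) for i in (0, 2, 4))
            if step <= 0:
                raise ValueError(f"increment must be positive: {item!r}")
            current = start
            while current <= stop:
                values.append(current)
                current += step
        else:
            raise ValueError(f"unrecognized list item: {item!r}")
    # Each percentile must lie in the open interval (0, 100).
    for value in values:
        if not (0 < value < 100):
            raise ValueError(f"percentile out of range: {value}")
    return values


explicit = expand_pctlpts("10, 20, 99.6, 99.7, 99.8, 99.9")
shorthand = expand_pctlpts("10, 20, 99.6 to 99.9 by 0.1")
```

Both calls produce the same six values, which mirrors the equivalence stated above.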

The name of the variable for a given percentile value is decided by the PCTLNAME= option.

PCTLNAME=percentile-variable-name-list

specifies the names of the variables that contain the estimates of the percentiles that you request by using the PCTLPTS= option.

If you do not specify the PCTLNAME= option, then each percentile value t in the list of values in the PCTLPTS= option is written to a variable named Pt, where the decimal point in t, if any, is replaced by an underscore.

The percentile-variable-name-list is a space-separated list of names. You can also use a shortcut notation of <prefix>m-<prefix>n for two integers m and n ($m < n$) to generate the following list of names: <prefix>m, <prefix>(m+1), ..., and <prefix>n. For example, the following two options are equivalent:

pctlname=p1 p2 pc5 pc6 pc7 pc8 pc9 pc10
pctlname=p1 p2 pc5-pc10
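The shortcut notation above can be modelled as follows. This is a minimal sketch of the expansion rule as the text describes it, not the procedure's own implementation; the name `expand_pctlname` and the regular expression are assumptions introduced for illustration.

```python
import re

# Matches "<prefix>m-<prefix>n", where both halves share the same prefix.
_RANGE = re.compile(r"^(?P<prefix>.*?)(?P<m>\d+)-(?P=prefix)(?P<n>\d+)$")


def expand_pctlname(spec: str) -> list[str]:
    """Expand a space-separated PCTLNAME=-style list (hypothetical helper)."""
    names: list[str] = []
    for token in spec.split():
        match = _RANGE.match(token)
        if match is None:
            # A plain name is kept as-is.
            names.append(token)
            continue
        m, n = int(match["m"]), int(match["n"])
        if m >= n:
            raise ValueError(f"range requires m < n: {token!r}")
        prefix = match["prefix"]
        names.extend(f"{prefix}{i}" for i in range(m, n + 1))
    return names


expanded = expand_pctlname("p1 p2 pc5-pc10")
```

The expanded list has eight names, matching the longhand form of the option shown above.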

The name in the jth position of the expanded name list of the PCTLNAME= option is used to create a variable for the percentile value in the jth position of the expanded value list of the PCTLPTS= option. If you specify $k_n$ names in the PCTLNAME= option and $k_v$ percentile values in the PCTLPTS= option, and if $k_n < k_v$, then the first $k_n$ percentiles are written to the variables that you specify. The remaining $k_v - k_n$ percentiles are written to variables that have names of the form Pt, where t is the text representation of the percentile value that is formed by retaining at most PCTLNDEC= digits after the decimal point and replacing the decimal point with an underscore ('_'). For example, assume that you specify the following options:

pctlpts=10, 20, 99.3 to 99.5 by 0.1, 99.995 
pctlname=pten ptwenty ninenine3-ninenine5

Then PROC HPCDM writes the 10th and 20th percentiles to the pten and ptwenty variables, respectively; the 99.3rd through 99.5th percentiles to the ninenine3, ninenine4, and ninenine5 variables, respectively; and the remaining 99.995th percentile to the P99_995 variable.
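The positional pairing in this example can be traced with a small sketch. It assumes both lists have already been expanded, and it deliberately ignores PCTLNDEC= truncation because no value here has more than three decimal places. The function name `assign_names` is hypothetical.

```python
def assign_names(values: list[str], names: list[str]) -> dict[str, str]:
    """Pair the jth percentile value with the jth name.

    Values without a matching name fall back to the documented default of
    "P" followed by the value text, with "." replaced by "_".
    """
    assigned: dict[str, str] = {}
    for j, value in enumerate(values):
        if j < len(names):
            assigned[value] = names[j]
        else:
            assigned[value] = "P" + value.replace(".", "_")
    return assigned


# Expanded forms of the PCTLPTS= and PCTLNAME= options from the example.
values = ["10", "20", "99.3", "99.4", "99.5", "99.995"]
names = ["pten", "ptwenty", "ninenine3", "ninenine4", "ninenine5"]
result = assign_names(values, names)
```

Here six values meet five names, so only the last value, 99.995, receives a default name.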

If a percentile value in the PCTLPTS= option matches a percentile value implied by one of the predefined percentile statistics and you specify the corresponding statistic-keyword, then the variable name that is implied by the statistic-keyword<=variable-name> specification takes precedence over the name that you specify in the PCTLNAME= option. For example, assume that you specify the predefined percentile statistic P95 in the following OUTSUM statement:

outsum out=mypctls p95=ninetyfifth
        pctlpts=95 to 99 by 1 pctlname=pct95-pct99;

Then the 95th percentile is written to the ninetyfifth variable instead of the pct95 variable that the PCTLNAME= option implies.
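The precedence rule can be sketched as a lookup that consults the statistic-keyword names before the PCTLNAME= names. This is an illustrative model only: `resolve_names` and the `PREDEFINED_PERCENTILES` table are hypothetical, the table simply restates the percentile keywords listed earlier in this section, and the sketch assumes the two PCTL lists have equal length.

```python
from decimal import Decimal

# Percentile implied by each predefined keyword (from the keyword list above).
PREDEFINED_PERCENTILES = {
    "P01": Decimal("1"), "P05": Decimal("5"),
    "Q1": Decimal("25"), "P25": Decimal("25"),
    "MEDIAN": Decimal("50"), "Q2": Decimal("50"), "P50": Decimal("50"),
    "Q3": Decimal("75"), "P75": Decimal("75"),
    "P95": Decimal("95"), "P99": Decimal("99"),
    "P99_5": Decimal("99.5"), "P995": Decimal("99.5"),
}


def resolve_names(keyword_specs, pctlpts, pctlname):
    """Return {percentile: variable name}, letting keyword names win.

    keyword_specs maps a statistic-keyword to its variable-name, or to None
    when no variable-name was given (the keyword itself is then the name).
    """
    keyword_names = {}
    for keyword, variable in keyword_specs.items():
        percentile = PREDEFINED_PERCENTILES.get(keyword.upper())
        if percentile is not None:
            keyword_names[percentile] = variable or keyword
    return {
        value: keyword_names.get(value, name)
        for value, name in zip(pctlpts, pctlname)
    }


# outsum out=mypctls p95=ninetyfifth pctlpts=95 to 99 by 1 pctlname=pct95-pct99;
pctlpts = [Decimal(p) for p in range(95, 100)]
pctlname = [f"pct{p}" for p in range(95, 100)]
resolved = resolve_names({"p95": "ninetyfifth"}, pctlpts, pctlname)
```

Only the 95th percentile is renamed; the 96th through 99th percentiles keep the names from the PCTLNAME= list.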

PCTLNDEC=integer-value

specifies the maximum number of decimal places to use while creating the names of the variables for the percentile values in the PCTLPTS= option. The default value is 3. For example, for a percentile value of 99.9995, PROC HPCDM creates a variable named P99_999 by default, but if you specify PCTLNDEC=4, then the variable is named P99_9995.

The PCTLNDEC= option is used only for percentile values for which you do not specify a name in the PCTLNAME= option.
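The default naming rule can be reproduced with a short sketch. It assumes that extra digits are truncated rather than rounded, which is inferred from the 99.9995 example above (rounding to three places would not yield P99_999). The function name `default_pctl_name` is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def default_pctl_name(value: str, pctlndec: int = 3) -> str:
    """Build a default variable name for a percentile value (sketch only).

    Assumption: digits beyond `pctlndec` decimal places are truncated.
    """
    quantum = Decimal(10) ** -pctlndec          # e.g. 0.001 for pctlndec=3
    truncated = Decimal(value).quantize(quantum, rounding=ROUND_DOWN)
    # normalize() drops trailing zeros; "f" avoids exponent notation.
    text = format(truncated.normalize(), "f")
    return "P" + text.replace(".", "_")


name_default = default_pctl_name("99.9995")
name_four = default_pctl_name("99.9995", pctlndec=4)
```

Under these assumptions, `name_default` is `P99_999` and `name_four` is `P99_9995`, which agrees with the names given in the PCTLNDEC= description.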

Note that all variable names in the OUTSUM= data set have a limit of 32 characters. If a name exceeds that limit, then it is truncated to contain only the first 32 characters. For more information about the variables in the OUTSUM= data set, see the section Output Data Sets.