Displayed Output :: SAS/STAT(R) 14.1 User's Guide

Data and Sample Design Summary

The "Data Summary" table provides information about the input data set and the sample design. This table displays the total number of valid observations, where an observation is considered valid if it has nonmissing values for all procedure variables other than the analysis variables (that is, for all specified STRATA, CLUSTER, DOMAIN, POSTSTRATA, and WEIGHT variables). This number might differ from the number of nonmissing observations for an individual analysis variable, which the procedure displays in the "Statistics" table. See the section "Missing Values" for more information.

PROC SURVEYMEANS displays the following information in the "Data Summary" table:

Number of Strata, if you specify a STRATA statement
Number of Poststrata, if you specify a POSTSTRATA statement
Number of Clusters, if you specify a CLUSTER statement
Number of Observations, which is the total number of valid observations
Sum of Weights, which is the sum of the weights over all valid observations, if you specify a WEIGHT statement
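
As a rough illustration, a call along the following lines would produce a "Data Summary" table with every row listed above except Number of Poststrata, which requires a POSTSTRATA statement. The data set name MySample and the variables Region, School, SampWt, and Income are hypothetical placeholders, not names taken from this guide.

```sas
/* Sketch only: MySample, Region, School, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample;
   strata Region;      /* adds the Number of Strata row   */
   cluster School;     /* adds the Number of Clusters row */
   weight SampWt;      /* adds the Sum of Weights row     */
   var Income;         /* analysis variable               */
run;
```

An observation with a missing Region, School, or SampWt value would not count toward Number of Observations, whereas an observation that is missing only Income would still count.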

Class Level Information

If you use a CLASS statement to name classification variables for categorical analysis, or if you list any character variables in the VAR statement, then PROC SURVEYMEANS displays a "Class Level Information" table. This table contains the following information for each classification variable:

CLASS Variable, which lists each CLASS variable name
Levels, which is the number of values or levels of the classification variable
Values, which lists the values of the classification variable. The values are separated by a white space character; therefore, to avoid confusion, you should not include a white space character within a classification variable value.
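
A minimal sketch of a call that would trigger this table is shown below. It assumes a hypothetical data set MySample with a numeric variable Grade that should be analyzed as categorical, and a weight variable SampWt. None of these names come from this guide.

```sas
/* Sketch only: MySample, Grade, Income, and SampWt are hypothetical names */
proc surveymeans data=MySample;
   class Grade;          /* treat the numeric variable Grade as categorical */
   weight SampWt;
   var Grade Income;     /* Grade is reported by level, Income as a mean    */
run;
```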

Stratum Information

If you specify the LIST option in the STRATA statement, PROC SURVEYMEANS displays a "Stratum Information" table. This table displays the number of valid observations in each stratum, as well as the number of nonmissing stratum observations for each analysis variable. The "Stratum Information" table provides the following for each stratum:

Stratum Index, which is a sequential stratum identification number
STRATA variable(s), which lists the levels of STRATA variables for the stratum
Population Total, if you specify the TOTAL= option
Sampling Rate, if you specify the TOTAL= or RATE= option. If you specify the TOTAL= option, the sampling rate is based on the number of valid observations in the stratum.
N Obs, which is the number of valid observations
Variable, which lists each analysis variable name
Levels, which identifies each level for categorical variables
N, which is the number of nonmissing observations for the analysis variable
Clusters, which is the number of clusters, if you specify a CLUSTER statement
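
The following sketch shows one way to request this table together with the Population Total and Sampling Rate columns. It assumes a hypothetical data set StrataTotals that contains the STRATA variable Region and a variable named _TOTAL_ that holds each stratum's population total. MySample, Region, and Income are likewise placeholders.

```sas
/* Sketch only: MySample, StrataTotals, Region, and Income are hypothetical names */
proc surveymeans data=MySample total=StrataTotals;
   strata Region / list;   /* LIST requests the "Stratum Information" table */
   var Income;
run;
```

With TOTAL= specified, the sampling rate shown for a stratum would be its number of valid observations divided by its population total.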

Variance Estimation

By default, if the variance estimation method is not Taylor series or if you specify the NOMCAR option, PROC SURVEYMEANS displays the following variance estimation specifications in the "Variance Estimation" table:

Method, which is the variance estimation method
Replicate Weights Data Set, which is the name of the SAS data set that contains the replicate weights
Number of Replicates, which is the number of replicates if you specify the VARMETHOD=BRR or VARMETHOD=JACKKNIFE option
Hadamard Data Set, which is the name of the SAS data set for the Hadamard matrix if you specify the VARMETHOD=BRR(HADAMARD=) method-option
Fay Coefficient, which is the value of the Fay coefficient if you specify the VARMETHOD=BRR(FAY) method-option
Missing Levels Included (MISSING), if you specify the MISSING option
Missing Levels Included (NOMCAR), if you specify the NOMCAR option
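
For example, a replication-based request such as the following sketch would cause the "Variance Estimation" table to appear with Method, Number of Replicates, and Fay Coefficient rows. The data set and variable names are hypothetical, and the sketch assumes a design with two primary sampling units in each stratum, which BRR requires.

```sas
/* Sketch only: MySample, Region, School, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample varmethod=brr(fay=0.5);
   strata Region;
   cluster School;
   weight SampWt;
   var Income;
run;
```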

Statistics

The "Statistics" table displays all of the statistics that you request with statistic-keywords in the PROC SURVEYMEANS statement, except DECILES, MEDIAN, Q1, Q3, and QUARTILES, which are displayed in the "Quantiles" table. If you do not specify any statistic-keywords, then by default this table displays the following information for each analysis variable: the sample size, the mean, the standard error of the mean, and the confidence limits for the mean. The "Statistics" table can contain the following information for each analysis variable, depending on which statistic-keywords you request:

Variable name
Variable Label
Level, which identifies each level for categorical variables
N, which is the number of nonmissing observations
N Miss, which is the number of missing observations
Minimum
Maximum
Range
Number of Clusters
Sum of Weights
DF, which is the degrees of freedom for the t test
Mean
Std Error of Mean, which is the standard error of the mean
Var of Mean, which is the variance of the mean
t Value, for testing $H_0: \mbox{population MEAN} = 0$
Pr $> |t|$, which is the two-sided p-value for the t test
$100(1-\alpha)\%$ CL for Mean, which are two-sided confidence limits for the mean
$100(1-\alpha)\%$ Upper CL for Mean, which is a one-sided upper confidence limit for the mean
$100(1-\alpha)\%$ Lower CL for Mean, which is a one-sided lower confidence limit for the mean
Coeff of Variation, which is the coefficient of variation for the mean
Sum
Std Error of Sum, which is the standard error of the sum
Var of Sum, which is the variance of the sum
$100(1-\alpha)\%$ CL for Sum, which are two-sided confidence limits for the sum
$100(1-\alpha)\%$ Upper CL for Sum, which is a one-sided upper confidence limit for the sum
$100(1-\alpha)\%$ Lower CL for Sum, which is a one-sided lower confidence limit for the sum
Coeff of Variation for sum, which is the coefficient of variation for the sum
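
As a sketch of how statistic-keywords select columns, the following call requests a subset of the statistics listed above with 90% confidence limits. The data set and variable names are hypothetical.

```sas
/* Sketch only: MySample, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample alpha=0.10
                 nobs mean stderr clm cv sum std clsum;
   weight SampWt;
   var Income;
run;
```

Here NOBS, MEAN, STDERR, CLM, and CV request the mean-related columns, and SUM, STD, and CLSUM request the total-related columns. ALPHA=0.10 sets the confidence level to 90%.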

Quantiles

The "Quantiles" table displays all the quantiles that you request with statistic-keywords such as DECILES, MEDIAN, Q1, Q3, and QUARTILES, with the PERCENTILE= option, or with the QUANTILE= option in the PROC SURVEYMEANS statement.

The "Quantiles" table contains the following information for each quantile:

Variable name
Variable Label
Percentile, which is the requested quantile expressed as a percentage
Percentile Label, which is the corresponding common name for a percentile if one exists (for example, Median for the 50th percentile)
Estimate, which is the estimate for a requested quantile with respect to the population distribution
Std Error, which is the standard error of the quantile
$100(1-\alpha)\%$ Confidence Limits, which are two-sided confidence limits for the quantile
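
The following sketch combines quantile keywords with the PERCENTILE= option. It assumes a hypothetical data set MySample with a weight variable SampWt and a numeric analysis variable Income.

```sas
/* Sketch only: MySample, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample median q1 q3 percentile=(10 90);
   weight SampWt;
   var Income;
run;
```

This request would yield five rows per analysis variable in the "Quantiles" table: the 10th, 25th, 50th, 75th, and 90th percentiles.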

Domain Analysis

If you specify a DOMAIN statement, the procedure displays domain statistics in a "Domain Analysis" table. A "Domain Analysis" table displays all the requested statistics for each level of the domain request. The procedure produces a separate "Domain Analysis" table for each separate domain request. For example, the DOMAIN statement

domain A B*C*D A*C C;

specifies four domain requests:

A: all the levels of A
C: all the levels of C
A*C: all combinations of the levels of A and C
B*C*D: all combinations of the levels of B, C, and D

The procedure displays four "Domain Analysis" tables, one for each domain definition. Each table contains all the columns in the "Statistics" table plus columns of domain variable values. If you use an ODS OUTPUT statement to create an output data set for domain analysis, the output data set also contains a variable Domain whose values are these domain definitions.
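
A sketch of capturing these tables in a data set follows. It assumes that the ODS table name for "Domain Analysis" is Domain, and it uses hypothetical names for the input data set (MySample), the output data set (DomainStats), and the weight and analysis variables. The domain variables A, B, C, and D are those of the example above.

```sas
/* Sketch only: MySample, DomainStats, SampWt, and Income are hypothetical names */
ods output Domain=DomainStats;   /* collect all "Domain Analysis" tables */
proc surveymeans data=MySample mean stderr;
   weight SampWt;
   var Income;
   domain A B*C*D A*C C;         /* four domain requests */
run;
```

Rows from all four requests would be stacked in DomainStats, and the Domain variable could then be used to separate them.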

Domain Quantiles

If you specify a DOMAIN statement, and if you request quantiles by specifying statistic-keywords such as DECILES, MEDIAN, Q1, Q3, and QUARTILES, the PERCENTILE= option, or the QUANTILE= option in the PROC SURVEYMEANS statement, then the procedure displays domain quantiles in a "Domain Quantiles" table. This table displays all the quantile statistics for each level of the domain request. It contains all the columns in the "Quantiles" table plus columns of DOMAIN variable values.

Ratio Analysis

The "Ratio Analysis" table displays statistics for all the ratios that you request in the RATIO statement. If you do not specify any statistic-keywords in the PROC SURVEYMEANS statement, then by default this table displays the ratios and standard errors. The "Ratio Analysis" table can contain the following information for each ratio, depending on which statistic-keywords you request:

Numerator, which identifies the numerator variable of the ratio
Denominator, which identifies the denominator variable of the ratio
N, which is the number of observations used in the ratio analysis
Number of Clusters
Sum of Weights
DF, which is the degrees of freedom for the t test
Ratio
Std Err of Ratio, which is the standard error of the ratio
Var, which is the variance of the ratio
t Value, for testing $H_0: \mbox{population RATIO} = 0$
Pr $> |t|$, which is the two-sided p-value for the t test
$100(1-\alpha)\%$ CL for Ratio, which are two-sided confidence limits for the ratio
Upper $100(1-\alpha)\%$ CL for Ratio, which is a one-sided upper confidence limit for the ratio
Lower $100(1-\alpha)\%$ CL for Ratio, which is a one-sided lower confidence limit for the ratio

When you use the ODS OUTPUT statement to create an output data set, if you use labels for your RATIO statement, these labels are saved in the variable Ratio Statement in the output data set.
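
The following sketch shows a labeled ratio request whose results are saved with ODS OUTPUT. It assumes that the ODS table name for "Ratio Analysis" is Ratio. The data set names, the variables Expenses, Income, and SampWt, and the label text are all hypothetical.

```sas
/* Sketch only: MySample, RatioStats, SampWt, Expenses, and Income are hypothetical names */
ods output Ratio=RatioStats;
proc surveymeans data=MySample;
   weight SampWt;
   var Expenses Income;
   ratio 'Expense share' Expenses / Income;   /* numerator / denominator */
run;
```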

Domain Ratio Analysis

If you specify a DOMAIN statement with a RATIO statement, the procedure displays domain ratios in a "Domain Ratio Analysis" table. A "Domain Ratio Analysis" table displays all the ratio statistics for each level of the domain request. It contains all the columns in the "Ratio Analysis" table plus columns of domain variable values.

Hadamard Matrix

If you specify the VARMETHOD=BRR(PRINTH) method-option in the PROC SURVEYMEANS statement, PROC SURVEYMEANS displays the Hadamard matrix that is used to construct replicates for BRR variance estimation.

If you provide a Hadamard matrix with the VARMETHOD=BRR(HADAMARD=) method-option but the procedure does not use the entire matrix, the procedure displays only the rows and columns that are actually used to construct replicates.
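
As an illustration, the sketch below supplies a Hadamard matrix and asks the procedure to print it. MyHadamard is a hypothetical data set that is assumed to hold a valid Hadamard matrix whose entries are 1 and -1. The other names are also placeholders, and the design is assumed to have two primary sampling units in each stratum.

```sas
/* Sketch only: MySample, MyHadamard, Region, School, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample varmethod=brr(hadamard=MyHadamard printh);
   strata Region;
   cluster School;
   weight SampWt;
   var Income;
run;
```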

Geometric Means

The "Geometric Means" table displays all the statistics related to the geometric mean that you request with statistic-keywords in the PROC SURVEYMEANS statement. The "Geometric Means" table can contain the following information for each analysis variable, depending on which statistic-keywords you request:

Variable Name
Variable Label
Geometric Mean
Std Error of Geometric Mean
$100(1-\alpha)\%$ CL for Geometric Mean, which are two-sided confidence limits for the geometric mean
$100(1-\alpha)\%$ Lower CL for Geometric Mean, which is a one-sided lower confidence limit for the geometric mean
$100(1-\alpha)\%$ Upper CL for Geometric Mean, which is a one-sided upper confidence limit for the geometric mean
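
A minimal sketch of a geometric mean request follows. It assumes a hypothetical data set MySample in which the analysis variable Income takes only positive values, and a weight variable SampWt.

```sas
/* Sketch only: MySample, SampWt, and Income are hypothetical names */
proc surveymeans data=MySample geomean gmstderr gmclm;
   weight SampWt;
   var Income;     /* assumed to be strictly positive */
run;
```

GEOMEAN, GMSTDERR, and GMCLM request the geometric mean, its standard error, and its two-sided confidence limits, respectively.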

Domain Geometric Means

If you specify a DOMAIN statement and request any statistics related to the geometric mean with statistic-keywords in the PROC SURVEYMEANS statement, the procedure displays these statistics for each domain level in a "Domain Geometric Means" table. It contains all the columns in the "Geometric Means" table plus columns of domain variable values.
