# The SURVEYFREQ Procedure

### Displayed Output

#### Data Summary Table

The "Data Summary" table provides information about the input data set and the sample design. PROC SURVEYFREQ displays this table unless you specify the NOSUMMARY option in the PROC SURVEYFREQ statement.

The "Data Summary" table displays the total number of valid observations. To be considered valid, an observation must have a nonmissing, positive sampling weight value if you specify a WEIGHT statement. If you do not specify the MISSING option, a valid observation must also have nonmissing values for all STRATA and CLUSTER variables. The number of valid observations can differ from the number of nonmissing observations for an individual table request, which the procedure displays in the frequency or crosstabulation tables. For more information, see the section Missing Values.

PROC SURVEYFREQ displays the following information in the "Data Summary" table:

• Number of Strata, if you specify a STRATA statement

• Number of Clusters, if you specify a CLUSTER statement

• Number of Observations, which is the total number of valid observations

• Sum of Weights, which is the sum over all valid observations, if you specify a WEIGHT or REPWEIGHTS statement
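
As a minimal sketch of a request that fills every row of the "Data Summary" table, the following step supplies STRATA, CLUSTER, and WEIGHT statements. The data set `work.survey` and its variables (`Region`, `School`, `SamplingWeight`, `Response`) are hypothetical and are generated only for illustration.

```sas
/* Hypothetical design: 3 strata (Region), 2 clusters (School) per stratum */
data work.survey;
   length Response $3;
   call streaminit(2024);
   do Region = 1 to 3;
      do School = 1 to 2;
         do Student = 1 to 10;
            SamplingWeight = 20 + 5*Region;   /* made-up weights */
            Response = ifc(rand('uniform') < 0.4, 'Yes', 'No');
            output;
         end;
      end;
   end;
run;

/* The Data Summary table is displayed by default (no NOSUMMARY option) */
proc surveyfreq data=work.survey;
   strata Region;
   cluster School;
   weight SamplingWeight;
   tables Response;
run;
```

Because no sampling weight is missing or nonpositive in this generated data, all 60 observations should count as valid, and the table should report 3 strata and 6 clusters.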

#### Stratum Information Table

If you specify the LIST option in the STRATA statement, PROC SURVEYFREQ displays a "Stratum Information" table. This table provides the following information for each stratum:

• Stratum Index, which is a sequential stratum identification number

• STRATA variables, which list the levels of STRATA variables for the stratum

• Number of Observations, which is the number of valid observations in the stratum

• Population Total for the stratum, if you specify the TOTAL= option

• Sampling Rate for the stratum, if you specify the TOTAL= or RATE= option. If you specify the TOTAL= option, the sampling rate is based on the number of valid observations in the stratum.

• Number of Clusters, which is the number of clusters in the stratum, if you specify a CLUSTER statement
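
The following sketch requests the "Stratum Information" table for a stratified sample without clusters. The data sets `work.sample` and `work.totals` are hypothetical; the population totals are made-up values supplied through the `_TOTAL_` variable of the TOTAL= data set.

```sas
/* Hypothetical stratified sample: 20 observations in each of 3 strata */
data work.sample;
   length Response $3;
   call streaminit(1);
   do Region = 1 to 3;
      do i = 1 to 20;
         Response = ifc(rand('uniform') < 0.5, 'Yes', 'No');
         output;
      end;
   end;
   drop i;
run;

/* Made-up stratum population totals, one row per stratum */
data work.totals;
   input Region _TOTAL_;
   datalines;
1 400
2 500
3 600
;
run;

/* LIST in the STRATA statement requests the Stratum Information table */
proc surveyfreq data=work.sample total=work.totals;
   strata Region / list;
   tables Response;
run;
```

With these assumed totals, the sampling rate for each stratum is the 20 valid observations divided by the stratum total, for example 20/400 = 0.05 for the first stratum.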

#### Variance Estimation Table

If you specify the VARMETHOD=BRR, VARMETHOD=JACKKNIFE, or NOMCAR option in the PROC SURVEYFREQ statement, the procedure displays a "Variance Estimation" table. If you do not specify any of these options, the procedure creates a "Variance Estimation" table but does not display it. You can store this nondisplayed table in an output data set by using the Output Delivery System (ODS). For more information, see the section ODS Table Names.

The "Variance Estimation" table provides the following information:

• Method, which is the variance estimation method: Taylor Series, Balanced Repeated Replication, or Jackknife

• Replicate Weights input data set name, if you use a REPWEIGHTS statement to provide replicate weights

• Number of Replicates, if you specify VARMETHOD=BRR or VARMETHOD=JACKKNIFE

• Hadamard Data Set name, if you specify the HADAMARD= method-option for VARMETHOD=BRR

• Fay Coefficient, if you specify the FAY method-option for VARMETHOD=BRR

• Missing Levels Included (MISSING), if you specify the MISSING option

• Missing Levels Included (NOMCAR), if you specify the NOMCAR option

If you specify the PRINTH method-option for VARMETHOD=BRR, PROC SURVEYFREQ displays the Hadamard matrix that it uses to construct replicates for BRR variance estimation. If you provide a Hadamard matrix by specifying the HADAMARD= method-option for VARMETHOD=BRR but the procedure does not use the entire matrix, the procedure displays only the rows and columns that are actually used to construct replicates.
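
The two cases described above can be sketched as follows. The first step uses the default Taylor series method, so the table is not displayed and is captured with ODS OUTPUT; the second step requests BRR with the PRINTH method-option. The data set `work.survey` is hypothetical, and the ODS table name `VarianceEstimation` is an assumption that you should confirm in the section ODS Table Names.

```sas
/* Hypothetical design with exactly 2 clusters per stratum, as BRR expects */
data work.survey;
   length Response $3;
   call streaminit(2024);
   do Region = 1 to 3;
      do School = 1 to 2;
         do Student = 1 to 10;
            SamplingWeight = 20 + 5*Region;   /* made-up weights */
            Response = ifc(rand('uniform') < 0.4, 'Yes', 'No');
            output;
         end;
      end;
   end;
run;

/* Taylor series (default): store the nondisplayed table in a data set. */
/* The table name VarianceEstimation is assumed here.                   */
ods output VarianceEstimation=work.varest;
proc surveyfreq data=work.survey;
   strata Region;
   cluster School;
   weight SamplingWeight;
   tables Response;
run;

proc print data=work.varest;
run;

/* BRR: the table is displayed, and PRINTH adds the Hadamard matrix */
proc surveyfreq data=work.survey varmethod=brr(printh);
   strata Region;
   cluster School;
   weight SamplingWeight;
   tables Response;
run;
```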

#### One-Way Frequency Tables

PROC SURVEYFREQ displays one-way frequency tables for all one-way table requests in the TABLES statements, unless you specify the NOPRINT option in the TABLES statement. A one-way table shows the sample frequency distribution of a single variable, and provides estimates for its population distribution in terms of totals and proportions.

If you request a one-way table without specifying options, PROC SURVEYFREQ displays the following information for each level of the variable:

• Frequency count, which is the number of sample observations in the level

• Weighted Frequency, which estimates the population total for the level

• Standard Deviation of Weighted Frequency

• Percent, which estimates the population proportion for the level

• Standard Error of Percent

The one-way table displays weighted frequencies if your analysis includes a WEIGHT or REPWEIGHTS statement, or if you specify the WTFREQ option in the TABLES statement.

The one-way table also displays the Frequency Missing, which is the number of observations with missing values.

You can suppress the frequency counts by specifying the NOFREQ option in the TABLES statement. Also, the NOWT option suppresses the weighted frequencies and their standard deviations. The NOPERCENT option suppresses the percentages and their standard errors. The NOSTD option suppresses the standard errors of the percentages and the standard deviations of the weighted frequencies. The NOTOTAL option suppresses the total row of the one-way table.

PROC SURVEYFREQ optionally displays the following information in a one-way table:

• Variance of Weighted Frequency, if you specify the VARWT option

• Confidence Limits for Weighted Frequency, if you specify the CLWT option

• Coefficient of Variation for Weighted Frequency, if you specify the CVWT option

• Test Percent, if you specify the TESTP= option

• Variance of Percent, if you specify the VAR option

• Confidence Limits for Percent, if you specify the CL option

• Coefficient of Variation for Percent, if you specify the CV option

• Design Effect for Percent, if you specify the DEFF option
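
As an illustration of how these options shape a one-way table, the sketch below uses `sashelp.heart`, a sample data set that ships with most SAS installations (its availability is an assumption). That data set is not a survey sample and the step has no design statements, so the output is useful only for seeing the table layout.

```sas
/* One-way table with optional statistics added and counts suppressed */
proc surveyfreq data=sashelp.heart;
   tables Smoking_Status / wtfreq   /* show weighted frequencies without a WEIGHT statement */
                           cl       /* confidence limits for percent */
                           cv       /* coefficient of variation for percent */
                           nofreq;  /* suppress the unweighted frequency counts */
run;
```

Because `Smoking_Status` contains missing values, the table should also report a Frequency Missing count.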

#### Crosstabulation Tables

PROC SURVEYFREQ displays all table requests in the TABLES statements, unless you specify the NOPRINT option in the TABLES statement. For two-way to multiway crosstabulation tables, the values of the last variable in the table request form the table columns. The values of the next-to-last variable form the rows. Each level (or combination of levels) of the other variables forms one layer. PROC SURVEYFREQ produces a separate two-way crosstabulation table for each layer of a multiway table.

For each layer, the crosstabulation table displays the row and column variable names and values (levels). Each two-way table lists levels of the column variable within each level of the row variable.

By default, PROC SURVEYFREQ displays all levels of the column variable within each level of the row variable, including any column variable levels that have frequencies of 0 in the row level. By default for multiway tables, PROC SURVEYFREQ displays all levels of the row variable within each layer of the table, including any row levels that have frequencies of 0 in the layer. You can suppress the display of zero-frequency levels by specifying the NOSPARSE option.

If you request a crosstabulation table without specifying options, the table displays the following information for each combination of variable levels (table cell):

• Frequency, which is the number of sample observations in the table cell

• Weighted Frequency, which estimates the population total for the table cell

• Standard Deviation of Weighted Frequency

• Percent, which estimates the population proportion for the table cell

• Standard Error of Percent

The two-way table displays weighted frequencies if your analysis includes a WEIGHT or REPWEIGHTS statement, or if you specify the WTFREQ option in the TABLES statement.

The two-way table also displays the Frequency Missing, which is the number of observations with missing values.

You can suppress the frequency counts by specifying the NOFREQ option in the TABLES statement. Also, the NOWT option suppresses the weighted frequencies and their standard deviations. The NOPERCENT option suppresses all percentages and their standard errors. The NOCELLPERCENT option suppresses overall cell percentages and their standard errors, but displays any other percentages (and standard errors) that you request, such as row or column percentages. The NOSTD option suppresses the standard errors of the percentages and the standard deviations of the weighted frequencies. The NOTOTAL option suppresses the row totals, column totals, and overall total.

PROC SURVEYFREQ optionally displays the following information in a two-way table:

• Expected Weighted Frequency, if you specify the EXPECTED option

• Deviation from Expected Weighted Frequency, if you specify the DEVIATION option

• Pearson Residual, if you specify the PEARSONRES option

• Cell Chi-Square, if you specify the CELLCHI2 option

• Variance of Weighted Frequency, if you specify the VARWT option

• Confidence Limits for Weighted Frequency, if you specify the CLWT option

• Coefficient of Variation for Weighted Frequency, if you specify the CVWT option

• Variance of Percent, if you specify the VAR option

• Confidence Limits for Percent, if you specify the CL option

• Coefficient of Variation for Percent, if you specify the CV option

• Design Effect for Percent, if you specify the DEFF option

• Row Percent, which estimates the population proportion of the row total, if you specify the ROW option

• Standard Error of Row Percent, if you specify the ROW option

• Variance of Row Percent, if you specify the VAR option and the ROW option

• Confidence Limits for Row Percent, if you specify the CL option and the ROW option

• Coefficient of Variation for Row Percent, if you specify the CV option and the ROW option

• Design Effect for Row Percent, if you specify the ROW(DEFF) option

• Column Percent, which estimates the population proportion of the column total, if you specify the COLUMN option

• Standard Error of Column Percent, if you specify the COLUMN option

• Variance of Column Percent, if you specify the VAR option and the COLUMN option

• Confidence Limits for Column Percent, if you specify the CL option and the COLUMN option

• Coefficient of Variation for Column Percent, if you specify the CV option and the COLUMN option

• Design Effect for Column Percent, if you specify the COLUMN(DEFF) option
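
The layering rule and several of the display options can be sketched with a three-variable request. The example again assumes that the `sashelp.heart` sample data set is available; it is not survey data, so treat the output as a layout demonstration only.

```sas
/* Sex forms the layers, BP_Status the rows, Weight_Status the columns */
proc surveyfreq data=sashelp.heart;
   tables Sex*BP_Status*Weight_Status / row            /* row percents and their standard errors */
                                        column         /* column percents and their standard errors */
                                        nocellpercent  /* drop overall cell percents */
                                        nosparse;      /* hide zero-frequency levels */
run;
```

This request should produce one two-way table of `BP_Status` by `Weight_Status` for each level of `Sex`.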

#### Covariance Matrices of Estimates

If you specify the COV option, PROC SURVEYFREQ displays the covariance matrix of the cell total frequency estimates. If you specify the COVP option, PROC SURVEYFREQ displays the covariance matrix of the proportion estimates.

#### Statistical Tests

If you specify the CHISQ option for the Rao-Scott chi-square test or the LRCHISQ option for the Rao-Scott likelihood ratio chi-square test, PROC SURVEYFREQ displays the following information:

• Pearson Chi-Square, if you specify the CHISQ option

• Likelihood Ratio Chi-Square, if you specify the LRCHISQ option

• Design Correction

• Rao-Scott Chi-Square, by default or if you specify the FIRSTORDER option

• First-Order Chi-Square, if you specify the SECONDORDER option

• Second-Order Chi-Square, if you specify the SECONDORDER option

• DF, which is the degrees of freedom for the chi-square test

• Pr > ChiSq, which is the p-value for the chi-square test

• F Value

• Num DF, which is the numerator degrees of freedom for F

• Den DF, which is the denominator degrees of freedom for F

• Pr > F, which is the p-value for the F test

If you specify the WCHISQ option for the Wald chi-square test or the WLLCHISQ option for the Wald log-linear chi-square test, PROC SURVEYFREQ displays the following information:

• Wald Chi-Square, if you specify the WCHISQ option

• Wald Log-Linear Chi-Square, if you specify the WLLCHISQ option

• F Value

• Num DF, which is the numerator degrees of freedom for F

• Den DF, which is the denominator degrees of freedom for F

• Pr > F, which is the p-value for the F test

• Adjusted F Value, for tables larger than 2 × 2

• Num DF, which is the numerator degrees of freedom for Adjusted F

• Den DF, which is the denominator degrees of freedom for Adjusted F

• Pr > Adj F, which is the p-value for the Adjusted F test
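
A single TABLES statement can request both families of tests. The sketch below assumes the `sashelp.heart` sample data set is available and uses a 2 × 3 table, so the Wald output should include the adjusted F statistic. The data are not from a survey, so the results only illustrate what is displayed.

```sas
proc surveyfreq data=sashelp.heart;
   tables Sex*Weight_Status / chisq(secondorder)  /* Rao-Scott test with second-order correction */
                              wchisq;             /* Wald chi-square test */
run;
```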

#### Risks and Risk Difference

If you specify the RISK option in the TABLES statement for a 2 × 2 table, PROC SURVEYFREQ displays "Column 1 Risk Estimates" and "Column 2 Risk Estimates" tables. You can display only column 1 or column 2 risks by specifying the RISK1 or RISK2 option, respectively.

The "Risk Estimates" table displays the following information for Row 1, Row 2, Total, and Difference:

• Row, which identifies the risk as Row 1, Row 2, Total, or Difference

• Risk estimate

• Standard Error

• Confidence Limits

In the "Column 1 Risk Estimates" table, the row 1 risk is the column 1 percentage of row 1. The row 2 risk is the column 1 percentage of row 2, and the total risk is the column 1 percentage of the entire table. The risk difference is the row 1 risk minus the row 2 risk. In the "Column 2 Risk Estimates" table, these computations are based on column 2.
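
As a sketch of a risk request, the following step builds a 2 × 2 table from the `sashelp.heart` sample data set (assumed to be available, and not a survey sample). With the default level ordering, `Sex` supplies the rows (Female, Male) and `Status` supplies the columns (Alive, Dead).

```sas
/* RISK requests both the Column 1 and Column 2 Risk Estimates tables */
proc surveyfreq data=sashelp.heart;
   tables Sex*Status / risk;
run;
```

Under that ordering, the row 1 risk in the "Column 1 Risk Estimates" table is the estimated proportion of females whose status is Alive, and the difference is that proportion minus the corresponding proportion for males.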

#### Odds Ratio and Relative Risks

If you specify the OR option in the TABLES statement for a 2 × 2 table, PROC SURVEYFREQ displays the "Odds Ratio" table. This table includes the following information:

• Statistic, which identifies the statistic as the Odds Ratio, the Column 1 Relative Risk, or the Column 2 Relative Risk

• Estimate

• Confidence Limits
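
The same kind of 2 × 2 request illustrates the OR option. This sketch assumes the `sashelp.heart` sample data set is available; because it is not survey data, the estimates serve only to show the table contents.

```sas
/* OR requests the odds ratio and both relative risks with confidence limits */
proc surveyfreq data=sashelp.heart;
   tables Sex*Status / or;
run;
```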

#### Kappa Statistics

If you specify the AGREE, KAPPA, or WTKAPPA option in the TABLES statement for a square table, PROC SURVEYFREQ displays the "Kappa Statistics" table. This table includes the following information:

• Statistic, which identifies the statistic as the Simple Kappa Coefficient or the Weighted Kappa Coefficient

• Estimate

• Standard Error

• Confidence Limits

#### Kappa Weights

If you specify the AGREE(PRINTKWTS) or WTKAPPA(PRINTKWTS) option for a square table whose dimension is greater than 2, PROC SURVEYFREQ displays the "Kappa Weights" table. This table provides the matrix of kappa agreement weights that the procedure uses to compute the weighted kappa coefficient. The matrix contains an agreement weight for each pair of column variable levels.
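
A minimal sketch of a kappa request follows. The data set `work.ratings` and its cell counts are invented for illustration: two hypothetical raters score the same subjects on a 1 to 3 scale, which gives a square 3 × 3 table. The example assumes a SAS/STAT release in which PROC SURVEYFREQ supports the AGREE option.

```sas
/* Expand made-up cell counts into one observation per rated subject */
data work.ratings;
   input Rater1 Rater2 Count;
   do i = 1 to Count;
      output;
   end;
   drop i Count;
   datalines;
1 1 22
1 2 4
1 3 1
2 1 3
2 2 19
2 3 6
3 1 2
3 2 5
3 3 18
;
run;

/* AGREE requests the kappa statistics; PRINTKWTS adds the Kappa Weights table */
proc surveyfreq data=work.ratings;
   tables Rater1*Rater2 / agree(printkwts);
run;
```

Numeric ratings are used so that the default level order matches the ordinal scale, which matters because the agreement weights depend on the order of the column variable levels.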