The LOGISTIC Procedure

STRATA Statement

STRATA variable <(option)> …<variable <(option)>> </ options> ;

The STRATA statement names the variables that define strata or matched sets to use in stratified logistic regression of binary response data.
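As a minimal sketch of how the statement is used, assume a hypothetical data set work.matched that contains a binary response outcome, a matched-set identifier pair_id, and covariates exposure and age. None of these names come from this documentation; they are placeholders for illustration.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc logistic data=work.matched;
   strata pair_id;                          /* each value of pair_id defines one matched set */
   model outcome(event='1') = exposure age; /* binary response; event level is '1' */
run;
```

Because pair_id appears in the STRATA statement rather than in the MODEL statement, no separate intercept is estimated for each matched set.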

Observations that have the same variable values are in the same matched set. For a stratified logistic model, you can analyze $1:1$, $1:n$, $m:n$, and general $m_i:n_i$ matched sets where the number of cases and controls varies across strata. At least one variable must be specified to invoke the stratified analysis, and the usual unconditional asymptotic analysis is not performed. The stratified logistic model has the form

\[ \operatorname{logit}(\pi_{hi}) = \alpha_{h} + \mathbf{x}_{hi}'\boldsymbol{\beta} \]

where $\pi_{hi}$ is the event probability for the $i$th observation in stratum $h$ with covariates $\mathbf{x}_{hi}$ and where the stratum-specific intercepts $\alpha_{h}$ are the nuisance parameters that are to be conditioned out.
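To illustrate what conditioning out the intercepts means, the following sketches the standard conditional likelihood contribution of a single stratum. The symbols $m_h$ (number of events in stratum $h$) and $n_h$ (number of observations in stratum $h$) are notation introduced here for illustration, and the observations are assumed to be ordered so that the events come first. Conditioning on the observed number of events gives

\[ L_h(\boldsymbol{\beta}) = \frac{\exp\left(\sum_{i=1}^{m_h} \mathbf{x}_{hi}'\boldsymbol{\beta}\right)}{\sum_{S} \exp\left(\sum_{j \in S} \mathbf{x}_{hj}'\boldsymbol{\beta}\right)} \]

where the sum in the denominator runs over all subsets $S$ of size $m_h$ drawn from $\{1, \ldots, n_h\}$. The factor $\exp(m_h \alpha_h)$ appears in both the numerator and every term of the denominator, so it cancels and $\alpha_h$ drops out. For a $1:1$ matched pair in which observation 1 is the event, this reduces to

\[ L_h(\boldsymbol{\beta}) = \frac{\exp(\mathbf{x}_{h1}'\boldsymbol{\beta})}{\exp(\mathbf{x}_{h1}'\boldsymbol{\beta}) + \exp(\mathbf{x}_{h2}'\boldsymbol{\beta})} \]

When $m_h = 0$ or $m_h = n_h$, there is only one subset $S$, so $L_h(\boldsymbol{\beta}) = 1$ and the stratum carries no information about $\boldsymbol{\beta}$.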

STRATA variables can also be specified in the MODEL statement as classification or continuous covariates; however, the effects are nondegenerate only when crossed with a nonstratification variable. Specifying several STRATA statements is the same as specifying one STRATA statement that contains all the strata variables. The STRATA variables can be either character or numeric, and the formatted values of the STRATA variables determine the levels. Thus, you can also use formats to group values into levels; see the discussion of the FORMAT procedure in the Base SAS Procedures Guide.
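The following sketch shows one way that a format might be used to group a numeric variable into strata levels. The data set work.study, the variables age, outcome, and exposure, and the format name agegrp are hypothetical and are chosen only for illustration.

```sas
/* Hypothetical format that groups age into three levels */
proc format;
   value agegrp low -< 40  = 'Under 40'
                40  -< 60  = '40 to 59'
                60  - high = '60 and over';
run;

/* Hypothetical data set and variable names */
proc logistic data=work.study;
   format age agegrp.;   /* the formatted values of age define the strata */
   strata age;
   model outcome(event='1') = exposure;
run;
```

Under these assumptions the analysis has at most three strata, one for each formatted value, rather than one stratum for each distinct age.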

The Strata Summary table is displayed by default. For an exact logistic regression, it displays the number of strata that have a specific number of events and non-events. For example, if you are analyzing a $1\colon \  5$ matched study, this table enables you to verify that every stratum in the analysis has exactly one event and five non-events. Strata that contain only events or only non-events are reported in this table, but such strata are uninformative and are not used in the analysis.

If an EXACT statement is also specified, then a stratified exact logistic regression is performed.
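As a hedged sketch, a stratified exact analysis could be requested by combining the two statements as follows. The data set and variable names are hypothetical, and the ESTIMATE=BOTH option is included only to show where EXACT statement options go.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc logistic data=work.matched;
   strata pair_id;
   model outcome(event='1') = exposure;
   exact exposure / estimate=both;   /* exact inference for the exposure parameter */
run;
```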

The EFFECTPLOT, SCORE, and WEIGHT statements are not available with a STRATA statement. The following MODEL options are also not supported with a STRATA statement: CLPARM=PL, CLODDS=PL, CTABLE, FIRTH, LACKFIT, LINK=, NOFIT, OUTMODEL=, OUTROC=, ROC, and SCALE=.

The following option can be specified for a stratification variable by enclosing the option in parentheses after the variable name, or it can be specified globally for all STRATA variables after a slash (/).


MISSING

treats missing values ('.', '._', '.A', …, '.Z' for numeric variables and blanks for character variables) as valid STRATA variable values.
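The two placements of the MISSING option can be sketched as follows. The data set work.study and the variables clinic, region, outcome, and exposure are hypothetical.

```sas
/* Per-variable form: missing values are valid levels for clinic only */
proc logistic data=work.study;
   strata clinic(missing) region;
   model outcome(event='1') = exposure;
run;

/* Global form: missing values are valid levels for every STRATA variable */
proc logistic data=work.study;
   strata clinic region / missing;
   model outcome(event='1') = exposure;
run;
```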

The following strata options are also available after the slash:


CHECKDEPENDENCY | CHECK=keyword

specifies which variables are to be tested for dependency before the analysis is performed. The available keywords are as follows:


NONE

performs no dependence checking. Typically, a message about a singular information matrix is displayed if you have dependent variables. Dependent variables can be identified after the analysis by noting any missing parameter estimates.


COVARIATES

checks dependence between covariates and an added intercept. Dependent covariates are removed from the analysis. However, covariates that are linear functions of the strata variable might not be removed, which results in a singular information matrix message being displayed in the SAS log. This is the default.


ALL

checks dependence between all the strata and covariates. This option can adversely affect performance if you have a large number of strata.


NOSUMMARY

suppresses the display of the Strata Summary table.


INFO

displays the Strata Information table, which includes the stratum number, levels of the STRATA variables that define the stratum, the number of events, the number of non-events, and the total frequency for each stratum. Since the number of strata can be very large, this table is displayed only by request.
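The options that follow the slash can be combined in one STRATA statement. The following is a minimal sketch that assumes the same hypothetical data set and variable names as a simple matched-pairs study; it requests the full dependency check and the Strata Information table.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc logistic data=work.matched;
   strata pair_id / checkdependency=all   /* check strata and covariates together */
                    info;                 /* display the Strata Information table */
   model outcome(event='1') = exposure age;
run;
```

With many strata, CHECKDEPENDENCY=ALL can slow the analysis and the Strata Information table can be long, so a sketch like this is best suited to small matched studies.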