The POWER Procedure

LOGISTIC Statement

Subsections:

Summary of Options
Dictionary of Options
Restrictions on Option Combinations
Option Groups for Common Analyses

LOGISTIC <options> ;

The LOGISTIC statement performs power and sample size analyses for the likelihood ratio chi-square test of a single predictor in binary logistic regression, possibly in the presence of one or more covariates that might be correlated with the tested predictor.

Summary of Options

Table 75.2 summarizes the options available in the LOGISTIC statement.

Table 75.2: LOGISTIC Statement Options

Option	Description
Define analysis
TEST=	Specifies the statistical analysis
Specify analysis information
ALPHA=	Specifies the significance level
COVARIATES=	Specifies the distributions of predictor variables
TESTPREDICTOR=	Specifies the distribution of the predictor variable being tested
VARDIST=	Defines a distribution for a predictor variable
Specify effects
CORR=	Specifies the multiple correlation between the predictor and the covariates
COVODDSRATIOS=	Specifies the odds ratios for the covariates
COVREGCOEFFS=	Specifies the regression coefficients for the covariates
DEFAULTUNIT=	Specifies the default change in the predictor variables
INTERCEPT=	Specifies the intercept
RESPONSEPROB=	Specifies the response probability
TESTODDSRATIO=	Specifies the odds ratio being tested
TESTREGCOEFF=	Specifies the regression coefficient for the predictor variable
UNITS=	Specifies the changes in the predictor variables
Specify sample size
NFRACTIONAL	Enables fractional input and output for sample sizes
NTOTAL=	Specifies the sample size
Specify power
POWER=	Specifies the desired power of the test
Specify computational method
DEFAULTNBINS=	Specifies the default number of categories for each predictor variable
NBINS=	Specifies the number of categories for predictor variables
Control ordering in output
OUTPUTORDER=	Controls the output order of parameters

Table 75.3 summarizes the valid result parameters in the LOGISTIC statement.

Table 75.3: Summary of Result Parameters in the LOGISTIC Statement

Analyses	Solve For	Syntax
TEST=LRCHI	Power	POWER=.
	Sample size	NTOTAL=.

Dictionary of Options

ALPHA=number-list

specifies the level of significance of the statistical test. The default is 0.05, corresponding to the usual 0.05 $\times$ 100% = 5% level of significance. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

CORR=number-list

specifies the multiple correlation ( $\rho$ ) between the tested predictor and the covariates. If you also specify the COVARIATES= option, then the sample size is either multiplied (if you are computing power) or divided (if you are computing sample size) by a factor of $(1 - \rho ^2)$ . For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.
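
As a worked illustration of this adjustment (the values $\rho = 0.3$, 100, and 200 are hypothetical and chosen only for the arithmetic), the factor and its effect in each direction are:

$1 - \rho^2 = 1 - 0.3^2 = 0.91$

$\mbox{Power analysis: } 100 \times 0.91 = 91 \qquad \mbox{Sample size analysis: } 200 / 0.91 \approx 219.8$

That is, a power computation at NTOTAL=100 would proceed as if 91 observations were available, and an unadjusted sample size of 200 would be inflated to about 219.8 before any rounding.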

COVARIATES=grouped-name-list

specifies the distributions of any predictor variables in the model but not being tested, using labels specified with the VARDIST= option. The distributions are assumed to be independent of each other and of the tested predictor. If this option is omitted, then the tested predictor specified by the TESTPREDICTOR= option is assumed to be the only predictor in the model. For information about specifying the grouped-name-list, see the section Specifying Value Lists in Analysis Statements.

COVODDSRATIOS=grouped-number-list

specifies the odds ratios for the covariates in the full model (including variables in the TESTPREDICTOR= and COVARIATES= options). The ordering of the values corresponds to the ordering in the COVARIATES= option. If the response variable is coded as Y = 1 for success and Y = 0 for failure, then the odds ratio for each covariate X is the odds of Y = 1 when $X = a$ divided by the odds of Y = 1 when $X = b$ , where a and b are determined from the DEFAULTUNIT= and UNITS= options. Values must be greater than zero. For information about specifying the grouped-number-list, see the section Specifying Value Lists in Analysis Statements.

COVREGCOEFFS=grouped-number-list

specifies the regression coefficients for the covariates in the full model including the test predictor (as specified by the TESTPREDICTOR= option). The ordering of the values corresponds to the ordering in the COVARIATES= option. For information about specifying the grouped-number-list, see the section Specifying Value Lists in Analysis Statements.

DEFAULTNBINS=number

specifies the default number of categories (or “bins”) into which the distribution for each predictor variable is divided in internal calculations. Higher values increase computational time and memory requirements but generally lead to more accurate results. Each test predictor or covariate that is absent from the NBINS= option derives its bin number from the DEFAULTNBINS= option. The default value is DEFAULTNBINS=10.

There are two variable distributions for which the number of bins can be overridden internally:

For an ordinal distribution, the number of ordinal values is always used as the number of bins.
For a binomial distribution, if the requested number of bins is larger than n + 1, where n is the sample size parameter of the binomial distribution, then exactly n + 1 bins are used.

DEFAULTUNIT=change-spec

specifies the default change in the predictor variables assumed for odds ratios specified with the COVODDSRATIOS= and TESTODDSRATIO= options. Each test predictor or covariate that is absent from the UNITS= option derives its change value from the DEFAULTUNIT= option. The value must be nonzero. The default value is DEFAULTUNIT=1. This option can be used only if at least one of the COVODDSRATIOS= and TESTODDSRATIO= options is used.

Valid specifications for change-spec are as follows:

number: defines the odds ratio as the ratio of the response variable odds when $X = a$ to the odds when $X = a - \mathit{number}$ for any constant a.
<+ | ->SD: defines the odds ratio as the ratio of the odds when $X = a$ to the odds when $X = a - \sigma$ (or $X = a + \sigma$ , if SD is preceded by a minus sign (–)) for any constant a, where $\sigma$ is the standard deviation of X (as determined from the VARDIST= option).
multiple*SD: defines the odds ratio as the ratio of the odds when $X = a$ to the odds when $X = a - \mathit{multiple} * \sigma$ for any constant a, where $\sigma$ is the standard deviation of X (as determined from the VARDIST= option).
PERCENTILES(p1, p2): defines the odds ratio as the ratio of the odds when X is equal to its $p2*100$ th percentile to the odds when X is equal to its $p1*100$ th percentile (where the percentiles are determined from the distribution specified in the VARDIST= option). Values for p1 and p2 must be strictly between 0 and 1.
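
The following worked equation is a sketch of how the chosen change relates an odds ratio to a regression coefficient. It relies on the standard logistic regression identity rather than on anything stated in this section, and the symbols $c$ (the change in X) and $\beta$ (the regression coefficient for X) are introduced here only for illustration:

$\mathrm{OR}_c = \exp(c \, \beta)$

For example, if the odds ratio for a change of 1 is 1.75, then $\beta = \log(1.75) \approx 0.5596$, and the odds ratio for a change of 2 is $\exp(2 \times 0.5596) = 1.75^2 = 3.0625$. For a predictor with standard deviation 2, specifying SD as the change would therefore correspond to the same odds ratio of 3.0625.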

INTERCEPT=number-list

specifies the intercept in the full model (including variables in the TESTPREDICTOR= and COVARIATES= options). For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

NBINS=("name" = number <…"name" = number>)

specifies the number of categories (or “bins”) into which the distribution for each predictor variable (identified by its name from the VARDIST= option) is divided in internal calculations. Higher values increase computational time and memory requirements but generally lead to more accurate results. Each predictor variable that is absent from the NBINS= option derives its bin number from the DEFAULTNBINS= option.

There are two variable distributions for which the NBINS= value can be overridden internally:

For an ordinal distribution, the number of ordinal values is always used as the number of bins.
For a binomial distribution, if the requested number of bins is larger than n + 1, where n is the sample size parameter of the binomial distribution, then exactly n + 1 bins are used.
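
The interaction between DEFAULTNBINS=, NBINS=, and the internal overrides can be sketched as follows. The variable labels ("dose", "smoker") and all numeric values are hypothetical, and the statements have not been run against a SAS installation; they only follow the syntax described above.

```sas
proc power;
   logistic
      vardist("dose")   = normal(0, 1)
      vardist("smoker") = binomial(0.3, 1)
      testpredictor = "dose"
      covariates    = "smoker"
      responseprob  = 0.2
      testoddsratio = 1.5
      covoddsratios = 2
      defaultnbins  = 20            /* applies to any predictor not named in NBINS= */
      nbins         = ("dose" = 50) /* finer discretization for the tested predictor */
      ntotal = 200
      power  = .;
run;
```

Under the rules listed above, "dose" would be divided into 50 bins, while "smoker" would use only n + 1 = 2 bins because its binomial sample size parameter is n = 1, even though DEFAULTNBINS=20 requests more.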

NFRACTIONAL | NFRAC

enables fractional input and output for sample sizes. See the section Sample Size Adjustment Options for information about the ramifications of the presence (and absence) of the NFRACTIONAL option.

NTOTAL=number-list

specifies the sample size or requests a solution for the sample size with a missing value (NTOTAL=.). Values must be at least one. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

OUTPUTORDER=INTERNAL | REVERSE | SYNTAX

controls how the input and default analysis parameters are ordered in the output. OUTPUTORDER=INTERNAL (the default) arranges the parameters in the output according to the following order of their corresponding options:

DEFAULTNBINS=
NBINS=
ALPHA=
RESPONSEPROB=
INTERCEPT=
TESTPREDICTOR=
TESTODDSRATIO=
TESTREGCOEFF=
COVARIATES=
COVODDSRATIOS=
COVREGCOEFFS=
CORR=
NTOTAL=
POWER=

The OUTPUTORDER=SYNTAX option arranges the parameters in the output in the same order in which their corresponding options are specified in the LOGISTIC statement. The OUTPUTORDER=REVERSE option arranges the parameters in the output in the reverse of the order in which their corresponding options are specified in the LOGISTIC statement.

POWER=number-list

specifies the desired power of the test or requests a solution for the power with a missing value (POWER=.). The power is expressed as a probability, a number between 0 and 1, rather than as a percentage. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.
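
As a minimal sketch of solving for sample size rather than power, the missing value moves from POWER= to NTOTAL=. The labels and numeric values below are illustrative assumptions, and the statements have not been run against a SAS installation.

```sas
proc power;
   logistic
      vardist("x1") = normal(0, 2)
      vardist("x2") = poisson(7)
      testpredictor = "x1"
      covariates    = "x2"
      responseprob  = 0.15
      testoddsratio = 1.75
      covoddsratios = 2.1
      power  = 0.8 0.9   /* two target powers, given as probabilities */
      ntotal = .;        /* missing value requests the sample size */
run;
```

Listing two values in POWER= requests one sample size solution per target power.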

RESPONSEPROB=number-list

specifies the response probability in the full model when all predictor variables (including variables in the TESTPREDICTOR= and COVARIATES= options) are equal to their means. The log odds of this probability are equal to the intercept in the full model where all predictors are centered at their means. If the response variable is coded as Y = 1 for success and Y = 0 for failure, then this probability is equal to the mean of Y in the full model when all Xs are equal to their means. Values must be strictly between zero and one. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

TEST=LRCHI

specifies the likelihood ratio chi-square test of a single model parameter in binary logistic regression. This is the default test option.

TESTODDSRATIO=number-list

specifies the odds ratio for the predictor variable being tested in the full model (including variables in the TESTPREDICTOR= and COVARIATES= options). If the response variable is coded as Y = 1 for success and Y = 0 for failure, then the odds ratio for the X being tested is the odds of Y = 1 when $X = a$ divided by the odds of Y = 1 when $X = b$ , where a and b are determined from the DEFAULTUNIT= and UNITS= options. Values must be greater than zero. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

TESTPREDICTOR=name-list

specifies the distribution of the predictor variable being tested, using labels specified with the VARDIST= option. This distribution is assumed to be independent of the distributions of the covariates as defined in the COVARIATES= option. For information about specifying the name-list, see the section Specifying Value Lists in Analysis Statements.

TESTREGCOEFF=number-list

specifies the regression coefficient for the predictor variable being tested in the full model including the covariates specified by the COVARIATES= option. For information about specifying the number-list, see the section Specifying Value Lists in Analysis Statements.

UNITS=("name" = change-spec <…"name" = change-spec>)

specifies the changes in the predictor variables assumed for odds ratios specified with the COVODDSRATIOS= and TESTODDSRATIO= options. Each predictor variable whose name (from the VARDIST= option) is absent from the UNITS= option derives its change value from the DEFAULTUNIT= option. This option can be used only if at least one of the COVODDSRATIOS= and TESTODDSRATIO= options is used.

Valid specifications for change-spec are as follows:

number: defines the odds ratio as the ratio of the response variable odds when $X = a$ to the odds when $X = a - \mathit{number}$ for any constant a.
<+ | ->SD: defines the odds ratio as the ratio of the odds when $X = a$ to the odds when $X = a - \sigma$ (or $X = a + \sigma$ , if SD is preceded by a minus sign (–)) for any constant a, where $\sigma$ is the standard deviation of X (as determined from the VARDIST= option).
multiple*SD: defines the odds ratio as the ratio of the odds when $X = a$ to the odds when $X = a - \mathit{multiple} * \sigma$ for any constant a, where $\sigma$ is the standard deviation of X (as determined from the VARDIST= option).
PERCENTILES(p1, p2): defines the odds ratio as the ratio of the odds when X is equal to its $p2*100$ th percentile to the odds when X is equal to its $p1*100$ th percentile (where the percentiles are determined from the distribution specified in the VARDIST= option). Values for p1 and p2 must be strictly between 0 and 1.

Each unit value must be nonzero.
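
The change-spec forms above might be combined as in the following sketch. The labels and values are hypothetical, and the statements are written from the syntax description in this section rather than verified against a SAS installation.

```sas
proc power;
   logistic
      vardist("x1") = normal(0, 2)
      vardist("x2") = poisson(7)
      testpredictor = "x1"
      covariates    = "x2"
      responseprob  = 0.15
      testoddsratio = 1.75   /* odds ratio for a change of 2 standard deviations in x1 */
      covoddsratios = 2.1    /* odds ratio for a change of 3 in x2 */
      units = ("x1" = 2*SD  "x2" = 3)
      ntotal = 100
      power  = .;
run;
```

Because both predictors are named in UNITS=, the DEFAULTUNIT= value would not be used for either of them in this sketch.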

VARDIST("label")=distribution (parameters)

defines a distribution for a predictor variable.

For the VARDIST= option,

label: identifies the variable distribution in the output and with the COVARIATES= and TESTPREDICTOR= options.
distribution: specifies the distributional form of the variable.
parameters: specifies one or more parameters associated with the distribution.

The distributions and parameters are named and defined in the same way as the distributions and arguments in the CDF SAS function; for more information, see SAS Language Reference: Dictionary. Choices for distributional forms and their parameters are as follows:

ORDINAL ((values) : (probabilities)): is an ordered categorical distribution. The values are any numbers separated by spaces. The probabilities are numbers between 0 and 1 (inclusive) separated by spaces. Their sum must be exactly 1. The number of probabilities must match the number of values.
BETA (a, b <, l, r >): is a beta distribution with shape parameters a and b and optional location parameters l and r. The values of a and b must be greater than 0, and l must be less than r. The default values for l and r are 0 and 1, respectively.
BINOMIAL (p, n): is a binomial distribution with probability of success p and number of independent Bernoulli trials n. The value of p must be greater than 0 and less than 1, and n must be an integer greater than 0. If n = 1, then the distribution is binary.
EXPONENTIAL ( $\lambda$ ): is an exponential distribution with scale $\lambda$ , which must be greater than 0.
GAMMA (a, $\lambda$ ): is a gamma distribution with shape a and scale $\lambda$ . The values of a and $\lambda$ must be greater than 0.
LAPLACE ( $\theta$ , $\lambda$ ): is a Laplace distribution with location $\theta$ and scale $\lambda$ . The value of $\lambda$ must be greater than 0.
LOGISTIC ( $\theta$ , $\lambda$ ): is a logistic distribution with location $\theta$ and scale $\lambda$ . The value of $\lambda$ must be greater than 0.
LOGNORMAL ( $\theta$ , $\lambda$ ): is a lognormal distribution with location $\theta$ and scale $\lambda$ . The value of $\lambda$ must be greater than 0.
NORMAL ( $\theta$ , $\lambda$ ): is a normal distribution with mean $\theta$ and standard deviation $\lambda$ . The value of $\lambda$ must be greater than 0.
POISSON (m): is a Poisson distribution with mean m. The value of m must be greater than 0.
UNIFORM (l, r): is a uniform distribution on the interval $[l, r]$, where $l < r$.

Restrictions on Option Combinations

To specify the intercept in the full model, choose one of the following two parameterizations:

intercept (using the INTERCEPT= option)
Prob(Y = 1) when all predictors are equal to their means (using the RESPONSEPROB= option)

To specify the effect associated with the predictor variable being tested, choose one of the following two parameterizations:

odds ratio (using the TESTODDSRATIO= option)
regression coefficient (using the TESTREGCOEFF= option)

To describe the effects of the covariates in the full model, choose one of the following two parameterizations:

odds ratios (using the COVODDSRATIOS= option)
regression coefficients (using the COVREGCOEFFS= option)

Option Groups for Common Analyses

This section summarizes the syntax for the common analyses supported in the LOGISTIC statement.

Likelihood Ratio Chi-Square Test for One Predictor

You can express effects in terms of response probability and odds ratios, as in the following statements:

proc power;
   logistic
      vardist("x1a") = normal(0, 2)
      vardist("x1b") = normal(0, 3)
      vardist("x2") = poisson(7)
      vardist("x3a") = ordinal((-5 0 5) : (.3 .4 .3))
      vardist("x3b") = ordinal((-5 0 5) : (.4 .3 .3))
      testpredictor = "x1a" "x1b"
      covariates = "x2" | "x3a" "x3b"
      responseprob = 0.15
      testoddsratio = 1.75
      covoddsratios = (2.1 1.4)
      ntotal = 100
      power = .;
run;

The VARDIST= options define the distributions of the predictor variables. The TESTPREDICTOR= option specifies two scenarios for the test predictor distribution, Normal(0,2) and Normal(0,3). The COVARIATES= option specifies two covariates, the first with a Poisson(7) distribution. The second covariate has an ordinal distribution on the values –5, 0, and 5 with two scenarios for the associated probabilities: (.3, .4, .3) and (.4, .3, .3). The response probability in the full model with all variables equal to their means is specified by the RESPONSEPROB= option as 0.15. The odds ratio for a unit increase in the tested predictor is specified by the TESTODDSRATIO= option to be 1.75. Corresponding odds ratios for the two covariates in the full model are specified by the COVODDSRATIOS= option to be 2.1 and 1.4. The POWER=. option requests a solution for the power at a sample size of 100 as specified by the NTOTAL= option.

Default values of the TEST= and ALPHA= options specify a likelihood ratio test of the first predictor with a significance level of 0.05. The default of DEFAULTUNIT=1 specifies that all odds ratios are defined in terms of unit changes in predictors. The default of DEFAULTNBINS=10 specifies that each predictor variable is discretized into a distribution with 10 categories in internal calculations, except that the ordinal covariate always uses its three ordinal values as the bins.

You can also express effects in terms of regression coefficients, as in the following statements:

proc power;
   logistic
      vardist("x1a") = normal(0, 2)
      vardist("x1b") = normal(0, 3)
      vardist("x2") = poisson(7)
      vardist("x3a") = ordinal((-5 0 5) : (.3 .4 .3))
      vardist("x3b") = ordinal((-5 0 5) : (.4 .3 .3))
      testpredictor = "x1a" "x1b"
      covariates = "x2" | "x3a" "x3b"
      intercept = -6.928162
      testregcoeff = 0.5596158
      covregcoeffs = (0.7419373 0.3364722)
      ntotal = 100
      power = .;
run;

The regression coefficients for the tested predictor (TESTREGCOEFF=0.5596158) and covariates (COVREGCOEFFS=(0.7419373 0.3364722)) are determined by taking the natural logarithm of the corresponding odds ratios (1.75, 2.1, and 1.4). The intercept in the full model is specified as –6.928162 by the INTERCEPT= option. This number is calculated according to the formula at the end of the section Analyses in the LOGISTIC Statement, which expresses the intercept in terms of the response probability, regression coefficients, and predictor means:

$\mbox{Intercept} = \log \left(\frac{0.15}{1-0.15}\right) - \left(0.5596158(0) + 0.7419373(7) + 0.3364722(0) \right)$
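
Evaluating the terms of this formula step by step gives a quick check on the stated value. The means used are 0 for the normal test predictor, 7 for the Poisson covariate, and 0 for the ordinal covariate under the probabilities (.3, .4, .3):

$\log \left(\frac{0.15}{0.85}\right) \approx -1.734601, \qquad 0.7419373(7) \approx 5.193561$

$\mbox{Intercept} \approx -1.734601 - 5.193561 = -6.928162$

By the same arithmetic, the ordinal probabilities (.4, .3, .3) give a mean of $-5(0.4) + 0(0.3) + 5(0.3) = -0.5$ rather than 0, for which the formula would yield approximately $-6.7599$. The single INTERCEPT= value shown therefore appears to reproduce RESPONSEPROB=0.15 exactly only for the first ordinal scenario.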