The ICPHREG Procedure

BASELINE Statement

   BASELINE <OUT=SAS-data-set> <COVARIATES=SAS-data-set> <TIMELIST=list> <keyword=name … keyword=name> </ options>;

The BASELINE statement creates a SAS data set (named by the OUT= option) that contains the predicted values at specified times that partition the time axis for every set of covariates in the COVARIATES= data set. If the COVARIATES= data set is not specified, PROC ICPHREG uses a reference set of covariates that consists of the reference levels for the CLASS variables and the average values for the continuous variables.

Table 51.3 summarizes the options that you can specify in the BASELINE statement.

Table 51.3: BASELINE Statement Options

Option          Description

Data Set and Time List Options
OUT=            Specifies the output BASELINE data set
COVARIATES=     Specifies the SAS data set that contains the explanatory variables
TIMELIST=       Specifies a list of time points for computing the predicted values

Keyword Options for Variables
CUMHAZ=         Specifies the cumulative hazard function estimate
LOGLOGS=        Specifies the log of the negative log of the survival function
LOGSURV=        Specifies the log of the survival function
LOWERCUMHAZ=    Specifies the lower pointwise confidence limit for the cumulative hazard function
LOWER=          Specifies the lower pointwise confidence limit for the survival function
STDCUMHAZ=      Specifies the estimated standard error of the cumulative hazard function
STDERR=         Specifies the standard error of the survival function
STDXBETA=       Specifies the estimated standard error of the linear predictor estimator
SURVIVAL=       Specifies the survival function estimate
UPPERCUMHAZ=    Specifies the upper pointwise confidence limit for the cumulative hazard function
UPPER=          Specifies the upper pointwise confidence limit for the survival function
XBETA=          Specifies the estimate of the linear predictor $\mb{z}'\bbeta$

Other Options
ALPHA=          Specifies the level of the confidence interval for the survival function
CLTYPE=         Specifies the transformation that is used to compute confidence limits for the survival function
GROUP=          Names a variable whose values identify or group predicted survival or cumulative hazard functions in plots
ROWID=          Names the variable in the COVARIATES= data set for identifying the predicted survival or cumulative hazard functions in plots


You can specify the following options in the BASELINE statement.

OUT=SAS-data-set

names the output data set. If you omit the OUT= option, the data set is created and given a default name by using the DATAn convention. For more information, see the section OUT= Output Data Set in the BASELINE Statement.

COVARIATES=SAS-data-set

names the SAS data set that contains the sets of explanatory variable values for which the functions of interest are estimated. All variables in the COVARIATES= data set are copied to the OUT= data set. Thus, any variable in the COVARIATES= data set can be used to identify the covariate sets in the OUT= data set.

TIMELIST=list

specifies a list of time points at which the predicted values are computed. The following specifications are equivalent:

   timelist=5,20 to 50 by 10
   timelist=5 20 30 40 50

If you do not specify this option, predicted values are computed at all the times that partition the time axis.

keyword=name

specifies the statistics to be included in the OUT= data set and assigns names to the variables that contain these statistics. Specify a keyword for each desired statistic, an equal sign, and the name of the variable for the statistic. You can specify the following keywords:

CUMHAZ=name

specifies the cumulative hazard function estimate. Specifying CUMHAZ=_ALL_ is equivalent to specifying CUMHAZ=CumHaz, STDCUMHAZ=StdErrCumHaz, LOWERCUMHAZ=LowerCumHaz, and UPPERCUMHAZ=UpperCumHaz.

LOGLOGS=name
CLOGLOGS=name

specifies the log of the negative log of the estimated survival function.

LOGSURV=name

specifies the log of the estimated survival function.

LOWER=name
L=name
LOWERSDF=name

specifies the lower pointwise confidence limit for the survival function. The confidence level is determined by the ALPHA= option.

LOWERCUMHAZ=name

specifies the lower pointwise confidence limit for the cumulative hazard function. The confidence level is determined by the ALPHA= option.

STDERR=name
STDSDF=name

specifies the standard error of the survival function estimator.

STDCUMHAZ=name

specifies the estimated standard error of the cumulative hazard function estimator.

STDXBETA=name

specifies the estimated standard error of the linear predictor estimator.

SURVIVAL=name
SDF=name

specifies the estimated survival function ($S(t)=[S_0(t)]^{\exp (\mb{z}'\bbeta )}$). Specifying SURVIVAL=_ALL_ is equivalent to specifying SURVIVAL=Survival, STDERR=StdErrSurvival, LOWER=LowerSurvival, and UPPER=UpperSurvival.

UPPER=name
U=name
UPPERSDF=name

specifies the upper pointwise confidence limit for the survival function. The confidence level is determined by the ALPHA= option.

UPPERCUMHAZ=name

specifies the upper pointwise confidence limit for the cumulative hazard function. The confidence level is determined by the ALPHA= option.

XBETA=name

specifies the estimate of the linear predictor $\mb{z}'\bbeta$. If there is an offset, it is added to the predictor.
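
As a minimal sketch of how the data set options and keywords fit together, the following statements request predictions for two covariate sets. The data set Study is simulated purely for illustration, and every name in it (Study, Cov, Pred, Left, Right, Trt, Age, CH, LinPred) is an assumption made for this example rather than part of the procedure.

```sas
/* Simulated interval-censored data (illustrative assumption only). */
/* Subjects are examined every 10 time units up to time 60.         */
data Study;
   call streaminit(2024);
   do Id = 1 to 200;
      Trt = rand('bernoulli', 0.5);
      Age = 40 + floor(30 * rand('uniform'));
      T   = 30 * exp(-0.5 * Trt) * rand('exponential');  /* latent event time */
      if T >= 60 then do;          /* no event by the last visit: right-censored */
         Left  = 60;
         Right = .;
      end;
      else do;                     /* event falls between two visits */
         Left  = 10 * floor(T / 10);
         Right = Left + 10;
      end;
      output;
   end;
   drop T;
run;

/* Covariate sets for which predictions are requested */
data Cov;
   input Trt Age;
   datalines;
0 50
1 50
;

proc icphreg data=Study;
   model (Left, Right) = Trt Age;
   /* SURVIVAL=_ALL_ adds Survival, StdErrSurvival, LowerSurvival, UpperSurvival */
   baseline out=Pred covariates=Cov timelist=10 to 50 by 10
            survival=_all_ cumhaz=CH xbeta=LinPred;
run;

proc print data=Pred;
run;
```

Because all variables in the COVARIATES= data set are copied to the OUT= data set, Trt and Age identify which covariate set each row of Pred belongs to.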

You can specify the following options after a slash (/).

ALPHA=value

specifies the significance level $\alpha$ for the $100(1-\alpha)\%$ pointwise confidence limits of the survival function and the cumulative hazard function. The value must be between 0 and 1. The default is the value of the ALPHA= option in the PROC ICPHREG statement, or 0.05 (95% confidence limits) if that option is not specified.

CLTYPE=method
CITYPE=method
TYPE=method
CLTRANSFORM=method
TRANSFORM=method

specifies the transformation that is used to compute the confidence limits for $S(t,\mb{z})$, which is the survival function for a subject that has a fixed covariate vector $\mb{z}$ at event time $t$. You can specify the following methods:

LOG

uses normal theory approximation to compute the confidence limits for $\log (S(t,\mb{z}))$. The confidence limits for $S(t,\mb{z})$ are obtained by back-transforming the confidence limits for $\log (S(t,\mb{z}))$.

LOGLOG

uses normal theory approximation to compute the confidence limits for $\log (-\log (S(t,\mb{z})))$. The confidence limits for $S(t,\mb{z})$ are obtained by back-transforming the confidence limits for $\log (-\log (S(t,\mb{z})))$.

NORMAL
IDENTITY
LINEAR
PLAIN
DIRECT

uses normal theory approximation to compute the confidence limits for $S(t,\mb{z})$.

By default, CLTYPE=LOG.
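
The three methods can be written out explicitly. The following is a sketch based on the standard delta-method construction and is not a statement of the exact computations that PROC ICPHREG performs. The symbols are notation introduced only for this illustration: $\hat{S}$ is the estimate of $S(t,\mb{z})$, $\hat{\sigma}$ is its estimated standard error, and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution.

```latex
\begin{aligned}
\text{NORMAL:} \quad & \hat{S} \pm z_{\alpha/2}\,\hat{\sigma} \\[4pt]
\text{LOG:}    \quad & \hat{S}\,\exp\!\left(\pm\, z_{\alpha/2}\,\frac{\hat{\sigma}}{\hat{S}}\right) \\[4pt]
\text{LOGLOG:} \quad & \left[\,\hat{S}^{\exp\left(z_{\alpha/2}\,\hat{\tau}\right)},\;
                       \hat{S}^{\exp\left(-z_{\alpha/2}\,\hat{\tau}\right)}\right],
                       \qquad \hat{\tau} = \frac{\hat{\sigma}}{\hat{S}\,\lvert\log \hat{S}\rvert}
\end{aligned}
```

In this construction, $\hat{\sigma}/\hat{S}$ and $\hat{\tau}$ are the approximate standard errors of $\log \hat{S}$ and $\log(-\log \hat{S})$, respectively. For $0 < \hat{S} < 1$, the LOGLOG limits always lie strictly between 0 and 1, the LOG limits are always positive but the upper limit can exceed 1, and the NORMAL limits can fall outside the interval from 0 to 1 on either side.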

GROUP=variable

names a variable whose values identify or group the predicted curves. The variable must be a numeric variable in the COVARIATES= data set. Survival curves for observations that have the same value of the variable are overlaid in the same plot.

ROWID=variable
ID=variable
ROW=variable

names a variable in the COVARIATES= data set for identifying plotted survival functions and cumulative hazard functions. This option has no effect if the PLOTS= option in the PROC ICPHREG statement is not specified. Values of this variable are used to label the plotted functions for the corresponding rows in the COVARIATES= data set. You can specify ROWID=_OBS_ to use the observation numbers in the COVARIATES= data set for identification.
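
The options after the slash can be combined with a plot request, as in the following sketch. The data are simulated for illustration only, all data set and variable names (Study, Cov, Pred, Left, Right, Trt, Age, Profile) are assumptions, and the PLOTS=SURVIVAL request in the PROC ICPHREG statement is assumed to be available in your release.

```sas
/* Simulated interval-censored data (illustrative assumption only) */
data Study;
   call streaminit(7);
   do Id = 1 to 200;
      Trt = rand('bernoulli', 0.5);
      Age = 40 + floor(30 * rand('uniform'));
      T   = 30 * exp(-0.5 * Trt) * rand('exponential');  /* latent event time */
      if T >= 60 then do; Left = 60; Right = .; end;     /* right-censored */
      else do; Left = 10 * floor(T / 10); Right = Left + 10; end;
      output;
   end;
   drop T;
run;

/* Four covariate sets; Profile labels each row, Age defines the groups */
data Cov;
   input Trt Age Profile :$12.;
   datalines;
0 50 Control50
1 50 Treated50
0 65 Control65
1 65 Treated65
;

ods graphics on;

proc icphreg data=Study plots=survival;
   model (Left, Right) = Trt Age;
   /* 90% limits on the log-log scale; curves that share an Age value */
   /* are overlaid in one plot and labeled by the values of Profile   */
   baseline out=Pred covariates=Cov survival=_all_
            / alpha=0.1 cltype=loglog group=Age rowid=Profile;
run;
```

GROUP= requires a numeric variable, which is why the numeric variable Age is used for grouping and the character variable Profile is used only for labeling.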