The ICPHREG Procedure

PROC ICPHREG Statement

  • PROC ICPHREG <options>;

The PROC ICPHREG statement invokes the ICPHREG procedure. Table 63.2 summarizes the options available in the PROC ICPHREG statement.

Table 63.2: PROC ICPHREG Statement Options

Option       Description
ALPHA=       Specifies the level for confidence limits
DATA=        Names the SAS data set to be analyzed
ITHISTORY    Displays the iteration history, final gradient, and second derivative matrix
NAMELEN=     Specifies the length of effect names
NLOPTIONS    Specifies optimization parameters for fitting the specified model
NOPRINT      Suppresses all displayed output
NOTHREADS    Requests a single-threaded mode for the computation
PLOTS=       Controls the plots that are produced through ODS Graphics
SINGULAR=    Specifies the singularity tolerance
THREADS=     Specifies the number of threads for the computation

You can specify the following options in the PROC ICPHREG statement.

ALPHA=number

specifies the $\alpha $ level for $100(1-\alpha )$% confidence limits. The number must be between 0 and 1; the default value is 0.05, which results in 95% intervals. This value is used as the default level for confidence limits that are computed by the BASELINE, HAZARDRATIO, and MODEL statements. You can override this default by specifying the ALPHA= option in these statements.

DATA=SAS-data-set

names the SAS data set that contains the data to be analyzed. If you omit this option, the procedure uses the most recently created SAS data set.

ITHISTORY

displays the iteration history for computing maximum likelihood estimates, the final evaluation of the gradient, and the final evaluation of the negative of the second derivative matrix (that is, the negative of the Hessian).
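As an illustration of the preceding options, the following statements sketch a call that requests 90% confidence limits and displays the iteration history. The data set name Work.Trial is hypothetical; the variable names Left, Right, and X1-X5 are borrowed from the ODS Graphics example later in this section.

```sas
/* Work.Trial is a hypothetical data set that is assumed to contain   */
/* the interval bounds Left and Right and the covariates X1-X5.       */
proc icphreg data=Work.Trial alpha=0.1 ithistory;
   /* ALPHA=0.1 requests 100(1-0.1)% = 90% confidence limits */
   model (Left, Right) = X1-X5;
run;
```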

NAMELEN=n

specifies the maximum length of effect names in tables and output data sets to be n characters, where n is a value between 20 and 200. By default, NAMELEN=20.

NLOPTIONS(options)

specifies options for the nonlinear optimization methods that are used for fitting the specified model. You can specify the following options:

ABSCONV=r
ABSTOL=r

specifies an absolute function convergence criterion by which minimization stops when $f(\bpsi ^{(k)}) \leq r $, where $\bpsi $ is the vector of parameters in the optimization and $f(\cdot )$ is the objective function. The default value of r is the negative square root of the largest double-precision value, which serves only as a protection against overflows.

ABSFCONV=r
ABSFTOL=r

specifies an absolute function difference convergence criterion. For all techniques except NMSIMP, termination requires a small change of the function value in successive iterations,

\[ |f(\bpsi ^{(k-1)}) - f(\bpsi ^{(k)})| \leq r \]

where $\bpsi $ denotes the vector of parameters that participate in the optimization and $f(\cdot )$ is the objective function. The same formula is used for the NMSIMP technique, but $\bpsi ^{(k)}$ is defined as the vertex that has the lowest function value, and $\bpsi ^{(k-1)}$ is defined as the vertex that has the highest function value in the simplex. By default, ABSFCONV=0.

ABSGCONV=r
ABSGTOL=r

specifies an absolute gradient convergence criterion. Termination requires the maximum absolute gradient element to be small,

\[ \max _ j |g_ j(\bpsi ^{(k)})| \leq r \]

where $\bpsi $ denotes the vector of parameters that participate in the optimization and $g_ j(\cdot )$ is the gradient of the objective function with respect to the jth parameter. This criterion is not used by the NMSIMP technique. The default value is r = 1E-5.
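As a worked illustration with made-up numbers, suppose that a three-parameter model has gradient elements $3\times 10^{-6}$, $-8\times 10^{-6}$, and $1\times 10^{-6}$ at iteration k. Under the default criterion, the check would be

\[ \max _ j |g_ j(\bpsi ^{(k)})| = \max \{ 3\times 10^{-6},\; 8\times 10^{-6},\; 1\times 10^{-6}\} = 8\times 10^{-6} \leq 10^{-5} \]

so the absolute gradient criterion is satisfied at this iteration.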

FCONV=r
FTOL=r

specifies a relative function convergence criterion. For all techniques except NMSIMP, termination requires a small relative change of the function value in successive iterations,

\[ \frac{|f(\bpsi ^{(k)}) - f(\bpsi ^{(k-1)})|}{|f(\bpsi ^{(k-1)})|} \leq r \]

where $\bpsi $ denotes the vector of parameters that participate in the optimization and $f(\cdot )$ is the objective function. The same formula is used for the NMSIMP technique, but $\bpsi ^{(k)}$ is defined as the vertex that has the lowest function value, and $\bpsi ^{(k-1)}$ is defined as the vertex that has the highest function value in the simplex. The default is r $=10^{-\mr{FDIGITS}}$, where FDIGITS is by default $-\log _{10}\{ \epsilon \} $ and $\epsilon $ is the machine precision.
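As a rough illustration of this default, assume IEEE double-precision arithmetic, where $\epsilon \approx 2.22\times 10^{-16}$ (this value is an assumption about the host, not a statement from this section). Then

\[ \mr{FDIGITS} = -\log _{10}(2.22\times 10^{-16}) \approx 15.65, \qquad r = 10^{-\mr{FDIGITS}} = \epsilon \approx 2.22\times 10^{-16} \]

In other words, under that assumption the default relative function criterion is met only when successive function values agree to about machine precision.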

GCONV=r
GTOL=r

specifies a relative gradient convergence criterion. For all techniques except CONGRA and NMSIMP, termination requires the normalized predicted function reduction to be small,

\[ \frac{\mb{g}(\bpsi ^{(k)})^\prime [\bH ^{(k)}]^{-1} \mb{g}(\bpsi ^{(k)})}{|f(\bpsi ^{(k)})| } \leq r \]

where $\bpsi $ denotes the vector of parameters that participate in the optimization, $f(\cdot )$ is the objective function, and $\mb{g}(\cdot )$ is the gradient. For the CONGRA technique (in which a reliable Hessian estimate $\bH $ is not available), the following criterion is used:

\[ \frac{\parallel \mb{g}(\bpsi ^{(k)}) \parallel _2^2 \quad \parallel \mb{g}(\bpsi ^{(k)}) \parallel _2}{\parallel \mb{g}(\bpsi ^{(k)}) - \mb{g}(\bpsi ^{(k-1)}) \parallel _2 |f(\bpsi ^{(k)})| } \leq r \]

This criterion is not used by the NMSIMP technique. The default value is r = 1E-8.
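As a worked illustration of the first criterion with made-up numbers, consider a one-parameter problem in which the gradient is $10^{-3}$, the Hessian is 50, and the objective function value is 200 at iteration k. The normalized predicted function reduction is then

\[ \frac{(10^{-3})^2 / 50}{|200|} = \frac{2\times 10^{-8}}{200} = 10^{-10} \leq 10^{-8} \]

so the default relative gradient criterion would be satisfied.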

MAXFUNC=n
MAXFU=n

specifies the maximum number of function calls in the optimization process. The default values are as follows, depending on the optimization technique (which you can specify in the TECHNIQUE= option):

  • TRUREG, NRRIDG, and NEWRAP: 125

  • QUANEW and DBLDOG: 500

  • CONGRA: 1000

  • NMSIMP: 3000

The optimization can terminate only after completing a full iteration. Therefore, the number of function calls that are actually performed can exceed n.

MAXITER=n
MAXIT=n

specifies the maximum number of iterations in the optimization process. The default values are as follows, depending on the optimization technique (which you can specify in the TECHNIQUE= option):

  • TRUREG, NRRIDG, and NEWRAP: 50

  • QUANEW and DBLDOG: 200

  • CONGRA: 400

  • NMSIMP: 1000

These default values also apply when n is specified as a missing value.

MAXTIME=r

specifies an upper limit of r seconds of CPU time for the optimization process. The time is checked only at the end of each iteration. Therefore, the actual run time might be longer than r. By default, CPU time is not limited.

MINITER=n
MINIT=n

specifies the minimum number of iterations. If you request more iterations than are actually needed for convergence to a stationary point, the optimization algorithms can behave unpredictably. For example, rounding errors can prevent the algorithm from continuing for the required number of iterations. By default, MINITER=0.

TECHNIQUE=keyword

specifies the optimization technique to obtain maximum likelihood estimates. You can choose from the following techniques:

CONGRA

performs a conjugate-gradient optimization.

DBLDOG

performs a version of double-dogleg optimization.

NEWRAP

performs a Newton-Raphson optimization that combines a line-search algorithm with ridging.

NMSIMP

performs a Nelder-Mead simplex optimization.

NONE

performs no optimization.

NRRIDG

performs a Newton-Raphson optimization with ridging.

QUANEW

performs a dual quasi-Newton optimization.

TRUREG

performs a trust-region optimization.

By default, TECHNIQUE=NEWRAP.

For more information about these optimization methods, see the section Choosing an Optimization Algorithm in Chapter 19: Shared Concepts and Topics.
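The following statements are a minimal sketch of how several NLOPTIONS suboptions might be combined. The sketch assumes that suboptions are separated by blanks inside the parentheses; the data set Work.Trial is hypothetical, and the particular values are chosen only for illustration.

```sas
/* Hypothetical data set Work.Trial; values are illustrative only.    */
/* TECHNIQUE=NRRIDG replaces the default NEWRAP technique,            */
/* MAXITER=100 raises the iteration limit from the default of 50 for  */
/* NRRIDG, and GCONV=1E-10 tightens the relative gradient criterion   */
/* from its default of 1E-8.                                          */
proc icphreg data=Work.Trial
             nloptions(technique=nrridg maxiter=100 gconv=1e-10)
             ithistory;
   model (Left, Right) = X1-X5;
run;
```

Adding ITHISTORY, as in this sketch, is one way to see how the chosen technique and limits affect the iterations.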

NOPRINT

suppresses all displayed output. This option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 20: Using the Output Delivery System.

NOTHREADS

forces single-threaded execution of the analytic computations. This option overrides the SAS system option THREADS | NOTHREADS. Specifying this option is equivalent to specifying the THREADS=1 option.

PLOTS<(global-plot-options)> = plot-request
PLOTS<(global-plot-options)> = (plot-request <…<plot-request>>)

specifies plots to be created using ODS Graphics. You can request plots of survival functions and cumulative hazard functions. Also, many of the observation statistics in the output data set can be plotted using this option. You are not required to create an output data set in order to produce a plot. When you specify only one plot-request, you can omit the parentheses around it.

You can specify the following global-plot-options:

CL

displays the pointwise confidence limits for the plot.

OVERLAY <=overlay-option>

specifies how to overlay the functions that are plotted for the covariate sets. You can specify the following overlay-options:

BYGROUP
GROUP

overlays onto the same plot all functions that are plotted for the covariate sets and have the same GROUP= value in the COVARIATES= data set.

INDIVIDUAL
IND

displays a separate plot for each covariate set.

By default, OVERLAY=BYGROUP if the GROUP= option is specified in the BASELINE statement or if the COVARIATES= data set contains the _GROUP_ variable; otherwise, by default, OVERLAY=INDIVIDUAL.

UNPACK

displays multiple plots individually. The default is to display related multiple plots in a panel. The UNPACK option works for INTERVAL, RESDEV, and RESLAG plots only. See the section OUTPUT Statement for definitions of the statistics specified with these plot-requests.

TIMERANGE=(<min> <,max>)
TIMERANGE=<min> <,max>
RANGE=(<min> <,max>)
RANGE=<min> <,max>

specifies the range of values on the time axis to clip the display. The min and max values are the lower and upper bounds of the range. By default, min is 0 and max is the largest boundary value.
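As a sketch of how global-plot-options might be combined, the following statements request pointwise confidence limits and clip the time axis. The sketch assumes that multiple global-plot-options are separated by blanks; the data set Work.Trial and the range 0 to 30 are hypothetical.

```sas
ods graphics on;

/* Hypothetical data set Work.Trial. CL adds pointwise confidence     */
/* limits, and TIMERANGE=(0,30) clips the time axis to [0, 30].       */
proc icphreg data=Work.Trial plots(cl timerange=(0,30))=(survival cumhaz);
   model (Left, Right) = X1-X5;
run;
```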

You can specify the following plot-requests:

CUMHAZ

plots the estimated cumulative hazard function for each set of covariates in the data set that you specify in the COVARIATES= option in the BASELINE statement. If the COVARIATES= data set is not specified, the estimated cumulative hazard function is plotted for the reference set of covariates, which consists of reference levels for the CLASS variables and average values for the continuous variables.

HAZARD

plots the estimated hazard function for each set of covariates in the data set that you specify in the COVARIATES= option in the BASELINE statement. If the COVARIATES= data set is not specified, the estimated hazard function is plotted for the reference set of covariates, which consists of reference levels for the CLASS variables and average values for the continuous variables.

INTERVAL

plots the observed interval length as a function of observation number.

NONE

suppresses all the plots in the procedure. Specifying this option is equivalent to disabling ODS Graphics for the entire procedure.

RESDEV<(options)>

plots deviance residuals. You can specify the following options:

INDEX

plots deviance residuals as a function of the observation number.

XBETA

plots deviance residuals as a function of the linear predictor.

If you do not specify an option, deviance residuals are plotted as a function of the observation number.

RESLAG<(options)>

plots Lagakos residuals. You can specify the following options:

INDEX

plots Lagakos residuals as a function of the observation number.

XBETA

plots Lagakos residuals as a function of the linear predictor.

If you do not specify an option, Lagakos residuals are plotted as a function of the observation number.

SURVIVAL
S
SURV
SUR

plots the estimated survival function for each set of covariates in the data set that is specified in the COVARIATES= option in the BASELINE statement. If the COVARIATES= data set is not specified, the estimated survival function is plotted for the reference set of covariates, which consists of reference levels for the CLASS variables and average values for the continuous variables.

Each observation in the data set that is specified in the COVARIATES= option in the BASELINE statement provides a set of covariates for which a plot is produced for each plot-request. You can use the ROWID= option in the BASELINE statement to specify a variable in the COVARIATES= data set for identifying the functions that are plotted for the covariate sets. If the ROWID= option is not specified, the plots are identified by the covariate values if there is only a single covariate or by the observation numbers of the COVARIATES= data set if the model has two or more covariates. If the COVARIATES= data set is not specified, a reference set of covariates that consists of the reference levels for the CLASS variables and the average values for the continuous variables is used. When plotting more than one function, you can use the OVERLAY= option to group the functions. When you specify only one plot-request, you can omit the parentheses around the plot request. Here are some examples:

plots=survival
plots=(survival cumhaz)

ODS Graphics must be enabled before plots can be requested. For example:

ods graphics on;
proc icphreg plots(cl)=survival;
   model (Left, Right)=X1-X5;
   baseline covariates=One;
run;

For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 21: Statistical Graphics Using ODS.
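The following statements sketch a request that combines function plots with residual plots. The data sets Work.Trial and Work.Profiles and the identifying variable Id are hypothetical, and the sketch assumes that a single suboption is written in parentheses directly after RESDEV or RESLAG.

```sas
ods graphics on;

/* Work.Trial, Work.Profiles, and Id are hypothetical names.          */
/* OVERLAY=INDIVIDUAL displays a separate survival plot for each row  */
/* of the COVARIATES= data set; ROWID=Id labels each plotted function.*/
proc icphreg data=Work.Trial
             plots(overlay=individual)=(survival resdev(xbeta) reslag(index));
   model (Left, Right) = X1-X5;
   baseline covariates=Work.Profiles rowid=Id;
run;
```

Here the deviance residuals are plotted against the linear predictor and the Lagakos residuals against the observation number.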

SINGULAR=number
EPSILON=number

specifies the tolerance for testing the singularity of the $\mb{Z}^\prime \mb{Z}$ matrix that is formed from the design matrix $\mb{Z}$ and for testing the singularity of the Hessian matrix upon convergence of the optimization algorithm. Approximately, the test requires that a pivot be at least this number times the original diagonal value. By default, number is $10^7$ times the machine epsilon. On most machines, the default number is approximately $10^{-9}$.
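As a rough check of the default, assume IEEE double-precision arithmetic with a machine epsilon of about $2.22\times 10^{-16}$ (an assumption about the host, not a value stated in this section). Then

\[ 10^{7} \times 2.22\times 10^{-16} \approx 2.2\times 10^{-9} \]

which is on the order of $10^{-9}$.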

THREADS=n
NTHREADS=n

specifies the number of threads for analytic computations and overrides the SAS system option THREADS | NOTHREADS. If you do not specify the THREADS= option or if you specify THREADS=0, the number of threads is determined based on the data size and the number of CPUs on the host on which the analytic computations execute.
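For illustration, the following statements sketch an explicit request for four threads. The data set Work.Trial and the thread count are hypothetical; whether four threads helps depends on the host and the data size.

```sas
/* Hypothetical data set Work.Trial; THREADS=4 requests four threads  */
/* for the analytic computations. Specifying THREADS=1 (or NOTHREADS) */
/* would force single-threaded execution instead.                     */
proc icphreg data=Work.Trial threads=4;
   model (Left, Right) = X1-X5;
run;
```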