The HPMIXED Procedure

PROC HPMIXED Statement

  • PROC HPMIXED <options>;

The PROC HPMIXED statement invokes the HPMIXED procedure. Table 55.2 summarizes the options available in the PROC HPMIXED statement. These and other options in the PROC HPMIXED statement are then described fully in alphabetical order.

Table 55.2: PROC HPMIXED Statement Options

Option          Description
--------------  ----------------------------------------------------------------

Basic Options
DATA=           Specifies input data set
METHOD=         Specifies the estimation method
NOPROFILE       Includes scale parameter in optimization
ORDER=          Determines the sort order of CLASS variables
BLUP            Computes BLUP/BLUE only

Displayed Output
IC=             Displays a table of information criteria
ITDETAILS       Displays estimates and gradients added to "Iteration History"
MAXCLPRINT=     Specifies the maximum levels of CLASS variables to print
LOGNOTE         Writes periodic status notes to the log
MMEQ            Displays mixed model equations
NOCLPRINT       Suppresses "Class Level Information" completely or in parts
NOITPRINT       Suppresses "Iteration History" table
RANKS           Displays a table of ranks of matrices $\bX $, ($\bX \bZ $), and $\mb{MMEQ}$
SIMPLE          Displays "Descriptive Statistics" table

Singularity Tolerances
SINGCHOL=       Tunes singularity for Cholesky decompositions
SINGRES=        Tunes singularity for the residual variance
SINGULAR=       Tunes general singularity criterion


You can specify the following options.

BLUP<(suboptions)>=SAS-data-set

creates a data set that contains the BLUE and BLUP solutions. The covariance parameters are assumed to be known and are taken from the PARMS statement. All hypothesis testing is skipped, and the TEST, ESTIMATE, CONTRAST, LSMEANS, and OUTPUT statements are ignored. This option is designed for users who need BLUP solutions for random effects with many levels, up to tens of millions.

You can specify the following suboptions:

ITPRINT=number

specifies that the iteration history be displayed once every number iterations. This suboption applies only to the iterative solving methods (IOC and IOD). The default value is 10, which means that the procedure displays the iteration history every 10 iterations.

MAXITER=number

specifies the maximum number of iterations allowed. This suboption applies only to the iterative solving methods (IOC and IOD). The default value is the number of parameters in the BLUE/BLUP plus two.

METHOD=DIRECT | IOC | IOD

specifies the method used to solve for the BLUP solutions. The default method is IOC. METHOD=DIRECT stores the mixed model equations (MMEQ) in memory and computes their Cholesky decomposition. This method is the most accurate, but it is the least efficient in terms of speed and memory. METHOD=IOC also stores the mixed model equations in memory, but it iterates on them to solve for the solutions. This method is the most efficient in terms of speed. METHOD=IOD does not build the mixed model equations; instead, it iterates on the data to solve for the solutions. This method is the most efficient in terms of memory.

TOL=number

specifies the tolerance value. This suboption applies only for iterative solving methods (IOC or IOD). The default value is the square root of machine precision.
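The following is a minimal sketch of how the BLUP option and its suboptions might be combined. The data set Work.Trial, the variables Yield, Herd, and Sire, the output data set Work.BlupSol, and the variance values in the PARMS statement are all hypothetical, and the assumption that suboptions are separated by blanks inside the parentheses is not confirmed by the text above.

```sas
/* Hypothetical data set and variables; values are for illustration only. */
proc hpmixed data=Work.Trial
             blup(method=iod maxiter=500 tol=1e-8 itprint=25)=Work.BlupSol;
   class Herd Sire;
   model Yield = Herd;
   random Sire;
   /* Covariance parameters treated as known:               */
   /* first the Sire variance, then the residual variance.  */
   parms (0.25) (1.0);
run;
```

METHOD=IOD is chosen here on the assumption that memory is the limiting resource; with enough memory, the default METHOD=IOC would be the faster choice according to the description above.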

DATA=SAS-data-set

names the SAS data set to be used by PROC HPMIXED. The default is the most recently created data set.

INFOCRIT=NONE | PQ | Q
IC=NONE | PQ | Q

determines the computation of information criteria in the "Fit Statistics" table. The criteria are all in smaller-is-better form, and are described in Table 55.3.

Table 55.3: Information Criteria

Criterion  Formula                                                  Reference
---------  -------------------------------------------------------  ---------------------------
AIC        $-2\ell + 2d$                                            Akaike (1974)
AICC       $-2\ell + 2d n^*/(n^*-d-1) \mbox{ for } n^* \ge d+2$     Hurvich and Tsai (1989) and
           $-2\ell + 2d (d+2) \mbox{ for } n^* < d+2$               Burnham and Anderson (1998)
HQIC       $-2\ell + 2d \log (\log (n))$ for $n>1$                  Hannan and Quinn (1979)
BIC        $-2\ell + d \log (n)$ for $n>0$                          Schwarz (1978)
CAIC       $-2\ell + d(\log (n) + 1)$ for $n>0$                     Bozdogan (1987)


Here $\ell $ denotes the maximum value of the restricted log likelihood, d is the dimension of the model, and n and $n^*$ reflect the size of the data. When $n \le 1$, the value of the HQIC criterion is $-2\ell $. When n=0, the values of the BIC and CAIC criteria are undefined.

The quantities d, n, and $n^*$ depend on the model and on the IC= option.

  • models without random effects: The IC=Q and IC=PQ options have no effect on the computation.

    • d equals the number of parameters in the optimization whose solutions neither fall on the boundary nor are otherwise constrained.

    • n equals the number of used observations minus rank(X).

    • $n^*$ equals n, unless n < d + 2, in which case $n^* = d+2$.

  • models with random effects:

    • d equals the number of parameters in the optimization whose solutions neither fall on the boundary nor are otherwise constrained. If IC=PQ, this value is incremented by $\mbox{rank}(\bX )$.

    • n equals the effective number of subjects as displayed in the "Dimensions" table, unless this value equals 1, in which case n equals the number of levels of the first random effect specified. The IC=Q and IC=PQ options have no effect.

    • $n^*$ equals n, unless n < d + 2, in which case $n^* = d+2$. The IC=Q and IC=PQ options have no effect.

The IC=NONE option suppresses the "Fit Statistics" table. IC=Q is the default.
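As a numeric illustration of the formulas in Table 55.3, suppose that $-2\ell = 200$, d = 3, and $n = n^* = 20$. These values are invented for the example and do not come from any fitted model; the calculation also assumes that $\log $ denotes the natural logarithm. Because $n^* \ge d+2$, the first AICC formula applies:

\[ \begin{array}{lll} \mbox{AIC} & = 200 + 2(3) & = 206 \\ \mbox{AICC} & = 200 + 2(3)(20)/(20-3-1) & = 207.5 \\ \mbox{HQIC} & = 200 + 2(3) \log (\log (20)) & \approx 206.58 \\ \mbox{BIC} & = 200 + 3 \log (20) & \approx 208.99 \\ \mbox{CAIC} & = 200 + 3(\log (20) + 1) & \approx 211.99 \end{array} \]

In this sketch the penalty grows from AIC to CAIC, so the criteria that use $\log (n)$ favor smaller models more strongly as n increases.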

ITDETAILS

displays the parameter values at each iteration and enables the writing of notes to the SAS log pertaining to "infinite likelihood" and "singularities" during optimization iterations.

LOGNOTE

writes to the log periodic notes that describe the current status of computations. This option is designed for use with analyses that require extensive CPU resources.

MAXCLPRINT=number

specifies the maximum levels of CLASS variables to print in the ODS table ClassLevels. The default value is 20. MAXCLPRINT=0 enables you to print all levels of each CLASS variable. However, the option NOCLPRINT takes precedence over MAXCLPRINT.

METHOD=

specifies the estimation method for the covariance parameters. The REML specification performs residual (restricted) maximum likelihood, and it is currently the only available method. This option is therefore currently redundant for PROC HPMIXED, but it is included for consistency with other mixed model procedures in SAS/STAT software.

MMEQ

displays coefficients of the mixed model equations. These are

\[ \left[\begin{array}{lr} \bX '\widehat{\bR }^{-1}\bX & \bX '\widehat{\bR }^{-1}\bZ \\ \bZ '\widehat{\bR }^{-1}\bX & \bZ '\widehat{\bR }^{-1} \bZ + \widehat{\bG }^{-1} \end{array}\right] \left[\begin{array}{c} \bX '\widehat{\bR }^{-1}\mb{y} \\ \bZ '\widehat{\bR }^{-1}\mb{y} \end{array} \right] \]

assuming $\widehat{\bG }$ is nonsingular. If $\widehat{\bG }$ is singular, PROC HPMIXED produces the following coefficients

\[ \left[\begin{array}{lr} \bX '\widehat{\bR }^{-1}\bX & \bX '\widehat{\bR }^{-1}\bZ \widehat{\bG } \\ \widehat{\bG } \bZ '\widehat{\bR }^{-1}\bX & \widehat{\bG } \bZ '\widehat{\bR }^{-1}\bZ \widehat{\bG } + \widehat{\bG } \end{array}\right] \left[\begin{array}{c} \bX '\widehat{\bR }^{-1}\mb{y} \\ \widehat{\bG } \bZ '\widehat{\bR }^{-1}\mb{y} \end{array} \right] \]

See the section "Model and Assumptions" for further information about these equations.

NAMELEN=number

specifies the length to which long effect names are shortened. The default and minimum value is 20.

NLPRINT

requests that optimization-related output options specified in the NLOPTIONS statement override corresponding options in the PROC HPMIXED statement. When you specify NLPRINT, the ITDETAILS and NOITPRINT options in the PROC HPMIXED statement are ignored and the following six options in the NLOPTIONS statement are enabled: NOPRINT, PHISTORY, PSUMMARY, PALL, PLONG, and PHISTPARMS.

The syntax and options of the NLOPTIONS statement are described in the section NLOPTIONS Statement in Chapter 19: Shared Concepts and Topics.
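A minimal sketch of NLPRINT used together with the NLOPTIONS statement follows. The data set Work.Trial and the variables Yield, Herd, and Sire are hypothetical, and the MAXITER= value is an arbitrary choice for illustration.

```sas
/* Hypothetical data set and variables. */
proc hpmixed data=Work.Trial nlprint;
   class Herd Sire;
   model Yield = Herd;
   random Sire;
   /* With NLPRINT in effect, PHISTORY controls the iteration output; */
   /* ITDETAILS and NOITPRINT in the PROC statement would be ignored. */
   nloptions maxiter=100 phistory;
run;
```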

NOCLPRINT<=number>

suppresses the display of the "Class Level Information" table if you do not specify number. If you do specify number, only levels with totals that are less than number are listed in the table.

NOFIT

suppresses fitting of the model. When the NOFIT option is in effect, PROC HPMIXED produces the "Model Information," "Class Level Information," "Number of Observations," "Dimensions," and "Descriptive Statistics" tables. These can be helpful in gauging the computational effort required to fit the model.
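As a sketch of how NOFIT might be used to size up a model before committing to a long run, the following step requests only the preliminary tables. The data set Work.Trial and the variables Yield, Herd, Age, and Sire are hypothetical.

```sas
/* Hypothetical data set and variables; no model is fit. */
proc hpmixed data=Work.Trial nofit simple;
   class Herd Sire;
   model Yield = Herd Age;
   random Sire;
run;
```

The "Dimensions" table from such a run shows how many columns the design matrices have, which gives a rough indication of the memory and time that the full fit is likely to need.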

NOINFO

suppresses the display of the "Model Information," "Number of Observations," and "Dimensions" tables.

NOITPRINT

suppresses the display of the "Iteration History" table.

NOPRINT

suppresses the normal display of results. The NOPRINT option is useful when you want only to create one or more output data sets with the procedure by using the OUTPUT statement. Note that this option temporarily disables the Output Delivery System (ODS); see Chapter 20: Using the Output Delivery System, for more information.

NOPROFILE

includes the residual variance as one of the covariance parameters in the optimization iterations. This option applies only to models that have a residual variance parameter. By default, this parameter is profiled out of the optimization iterations, except when you have specified the HOLD= option in the PARMS statement.
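The following sketch shows NOPROFILE on a simple variance component model. The data set Work.Trial and the variables Yield, Herd, and Sire are hypothetical.

```sas
/* Hypothetical data set and variables.                           */
/* The residual variance is optimized together with the Sire      */
/* variance instead of being profiled out of the likelihood.      */
proc hpmixed data=Work.Trial noprofile;
   class Herd Sire;
   model Yield = Herd;
   random Sire;
run;
```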

ORDER=DATA | FORMATTED | FREQ | INTERNAL

specifies the sort order for the levels of the classification variables (which are specified in the CLASS statement).

This option applies to the levels for all classification variables, except when you use the (default) ORDER=FORMATTED option with numeric classification variables that have no explicit format. In that case, the levels of such variables are ordered by their internal value.

The ORDER= option can take the following values:

Value of ORDER=  Levels Sorted By
---------------  ----------------------------------------------------------------
DATA             Order of appearance in the input data set
FORMATTED        External formatted value, except for numeric variables with no
                 explicit format, which are sorted by their unformatted (internal) value
FREQ             Descending frequency count; levels with the most observations
                 come first in the order
INTERNAL         Unformatted value

By default, ORDER=FORMATTED. For ORDER=FORMATTED and ORDER=INTERNAL, the sort order is machine-dependent.

For more information about sort order, see the chapter on the SORT procedure in the Base SAS Procedures Guide and the discussion of BY-group processing in SAS Language Reference: Concepts.
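The effect of ORDER= is easiest to see with a formatted numeric CLASS variable. In the following sketch, the format DoseFmt, the data set Work.Trial, and the variables Yield, Dose, and Sire are all hypothetical.

```sas
/* Hypothetical format that labels numeric dose codes. */
proc format;
   value DoseFmt 1='Low' 2='Medium' 3='High';
run;

/* ORDER=INTERNAL sorts Dose by its stored values 1, 2, 3,      */
/* so the levels appear as Low, Medium, High.                   */
proc hpmixed data=Work.Trial order=internal;
   class Dose Sire;
   format Dose DoseFmt.;
   model Yield = Dose;
   random Sire;
run;
```

Under the default ORDER=FORMATTED, the same levels would instead be sorted by their formatted values, which places High before Low and Medium.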

RANKS

displays the ranks of design matrices $\bX $ and ($\bX \bZ $) and the coefficient matrix of the mixed model equations ($\mb{MMEQ}$).
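A sketch of requesting the rank table together with the mixed model equations follows. The data set Work.Small and the variables Yield, Herd, and Sire are hypothetical; a small data set is assumed because the MMEQ display grows with the number of fixed and random effect levels.

```sas
/* Hypothetical small data set and variables. */
proc hpmixed data=Work.Small ranks mmeq;
   class Herd Sire;
   model Yield = Herd;
   random Sire;
run;
```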

SIMPLE

displays the mean, standard deviation, coefficient of variation, minimum, and maximum for each variable used in PROC HPMIXED that is not a classification variable.

SINGCHOL=number

tunes the singularity criterion in Cholesky decompositions. The default is 1E6 times the machine epsilon; this product is approximately 1E-10 on most computers.

SINGRES=number

sets the tolerance below which the residual variance is considered to be zero. The default is 1E4 times the machine epsilon; this product is approximately 1E-12 on most computers.

SINGULAR=number

tunes the general singularity criterion applied by the HPMIXED procedure in divisions and inversions. The default is 1E4 times the machine epsilon; this product is approximately 1E-12 on most computers.
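To see what the default tolerances amount to on a particular machine, the multiples of machine epsilon described for SINGCHOL=, SINGRES=, and SINGULAR= can be evaluated in a DATA step. This is only a sketch: it assumes that the machine epsilon used by PROC HPMIXED matches the value returned by the CONSTANT function with the 'MACEPS' argument, which the text above does not confirm.

```sas
/* Writes the machine epsilon and the default tolerance multiples to the log. */
data _null_;
   eps      = constant('maceps');  /* machine epsilon for this host          */
   singchol = 1e6 * eps;           /* default multiple stated for SINGCHOL=  */
   singres  = 1e4 * eps;           /* default multiple stated for SINGRES=   */
   singular = 1e4 * eps;           /* default multiple stated for SINGULAR=  */
   put eps= singchol= singres= singular=;
run;
```

On hardware that uses IEEE double-precision arithmetic, the machine epsilon is on the order of 1E-16, which is consistent with the approximate values quoted above.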

UPDATE

is an alias for the LOGNOTE option.