Usage Note 37047: How to fix an Out of Memory condition or reduce execution time in PROC MIXED or PROC GLIMMIX

The MIXED and GLIMMIX procedures are computationally intensive and execution times can be long. A model may be resource intensive (requiring a large amount of memory or time) if the input data set is large, if the CLASS variables have large numbers of levels, or if certain options are specified in the PROC, MODEL, RANDOM, or REPEATED statements.

If you have a model that encounters an out of memory error or takes too long to run, the following suggestions may be helpful.

Changes to the running environment

To maximize available memory on your system, close all unnecessary applications when running your program.
If your program generates a large number of results tables (for example, BY processing with many BY groups, or a macro that runs the procedure many times), use the ODS NORESULTS; statement to prevent tracking of output objects in the Results window.
Submit programs in batch mode rather than interactively.
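As a minimal sketch of the ODS NORESULTS suggestion, the step below turns off Results window tracking around a BY-group analysis and turns it back on afterward. The data set WORK.MYDATA and the variables GRP, A, B, and Y are hypothetical placeholders, and the data are assumed to be sorted by GRP.

```
   ods noresults;                 /* stop tracking output objects in the Results window */

   proc mixed data=work.mydata;   /* hypothetical data set, assumed sorted by grp */
     by grp;
     class a b;
     model y=a;
     random intercept / subject=b;
     run;

   ods results;                   /* restore tracking for later steps */
```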

Processing by subjects

When one or more random effects have many levels (say, 1000 or more), the computations can become resource intensive. The following code may take a long time to run or cause an out of memory error when variable B has many levels.
```
   proc mixed;
     class a b;
     model y=a;
     random b;
     run;
```
Below are some alternative specifications of the model which are statistically equivalent but numerically more efficient.
1. Using the SUBJECT= option allows the procedure to process the model by subjects, which typically takes less time and memory. For example:
```
   proc mixed;
     class a b;
     model y=a;
     random intercept / subject=b;
     run;
```
2. If the variable B is a numeric variable, or if it can be easily recoded as a numeric variable, then you can further improve the efficiency of the above model by sorting your data by the subject variable, B, and removing B from the CLASS statement. For example:
```
   proc sort;
     by b;
     run;

   proc mixed;
     class a;
     model y=a;
     random intercept / subject=b;
     run;
```
3. Alternatively, if an equivalent model can be specified with the REPEATED statement instead of the RANDOM statement, consider doing so. The REPEATED statement is less memory intensive than the RANDOM statement, especially when there are many levels of the SUBJECT= effect. For example, you can rewrite the above model using an equivalent REPEATED statement as follows:
```
   proc mixed;
     class a b;
     model y=a;
     repeated / subject=b type=cs;
     run;
```
4. As with the SUBJECT= option in the RANDOM statement, you can further improve the efficiency of the above model by sorting your data by the subject variable, B, and removing B from the CLASS statement:
```
   proc sort;
     by b;
     run;

   proc mixed;
     class a;
     model y=a;
     repeated / subject=b type=cs;
     run;
```
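One way to see how much these specifications matter on your own system is to time them on simulated data. The sketch below is only an illustration: the data set WORK.SIM, its dimensions (2000 subjects, 4 observations each), the seed, and the effect sizes are all assumptions chosen for the example. The FULLSTIMER system option writes memory and time statistics for each step to the SAS log so that the two fits can be compared.

```
   /* Hypothetical test data: 2000 levels of b, 2 levels of a, 2 replicates */
   data work.sim;
     call streaminit(37047);
     do b = 1 to 2000;
       u = rand('normal', 0, 1);            /* random effect for subject b */
       do a = 1 to 2;
         do rep = 1 to 2;
           y = 10 + 0.5*a + u + rand('normal', 0, 1);
           output;
         end;
       end;
     end;
     drop u rep;
     run;

   options fullstimer;                      /* report memory and time in the log */

   /* Original specification: b in CLASS and RANDOM */
   proc mixed data=work.sim;
     class a b;
     model y=a;
     random b;
     run;

   /* By-subject specification: data already sorted by b, b not in CLASS */
   proc mixed data=work.sim;
     class a;
     model y=a;
     random intercept / subject=b;
     run;
```

The two steps should report the same covariance parameter estimates; the log shows how their resource use compares.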
If you have more than one random effect, and one effect is common to all the effects in the RANDOM statement, you can "factor out" that common effect and specify it as the SUBJECT= effect. This creates a block-diagonal G matrix, which allows PROC MIXED and PROC GLIMMIX to process the model by subjects. Processing by subjects is typically faster and requires less memory. For example, the first GLIMMIX step below is less efficient than the second GLIMMIX step. Since C appears in all effects in the first RANDOM statement, it can be factored out and used as the SUBJECT= effect in the second RANDOM statement.
```
   proc glimmix;
     class a b c;
     model y=a b / ddfm=satterth;
     random c a*c b*c;
     run;
   
   proc glimmix;
     class a b c;
     model y=a b / ddfm=satterth;
     random int a b/subject=c;
     run;
```
The data processing and estimation in the MIXED or GLIMMIX procedures are a little more complicated when you have multiple RANDOM statements. Both procedures will process the model by subjects if each RANDOM statement has a SUBJECT= effect and if the SUBJECT= effects are nested within each other. If possible, use the same or nested SUBJECT= effects in all RANDOM and REPEATED statements. For more information, see "Processing by Subjects" in the Details section of the PROC GLIMMIX documentation.
Multiple subject variables are often encountered in hierarchical linear models (HLMs). This usage note discusses related information about efficient specification of HLMs.
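A minimal sketch of nested SUBJECT= effects across two RANDOM statements follows. The data set WORK.HLM and the variables SCHOOL, CLASSROOM, X, and Y are hypothetical; the example assumes observations within classrooms and classrooms within schools.

```
   proc glimmix data=work.hlm;
     class school classroom;
     model y=x;
     random intercept / subject=school;             /* level-3 random intercept */
     random intercept / subject=classroom(school);  /* level-2, nested in school */
     run;
```

Because CLASSROOM(SCHOOL) is nested within SCHOOL, each RANDOM statement has a SUBJECT= effect and the effects are nested, which is the condition described above for processing by subjects.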

Choosing options that can affect efficiency

The DDFM= option in the MODEL statement specifies the method for computing the denominator degrees of freedom, and some methods are resource intensive. DDFM=RESIDUAL or DDFM=BW can reduce the amount of memory required. DDFM=KR is more memory intensive than DDFM=SATTERTH. DDFM=SATTERTH and DDFM=BW can be faster than DDFM=CONTAINMENT.
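For illustration, the step below requests the between-within method, one of the less demanding choices. The data set WORK.MYDATA and its variables are hypothetical placeholders. Keep in mind that the degrees-of-freedom method also affects the tests of fixed effects, so the choice is statistical as well as computational.

```
   proc mixed data=work.mydata;
     class a b;
     model y=a / ddfm=bw;            /* between-within degrees of freedom */
     random intercept / subject=b;
     run;
```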
The different types of covariance structures available in the REPEATED and the RANDOM statements can affect the memory requirements for estimating a model. Generally speaking, more complex covariance structures such as TYPE=UN or the spatial structures are more resource intensive than the simpler covariance structures. You may want to use a simpler covariance structure to reduce the memory or time requirements. In particular:
- The TYPE=UN option in the REPEATED statement in PROC MIXED or the RANDOM _RESIDUAL_ statement in PROC GLIMMIX may not be appropriate if subjects have many repeated measurements. This covariance structure requires estimation of a large number of parameters and may require excessive memory. You may want to consider simpler alternative structures such as TYPE=TOEP.
- The TYPE=UN option in the RANDOM statement in PROC MIXED or in PROC GLIMMIX may not be appropriate for a random coefficients model with many covariates in the RANDOM statement. Again, the TYPE=UN structure requires estimation of a large number of parameters and may require excessive memory. You may want to consider the simpler TYPE=VC structure.
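The two substitutions above can be sketched as follows. The data set WORK.LONG and the variables TRT, SUBJ, TIME, X1, X2, and Y are hypothetical. With t repeated measurements per subject, TYPE=UN estimates t(t+1)/2 parameters while TYPE=TOEP estimates t, so 10 time points mean 55 parameters versus 10.

```
   /* Repeated measures: Toeplitz structure instead of TYPE=UN */
   proc mixed data=work.long;
     class trt subj time;
     model y=trt time trt*time;
     repeated time / subject=subj type=toep;
     run;

   /* Random coefficients: variance components instead of TYPE=UN */
   proc mixed data=work.long;
     class trt subj;
     model y=trt x1 x2;
     random intercept x1 x2 / subject=subj type=vc;
     run;
```

Whether a simpler structure is adequate for the data is a modeling question; these steps only show the syntax.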
Use of the NOBOUND option on the PROC MIXED or PARMS statement requires more resources because of changes in the computational algorithm when there are negative variance components.
The NOCLPRINT option in the PROC MIXED or PROC GLIMMIX statement suppresses the Class Level Information table and can help reduce memory and execution time.
The SOLUTION, OUTP=, OUTPM=, and INFLUENCE options in PROC MIXED and the SOLUTION option and the OUTPUT statement in PROC GLIMMIX require additional resources. ODS Graphics can also be resource intensive. Therefore, you may want to fit your model without any of these options first.
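A lean first fit along these lines might look like the sketch below, again using the hypothetical data set WORK.MYDATA. Once the model runs, the omitted options can be added back one at a time.

```
   ods graphics off;                        /* no plots during the first fit */

   proc mixed data=work.mydata noclprint;   /* suppress Class Level Information */
     class a b;
     model y=a;                             /* no SOLUTION, OUTP=, OUTPM=, INFLUENCE */
     random intercept / subject=b;          /* no SOLUTION option here either */
     run;
```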

Choosing an alternative procedure

For linear mixed models with thousands of levels for the fixed and/or random effects, or for linear mixed models with hierarchically nested fixed and/or random effects with hundreds or thousands of levels at each level of the hierarchy, the experimental procedure PROC HPMIXED may be used. For more information about this procedure, see the PROC HPMIXED documentation.
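As a rough sketch, a basic variance component model can be moved to PROC HPMIXED with few changes, since its CLASS, MODEL, and RANDOM statements resemble those of PROC MIXED. The data set WORK.MYDATA and its variables are hypothetical, and not every PROC MIXED option is available in PROC HPMIXED, so check the documentation before converting a more complex model.

```
   proc hpmixed data=work.mydata;
     class a b;
     model y=a;
     random b;
     run;
```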

Operating System and Release Information

Product Family	Product	System	SAS Release Reported	SAS Release Fixed*
SAS System	SAS/STAT	z/OS
		OpenVMS VAX
		Microsoft® Windows® for 64-Bit Itanium-based Systems
		Microsoft Windows Server 2003 Datacenter 64-bit Edition
		Microsoft Windows Server 2003 Enterprise 64-bit Edition
		Microsoft Windows XP 64-bit Edition
		Microsoft® Windows® for x64
		OS/2
		Microsoft Windows 95/98
		Microsoft Windows 2000 Advanced Server
		Microsoft Windows 2000 Datacenter Server
		Microsoft Windows 2000 Server
		Microsoft Windows 2000 Professional
		Microsoft Windows NT Workstation
		Microsoft Windows Server 2003 Datacenter Edition
		Microsoft Windows Server 2003 Enterprise Edition
		Microsoft Windows Server 2003 Standard Edition
		Microsoft Windows Server 2008
		Microsoft Windows XP Professional
		Windows Millennium Edition (Me)
		Windows Vista
		64-bit Enabled AIX
		64-bit Enabled HP-UX
		64-bit Enabled Solaris
		ABI+ for Intel Architecture
		AIX
		HP-UX
		HP-UX IPF
		IRIX
		Linux
		Linux for x64
		Linux on Itanium
		OpenVMS Alpha
		OpenVMS on HP Integrity
		Solaris
		Solaris for x64
		Tru64 UNIX

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:
Topic:	Analytics ==> Mixed Models
	SAS Reference ==> Procedures ==> GLIMMIX
	SAS Reference ==> Procedures ==> MIXED
	SAS Reference ==> Procedures ==> HPMIXED

Date Modified:	2009-10-30 11:26:50
Date Created:	2009-09-01 09:19:25