The HPLMIXED Procedure

Mixed Model Analysis of Covariance with Many Groups

Suppose you are an educational researcher who studies how student scores on math tests change over time. Students are tested four times, and you want to estimate the overall rise or fall while accounting for correlation between the test responses of students in the same neighborhood and school. One way to model this correlation is a random-effects analysis of covariance, in which the scores for students from the same neighborhood and school are assumed to share the same quadratic mean response function and the parameters of that function are random. The following statements simulate a data set with this structure:

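data SchoolSample;
   do SchoolID = 1 to 300;
      do nID = 1 to 25;
         /* Neighborhood IDs of adjacent schools overlap */
         Neighborhood = (SchoolID-1)*5 + nID;
         /* Random coefficients shared within a neighborhood/school pair */
         bInt   = 5*ranuni(1);
         bTime  = 5*ranuni(1);
         bTime2 =   ranuni(1);
         do sID = 1 to 2;
            do Time = 1 to 4;
               Math = bInt + bTime*Time + bTime2*Time*Time + rannor(2);
               output;
            end;
         end;
      end;
   end;
run;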
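The loop bounds in the DATA step fix the size of the simulated data set. As a short worked check (the labels below are descriptive only and are not SAS variable names):

```latex
\begin{align*}
\text{neighborhood/school combinations} &= 300 \times 25 = 7{,}500 \\
\text{observations per combination} &= 2 \times 4 = 8 \\
\text{total observations} &= 7{,}500 \times 8 = 60{,}000 \\
\max(\text{Neighborhood}) &= (300 - 1) \times 5 + 25 = 1{,}520
\end{align*}
```

Because each school's block of 25 neighborhood IDs starts only 5 above the previous school's block, consecutive blocks overlap and every integer from 1 to 1,520 occurs at least once.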

In this data set, there are 300 schools and 1,520 neighborhoods; neighborhoods are associated with more than one school and vice versa. The following statements use PROC HPLMIXED to fit a mixed analysis of covariance model to these data. To run these statements successfully, you need to set the macro variables GRIDHOST and GRIDINSTALLLOC to resolve to appropriate values, or you can replace the references to macro variables with appropriate values.

proc hplmixed data=SchoolSample;
   performance host="&GRIDHOST" install="&GRIDINSTALLLOC" nodes=20;
   class Neighborhood SchoolID;
   model Math = Time Time*Time / solution;
   random int Time Time*Time / sub=Neighborhood(SchoolID) type=un;
run;
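Before the PROC HPLMIXED statements above are submitted, the two macro variables could be assigned with %LET statements along the following lines. This is only a sketch: the host name reuses the placeholder that appears in the output, and the installation path is a hypothetical value that you would replace with the location used at your site.

```sas
/* Placeholder values (assumptions): substitute your own grid host */
/* and the directory where the grid software is installed.         */
%let GRIDHOST       = YourGridHost;
%let GRIDINSTALLLOC = /path/to/grid/install;
```

Because the PERFORMANCE statement encloses the references in double quotation marks, the macro variables resolve when the statement is parsed.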

The PROC HPLMIXED statements fit a quadratic mean response model and use an unstructured covariance matrix for the random parameters of the response function. With 7,500 neighborhood/school combinations, this model can be computationally daunting to fit, but PROC HPLMIXED finishes quickly and displays the results shown in Figure 8.1.
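In symbols, the fitted model can be sketched as follows. The notation is introduced here for illustration and is not part of the procedure's syntax: i indexes the neighborhood/school combination (the subject), j indexes the observations within it, and t is the value of Time.

```latex
\begin{align*}
\text{Math}_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
                   + (\beta_2 + b_{2i})\, t_{ij}^2 + \varepsilon_{ij}, \\
(b_{0i}, b_{1i}, b_{2i})^\top &\sim N(\mathbf{0}, \mathbf{G}), \qquad
\mathbf{G} = \begin{pmatrix}
  g_{11} & g_{21} & g_{31} \\
  g_{21} & g_{22} & g_{32} \\
  g_{31} & g_{32} & g_{33}
\end{pmatrix}, \\
\varepsilon_{ij} &\sim N(0, \sigma^2).
\end{align*}
```

The fixed effects are the three coefficients in the MODEL statement, and the RANDOM statement with TYPE=UN supplies the six distinct elements of G. Together with the residual variance, that gives seven covariance parameters.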

Figure 8.1: Mixed Model Analysis of Covariance

The HPLMIXED Procedure

Performance Information
Host Node                     YourGridHost
Execution Mode                Distributed
Number of Compute Nodes       20
Number of Threads per Node    24

Data Access Information
Data Engine Role Path

Model Information
Dependent Variable            Math
Covariance Structure          Unstructured
Subject Effect                Neighborho(SchoolID)
Estimation Method             Restricted Maximum Likelihood
Residual Variance Method      Profile
Fixed Effects SE Method       Model-Based
Degrees of Freedom Method     Residual

Class Level Information
Class           Levels    Values
Neighborhood    1520      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
SchoolID        300       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...

Covariance Parameters         7
Columns in X                  3
Columns in Z per Subject      3
Subjects                      7500
Max Obs per Subject           8

Number of Observations Read        60000
Number of Observations Used        60000
Number of Observations Not Used    0
Number of Observations Swapped     52500
Number of Subjects Needing Swap    7500

Optimization Information
Optimization Technique        Newton-Raphson with Ridging
Parameters in Optimization    6
Lower Boundaries              3
Upper Boundaries              0
Starting Values From          Data

Iteration History
Iteration    Evaluations    Objective       Change    Max Gradient
0            2              225641.67142              6.741E-8

Convergence criterion (ABSGCONV=0.00001) satisfied.

Covariance Parameter Estimates
Cov Parm    Subject                 Estimate
UN(1,1)     Neighborho(SchoolID)    2.0902
UN(2,1)     Neighborho(SchoolID)    0.000349
UN(2,2)     Neighborho(SchoolID)    2.0517
UN(3,1)     Neighborho(SchoolID)    0.01448
UN(3,2)     Neighborho(SchoolID)    0.01599
UN(3,3)     Neighborho(SchoolID)    0.08047
Residual                            1.0083

Fit Statistics
-2 Res Log Likelihood         225642
AIC (Smaller is Better)       225656
AICC (Smaller is Better)      225656
BIC (Smaller is Better)       225704

Solution for Fixed Effects
Effect       Estimate    Standard Error    DF     t Value    Pr > |t|
Intercept    2.5070      0.02828           6E4    88.66      <.0001
Time         2.5124      0.02659           6E4    94.48      <.0001
Time*Time    0.5010      0.005247          6E4    95.48      <.0001
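Because the data are simulated, the estimates in Figure 8.1 can be compared with the values implied by the DATA step. The following sketch assumes only that RANUNI returns uniform(0,1) variates (mean 1/2, variance 1/12), that RANNOR returns standard normal variates, and that the draws can be treated as independent; U denotes a generic uniform(0,1) variate and is notation introduced here.

```latex
\begin{align*}
E[\text{bInt}]   &= E[\text{bTime}] = 5 \cdot \tfrac{1}{2} = 2.5,
  & \operatorname{Var}[\text{bInt}] &= \operatorname{Var}[\text{bTime}]
     = 25 \cdot \tfrac{1}{12} \approx 2.0833, \\
E[\text{bTime2}] &= \tfrac{1}{2} = 0.5,
  & \operatorname{Var}[\text{bTime2}] &= \tfrac{1}{12} \approx 0.0833, \\
\operatorname{Cov}[\,\cdot\,,\,\cdot\,] &= 0 \ \text{(distinct coefficients)},
  & \operatorname{Var}[\varepsilon] &= 1 .
\end{align*}
```

The fixed-effect estimates (2.5070, 2.5124, 0.5010), the diagonal covariance estimates UN(1,1), UN(2,2), and UN(3,3) (2.0902, 2.0517, 0.08047), the near-zero off-diagonal estimates, and the residual variance estimate (1.0083) are all close to these values.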