Suppose you are an educational researcher who studies how student scores on math tests change over time. Students are tested four times, and you want to estimate the overall rise or fall while accounting for the correlation between the test responses of students in the same neighborhood and school. One way to model this correlation is with a random-effects analysis of covariance, in which the scores for students from the same neighborhood and school are assumed to share the same quadratic mean response function, whose parameters are random. The following statements simulate a data set with this structure:

    data SchoolSample;
       do SchoolID = 1 to 300;
          do nID = 1 to 25;
             Neighborhood = (SchoolID-1)*5 + nID;
             bInt   = 5*ranuni(1);
             bTime  = 5*ranuni(1);
             bTime2 =   ranuni(1);
             do sID = 1 to 2;
                do Time = 1 to 4;
                   Math = bInt + bTime*Time + bTime2*Time*Time + rannor(2);
                   output;
                end;
             end;
          end;
       end;
    run;
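A quick way to inspect the nesting structure that this DATA step produces is to count the distinct schools, neighborhoods, and neighborhood/school combinations. The following PROC SQL step is only a sketch of such a check; the column aliases `nSchools`, `nNeighborhoods`, and `nSubjects` are illustrative names, not part of the original example.

```sas
proc sql;
   /* Distinct levels of each classification variable */
   select count(distinct SchoolID)     as nSchools,
          count(distinct Neighborhood) as nNeighborhoods
      from SchoolSample;

   /* Distinct neighborhood/school combinations; these are the units
      that SUB=Neighborhood(SchoolID) treats as subjects */
   select count(*) as nSubjects
      from (select distinct SchoolID, Neighborhood from SchoolSample);
quit;
```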

These data contain 300 schools and 1,520 neighborhoods; neighborhoods are associated with more than one school and vice versa. The following statements use PROC HPLMIXED to fit a mixed analysis of covariance model to these data. To run these statements successfully, you need to set the macro variables GRIDHOST and GRIDINSTALLLOC to resolve to appropriate values, or you can replace the references to the macro variables with appropriate values.

    proc hplmixed data=SchoolSample;
       performance host="&GRIDHOST" install="&GRIDINSTALLLOC" nodes=20;
       class Neighborhood SchoolID;
       model Math = Time Time*Time / solution;
       random int Time Time*Time / sub=Neighborhood(SchoolID) type=un;
    run;
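Before the PROC HPLMIXED statements above are submitted, the two macro variables must be defined. The sketch below shows one way to do this with %LET; the host name and installation path are hypothetical placeholders, not values from this example, and must be replaced with your site's settings. As an alternative, assuming no grid environment is configured, the same model can usually be run in single-machine mode by omitting the HOST= and INSTALL= options; the thread count shown is an arbitrary illustrative choice.

```sas
/* Hypothetical placeholder values: substitute your own grid host name
   and the installation location of the high-performance software. */
%let GRIDHOST       = myhead.example.com;
%let GRIDINSTALLLOC = /opt/TKGrid;

/* Alternative sketch: single-machine mode (assumes no grid host is
   configured through options or environment variables). */
proc hplmixed data=SchoolSample;
   performance nthreads=4;   /* illustrative thread count */
   class Neighborhood SchoolID;
   model Math = Time Time*Time / solution;
   random int Time Time*Time / sub=Neighborhood(SchoolID) type=un;
run;
```

Because the MODEL and RANDOM statements are unchanged, the fitted model is the same in either mode; only the contents of the "Performance Information" table and the run time would be expected to differ.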

This model fits a quadratic mean response with a random intercept, a random linear coefficient, and a random quadratic coefficient for each neighborhood/school combination, and it uses an unstructured covariance matrix to model the covariance among these random coefficients. With 7,500 neighborhood/school combinations, this model can be computationally daunting to fit, but PROC HPLMIXED finishes quickly and displays the results shown in Figure 8.1.
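Written out, the model specified by the MODEL and RANDOM statements can be sketched as follows. The notation is an assumption introduced here for illustration: $i$ indexes the neighborhood/school subjects, $j$ the observations within a subject, $t_{ij}$ is the value of Time, and $g_{rc}$ is the element of the unstructured matrix $\mathbf{G}$ that the output labels UN($r$,$c$).

```latex
\begin{aligned}
\mathrm{Math}_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,t_{ij}
                      + (\beta_2 + b_{2i})\,t_{ij}^2 + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i},\, b_{2i})^\top &\sim N(\mathbf{0}, \mathbf{G}), \qquad
\mathbf{G} = \begin{pmatrix}
  g_{11} & g_{21} & g_{31} \\
  g_{21} & g_{22} & g_{32} \\
  g_{31} & g_{32} & g_{33}
\end{pmatrix}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{aligned}
```

The six distinct elements of $\mathbf{G}$ together with the residual variance $\sigma^2$ account for the seven covariance parameters reported in the "Dimensions" table.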

Figure 8.1: Mixed Model Analysis of Covariance

The HPLMIXED Procedure

Performance Information | |
---|---|
Host Node | greenarrow.unx.sas.com |
Execution Mode | Distributed |
Grid Mode | Symmetric |
Number of Compute Nodes | 20 |
Number of Threads per Node | 24 |

Model Information | |
---|---|
Data Set | WORK.SCHOOLSAMPLE |
Dependent Variable | Math |
Covariance Structure | Unstructured |
Subject Effect | Neighborho(SchoolID) |
Estimation Method | Restricted Maximum Likelihood |
Residual Variance Method | Profile |
Fixed Effects SE Method | Model-Based |
Degrees of Freedom Method | Residual |

Class Level Information | ||
---|---|---|
Class | Levels | Values |
Neighborhood | 1520 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... |
SchoolID | 300 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... |

Dimensions | |
---|---|
Covariance Parameters | 7 |
Columns in X | 3 |
Columns in Z per Subject | 3 |
Subjects | 7500 |
Max Obs per Subject | 8 |

Number of Observations Read | 60000 |
---|---|
Number of Observations Used | 60000 |
Number of Observations Not Used | 0 |
Number of Observations Swapped | 52500 |
Number of Subjects Needing Swap | 7500 |

Optimization Information | |
---|---|
Optimization Technique | Newton-Raphson with Ridging |
Parameters in Optimization | 6 |
Lower Boundaries | 3 |
Upper Boundaries | 0 |
Starting Values From | Data |

Iteration History | ||||
---|---|---|---|---|
Iteration | Evaluations | Objective Function | Change | Max Gradient |
0 | 2 | 225641.67142 | | 6.741E-8 |

Convergence criterion (ABSGCONV=0.00001) satisfied.

Covariance Parameter Estimates | ||
---|---|---|
Cov Parm | Subject | Estimate |
UN(1,1) | Neighborho(SchoolID) | 2.0902 |
UN(2,1) | Neighborho(SchoolID) | 0.000349 |
UN(2,2) | Neighborho(SchoolID) | 2.0517 |
UN(3,1) | Neighborho(SchoolID) | 0.01448 |
UN(3,2) | Neighborho(SchoolID) | 0.01599 |
UN(3,3) | Neighborho(SchoolID) | 0.08047 |
Residual | | 1.0083 |

Fit Statistics | |
---|---|
-2 Res Log Likelihood | 225642 |
AIC (Smaller is Better) | 225656 |
AICC (Smaller is Better) | 225656 |
BIC (Smaller is Better) | 225704 |

Solution for Fixed Effects | |||||
---|---|---|---|---|---|
Effect | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
Intercept | 2.5070 | 0.02828 | 6E4 | 88.66 | <.0001 |
Time | 2.5124 | 0.02659 | 6E4 | 94.48 | <.0001 |
Time*Time | 0.5010 | 0.005247 | 6E4 | 95.48 | <.0001 |
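
As a rough plausibility check, the estimates can be compared with the values implied by the simulation. The short derivation below assumes that the RANUNI draws behave as independent Uniform(0,1) variates and that RANNOR returns standard normal variates; it is a sketch for orientation, not part of the procedure's output.

```latex
\begin{aligned}
U \sim \mathrm{Uniform}(0,c) \;&\Rightarrow\;
   \mathrm{E}[U] = \tfrac{c}{2}, \qquad \mathrm{Var}[U] = \tfrac{c^2}{12} \\
\texttt{bInt},\ \texttt{bTime}\ (c = 5):\quad
   &\mathrm{E} = 2.5, \qquad \mathrm{Var} = \tfrac{25}{12} \approx 2.083 \\
\texttt{bTime2}\ (c = 1):\quad
   &\mathrm{E} = 0.5, \qquad \mathrm{Var} = \tfrac{1}{12} \approx 0.0833 \\
\texttt{rannor}:\quad
   &\sigma^2 = 1
\end{aligned}
```

These values line up with the output: the fixed-effects estimates (2.5070, 2.5124, 0.5010) are close to the means 2.5, 2.5, and 0.5; the diagonal elements UN(1,1), UN(2,2), and UN(3,3) (2.0902, 2.0517, 0.08047) are close to the variances 2.083, 2.083, and 0.0833; the off-diagonal elements are near zero, as expected for independently drawn coefficients; and the residual estimate 1.0083 is close to 1.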