Usage Note 37109: Obtaining subject-specific parameter estimates and tests for a random coefficients model or HLM with PROC MIXED or PROC GLIMMIX

A random coefficients model is an example of a linear mixed model. In random coefficients models, the subject-to-subject variation is modeled through each subject's regression coefficients (intercepts and slopes). The following simple example of a random coefficients model using PROC MIXED involves a single fixed effect regressor and random effects on the intercept and regressor slope for each of a series of subjects. The SOLUTION option in the MODEL statement gives you the overall intercept and the overall slope on the covariate, while the SOLUTION option in the RANDOM statement gives you each subject's (here, each variety's) deviation from the overall intercept and slope.

      proc mixed data=wheat;
         class variety;
         model yield = moist /  solution;
         random int moist / type=un subject=variety solution;
         run;

While PROC MIXED and PROC GLIMMIX report the overall regression coefficients as well as the deviation in the regression coefficients from the overall coefficients for each subject, the procedures do not produce the final subject-specific regression coefficients. To obtain the subject-specific regression coefficients in a random coefficients model, you need to add the individual adjustment for each subject from the SOLUTION option in the RANDOM statement to the overall estimates from the SOLUTION option in the MODEL statement.

The subject-specific intercepts and slopes can be obtained using one of two approaches:

Use the DATA step to merge the results of the associated ODS tables for these statistics.
Write an ESTIMATE statement for each coefficient for each subject. Because many ESTIMATE statements might be required, a macro can be used to generate the ESTIMATE statements needed to estimate the coefficients for each subject.

The following sections illustrate each approach. Note that if PROC MIXED takes a long time to run, possibly because of a large number of random effects, PROC HPMIXED can be a good alternative.
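As a rough illustration of the PROC HPMIXED alternative mentioned above, the following sketch fits the same random coefficients model to the wheat data. It assumes that the TYPE=UN structure and the SOLUTION options are available in PROC HPMIXED in your release. DDFM=KR is left out on the assumption that PROC HPMIXED does not offer the Kenward-Roger method, so the degrees of freedom and tests can differ from the PROC MIXED results shown in this note. Check the PROC HPMIXED documentation before relying on it.

```sas
/* Sketch only: the same random coefficients model fit with PROC HPMIXED. */
/* Assumes the WHEAT data set defined in this note.                       */
proc hpmixed data=wheat;
   class variety;
   model yield = moist / solution;                        /* overall intercept and slope */
   random int moist / type=un subject=variety solution;   /* per-variety deviations      */
run;
```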

Using DATA step programming

The following data set contains data for five randomly selected wheat varieties. Each variety was assigned to six one-acre plots of land. From each plot of land, the yield and the amount of moisture were measured.

      data wheat;
         input id variety yield moist;
         datalines;
       1       1         41       10
       2       1         69       57
       3       1         53       32
       4       1         66       52
       5       1         64       47
       6       1         64       48
       7       2         49       30
       8       2         44       21
       9       2         44       20
      10       2         46       26
      11       2         57       44
      12       2         42       19
      13       3         69       50
      14       3         62       40
      15       3         50       23
      16       3         76       58
      17       3         48       21
      18       3         55       30
      19       4         48       22
      20       4         60       40
      21       4         45       17
      22       4         47       21
      23       4         62       44
      24       4         43       13
      25       5         65       49
      26       5         63       44
      27       5         71       57
      28       5         68       51
      29       5         52       27
      30       5         68       52
      ;

The following PROC MIXED step fits a random coefficients model to these data with MODEL and RANDOM statements as discussed above. The DDFM=KR option in the MODEL statement requests the Kenward-Roger method for computing the denominator degrees of freedom. The ODS OUTPUT statements save the solutions for the fixed and random effects to the SAS data sets SF and SR. DATA step programming can then be applied to obtain the coefficients. Note that this approach gives you only the estimates of the subject-specific coefficients. It does not provide standard errors, nor does it test the significance of these estimates.

      proc mixed data=wheat;
         class variety;
         model yield = moist / ddfm=kr solution;
         random int moist / type=un subject=variety solution;
         ods output solutionf=sf(keep=effect estimate  
                                 rename=(estimate=overall));
         ods output solutionr=sr(keep=effect variety estimate
                                 rename=(estimate=ssdev));
         run;

The following table from the SOLUTION option in the MODEL statement shows the overall intercept and slope in the Estimate column.

Solution for Fixed Effects
Effect	Estimate	Standard Error	DF	t Value	Pr > |t|
Intercept	33.5483	0.6173	3.73	54.34	<.0001
moist	0.6384	0.02888	3.99	22.10	<.0001

The following table from the SOLUTION option in the RANDOM statement displays the deviation from the overall intercept and slope for each variety in the Estimate column.

Solution for Random Effects
Effect	variety	Estimate	Std Err Pred	DF	t Value	Pr > |t|
Intercept	1	0.7020	0.8532	4.25	0.82	0.4543
moist	1	-0.02341	0.03188	4.53	-0.73	0.4990
Intercept	2	-1.5262	0.8968	4.07	-1.70	0.1627
moist	2	-0.07453	0.03693	5.65	-2.02	0.0931
Intercept	3	-0.3519	0.8651	4.22	-0.41	0.7040
moist	3	0.08669	0.03271	4.73	2.65	0.0480
Intercept	4	0.5374	0.8126	4.11	0.66	0.5436
moist	4	0.000696	0.03401	5.15	0.02	0.9844
Intercept	5	0.6386	1.0841	3.66	0.59	0.5902
moist	5	0.01055	0.03438	4.9	0.31	0.7715

The following steps sort the overall and deviations data sets and then merge their information together for each effect. The subject-specific coefficient deviations are added to the overall coefficients in the DATA step to obtain the subject-specific intercepts and slopes.

      proc sort data=sf; 
         by effect; 
         run;
      
      proc sort data=sr; 
         by effect; 
         run;

      data final;
         merge sf sr;
         by effect;
         sscoeff = overall + ssdev;
         run;

These statements sort and display the final parameter estimates.

      proc sort data=final; 
         by variety effect; 
         run;

      proc print data=final noobs; 
        var effect variety overall ssdev sscoeff;
        run;

Below are the final parameter estimates for the random coefficients using the DATA step approach. Each observation from the "Solution for Random Effects" table is merged with the appropriate entry (either intercept or slope on MOIST) from the "Solution for Fixed Effects" table to provide the combined estimates. OVERALL represents the parameter estimates from the SOLUTION option in the MODEL statement. SSDEV represents the parameter estimates from the SOLUTION option in the RANDOM statement. SSCOEFF represents the final subject-specific parameter estimates.

Effect	variety	overall	ssdev	sscoeff
Intercept	1	33.5483	0.7020	34.2503
moist	1	0.6384	-0.02341	0.6150
Intercept	2	33.5483	-1.5262	32.0221
moist	2	0.6384	-0.07453	0.5639
Intercept	3	33.5483	-0.3519	33.1964
moist	3	0.6384	0.08669	0.7251
Intercept	4	33.5483	0.5374	34.0857
moist	4	0.6384	0.000696	0.6391
Intercept	5	33.5483	0.6386	34.1870
moist	5	0.6384	0.01055	0.6490
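For reporting, it can be convenient to have one row per variety with the intercept and slope side by side. The following is a minimal sketch of that reshaping. It assumes that the FINAL data set has been sorted by VARIETY and EFFECT as in the PROC SORT step above; the output data set name COEF_WIDE is a hypothetical name chosen for this illustration.

```sas
/* Sketch: reshape FINAL to one row per variety.                        */
/* Assumes FINAL is sorted by VARIETY and that EFFECT takes the values  */
/* "Intercept" and "moist", as in the tables shown in this note.        */
proc transpose data=final out=coef_wide(drop=_name_);
   by variety;
   id effect;      /* values of EFFECT become the new column names      */
   var sscoeff;    /* subject-specific coefficient = overall + ssdev    */
run;

proc print data=coef_wide noobs;
run;
```

In the resulting data set, the column named Intercept holds each variety's intercept and the column named moist holds each variety's slope on MOIST.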

The lack of standard errors and significance tests for the subject-specific coefficients in this approach can be addressed by using ESTIMATE statements, which are discussed next.

Writing ESTIMATE statements to obtain final parameter estimates and tests

You can use an ESTIMATE statement to get the appropriate standard errors for the final parameter estimates. Include one ESTIMATE statement for each subject-specific intercept and slope parameter. The advantage of using an ESTIMATE statement is that it provides the subject-specific regression coefficient, its standard error, and a test of significance. For an ESTIMATE statement that involves random effects, use the vertical bar (|) to separate the fixed effects (before the bar) from the random effects (after the bar). If the RANDOM statement includes the SUBJECT= option, use / subject 1 to denote the first subject, / subject 0 1 to denote the second subject, and so on. For details on the ESTIMATE statement syntax, see "ESTIMATE statement" in the Details section of the MIXED documentation. A separate usage note provides additional information on writing CONTRAST and ESTIMATE statements.

The following statements demonstrate the ESTIMATE statements needed for the wheat example. The ODS OUTPUT statement saves the table of results from the ESTIMATE statements to the data set SSCOEFF.

      proc mixed data=wheat;
         class variety;
         model yield = moist / ddfm=kr solution;
         random int moist / type=un subject=variety solution ;
         estimate 'intercept for variety 1' int 1 | int 1 / subject 1;
         estimate 'slope for variety 1' moist 1 | moist 1 / subject 1;
         estimate 'intercept for variety 2' int 1 | int 1 / subject 0 1;
         estimate 'slope for variety 2' moist 1 | moist 1 / subject 0 1;
         estimate 'intercept for variety 3' int 1 | int 1 / subject 0 0 1;
         estimate 'slope for variety 3' moist 1 | moist 1 / subject 0 0 1;
         estimate 'intercept for variety 4' int 1 | int 1 / subject 0 0 0 1;
         estimate 'slope for variety 4' moist 1 | moist 1 / subject 0 0 0 1;
         estimate 'intercept for variety 5' int 1 | int 1 / subject 0 0 0 0 1;
         estimate 'slope for variety 5' moist 1 | moist 1 / subject 0 0 0 0 1;
         ods output estimates=sscoeff;
         run;

The table of the final parameter estimates, their standard errors, and tests for the subject-specific coefficients using the ESTIMATE statement approach is displayed below. The t test of each subject-specific regression coefficient tests the null hypothesis that the intercept or slope for each variety is zero. Note that the subject-specific regression coefficients are identical to those obtained using the DATA step approach above.

Estimates
Label	Estimate	Standard Error	DF	t Value	Pr > |t|
intercept for variety 1	34.2503	0.7174	19.7	47.74	<.0001
slope for variety 1	0.6150	0.01620	21.2	37.96	<.0001
intercept for variety 2	32.0221	0.8227	13.9	38.92	<.0001
slope for variety 2	0.5639	0.02923	15	19.29	<.0001
intercept for variety 3	33.1964	0.7456	18.5	44.53	<.0001
slope for variety 3	0.7251	0.01878	19.9	38.61	<.0001
intercept for variety 4	34.0857	0.6350	20.7	53.68	<.0001
slope for variety 4	0.6391	0.02202	21.8	29.03	<.0001
intercept for variety 5	34.1870	1.1560	7.43	29.57	<.0001
slope for variety 5	0.6490	0.02427	8.2	26.73	<.0001
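The same ESTIMATE syntax, including the vertical bar and the SUBJECT option, is also accepted by PROC GLIMMIX. The following sketch shows the first variety only. It assumes the default normal distribution and identity link, under which the PROC GLIMMIX fit should closely agree with the PROC MIXED results above; the output data set name SSCOEFF_GLM is a hypothetical name chosen for this illustration.

```sas
/* Sketch: subject-specific coefficients for variety 1 with PROC GLIMMIX. */
/* Assumes the WHEAT data set defined in this note.                       */
proc glimmix data=wheat;
   class variety;
   model yield = moist / ddfm=kr solution;
   random int moist / type=un subject=variety solution;
   /* fixed-effect coefficients before the bar, random-effect after it */
   estimate 'intercept for variety 1' int 1   | int 1   / subject 1;
   estimate 'slope for variety 1'     moist 1 | moist 1 / subject 1;
   ods output estimates=sscoeff_glm;
run;
```

The remaining varieties follow the same pattern as in the PROC MIXED step, with subject 0 1 for the second variety, subject 0 0 1 for the third, and so on.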

Using a macro to generate the ESTIMATE statements

The advantage of using an ESTIMATE statement is that it will provide the subject-specific regression coefficients, standard errors, and significance tests. However, when there are many subjects, writing the many ESTIMATE statements required can be tedious. The following macros provide an efficient way of generating the PROC MIXED statements for the analysis, including the multiple ESTIMATE statements needed to compute the subject-specific coefficients.

The EstimateStatement macro generates the PROC MIXED statements to fit the above random coefficients model to the wheat data, and uses a DO loop to write the set of ESTIMATE statements for subjects. Inside the loop, the ESTIMATE statements are created as above. The Zeroes macro is used to generate the appropriate number of zeroes followed by a 1 in the SUBJECT portion of each ESTIMATE statement.

      /* Emits (numzeroes-1) zeroes followed by a 1; %Zeroes(3) gives 0 0 1 */
      %macro Zeroes(numzeroes);
         %local i;
         %do i = 1 %to %eval(&numzeroes-1);
       0
         %end;
       1
      %mend;
      
      %macro EstimateStatement(numsubjects=);
         %local i;
         proc mixed data=wheat;
            class variety;
            model yield = moist / ddfm=kr solution;
            random int moist / type=un subject=variety solution ;
             %do i = 1 %to &numsubjects;
                estimate "Intercept for Variety &i" int 1 | int 1 / subject %Zeroes(&i);
                estimate "Slope for Variety &i" moist 1 | moist 1 / subject %Zeroes(&i);
             %end;
             ods output estimates=sscoeff;
             run;
      %mend;

The following statement invokes the macro, which generates the same PROC MIXED statements, including the necessary ESTIMATE statements, as shown above.

      %EstimateStatement(numsubjects=5)
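When the number of subjects is not known in advance, it can be counted from the data instead of being typed in. The following is a minimal sketch of that idea; the macro variable name NSUB is a hypothetical name chosen for this illustration. Note that the i-th ESTIMATE statement refers to the i-th level of the SUBJECT= variable in sorted order, so the labels "Variety 1", "Variety 2", and so on match the actual variety values only because the varieties here are coded 1 through 5.

```sas
/* Sketch: count the distinct subjects and pass the count to the macro. */
proc sql noprint;
   select count(distinct variety) into :nsub
   from wheat;
quit;

%let nsub = &nsub;   /* reassign to strip leading blanks from the value */

%EstimateStatement(numsubjects=&nsub)
```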

A separate usage note gives an example of how to create a plot of a fitted random coefficients model.

Operating System and Release Information

Product Family	Product	System	SAS Release (Reported)	SAS Release (Fixed*)
SAS System	SAS/STAT	z/OS
		OpenVMS VAX
		Microsoft® Windows® for 64-Bit Itanium-based Systems
		Microsoft Windows Server 2003 Datacenter 64-bit Edition
		Microsoft Windows Server 2003 Enterprise 64-bit Edition
		Microsoft Windows XP 64-bit Edition
		Microsoft® Windows® for x64
		OS/2
		Microsoft Windows 95/98
		Microsoft Windows 2000 Advanced Server
		Microsoft Windows 2000 Datacenter Server
		Microsoft Windows 2000 Server
		Microsoft Windows 2000 Professional
		Microsoft Windows NT Workstation
		Microsoft Windows Server 2003 Datacenter Edition
		Microsoft Windows Server 2003 Enterprise Edition
		Microsoft Windows Server 2003 Standard Edition
		Microsoft Windows Server 2008
		Microsoft Windows XP Professional
		Windows Millennium Edition (Me)
		Windows Vista
		64-bit Enabled AIX
		64-bit Enabled HP-UX
		64-bit Enabled Solaris
		ABI+ for Intel Architecture
		AIX
		HP-UX
		HP-UX IPF
		IRIX
		Linux
		Linux for x64
		Linux on Itanium
		OpenVMS Alpha
		OpenVMS on HP Integrity
		Solaris
		Solaris for x64
		Tru64 UNIX

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:
Topic:	Analytics ==> Mixed Models; SAS Reference ==> Procedures ==> GLIMMIX; SAS Reference ==> Procedures ==> MIXED

Date Modified:	2010-08-27 10:05:20
Date Created:	2009-09-07 22:32:53
