Example 97.1 Using the Four General Estimation Methods

In this example, a and b are classification variables and y is the dependent variable. a is declared fixed, and b and a*b are random. Note that this design is unbalanced because the cell sizes are not all the same. PROC VARCOMP is invoked four times, once for each of the general estimation methods. The data are from Hemmerle and Hartley (1973). The following statements produce Output 97.1.1.

data a;
   input a b y @@;
   datalines;
1 1 237   1 1 254   1 1 246   1 2 178   1 2 179
2 1 208   2 1 178   2 1 187   2 2 146   2 2 145   2 2 141
3 1 186   3 1 183   3 2 142   3 2 125   3 2 136
;
proc varcomp method=type1 data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 97.1.1 VARCOMP Procedure: Method=TYPE1
Variance Components Estimation Procedure

Class Level Information
Class Levels Values
a 3 1 2 3
b 2 1 2

Number of Observations Read 16
Number of Observations Used 16

Dependent Variable: y

Type 1 Analysis of Variance
Source DF Sum of Squares Mean Square Expected Mean Square
a 2 11736 5868.218750 Var(Error) + 2.725 Var(a*b) + 0.1 Var(b) + Q(a)
b 1 11448 11448 Var(Error) + 2.6308 Var(a*b) + 7.8 Var(b)
a*b 2 299.041026 149.520513 Var(Error) + 2.5846 Var(a*b)
Error 10 786.333333 78.633333 Var(Error)
Corrected Total 15 24270    

Type 1 Estimates
Variance Component Estimate
Var(b) 1448.4
Var(a*b) 27.42659
Var(Error) 78.63333

The "Class Level Information" table in Output 97.1.1 displays the levels of each variable specified in the CLASS statement. You can check this table to make sure the data are input correctly.

The Type I analysis of variance in Output 97.1.1 consists of a sequential partition of the total sum of squares. The mean square is the sum of squares divided by the degrees of freedom, and the expected mean square is the expected value of the mean square under the mixed model. The "Q" notation in the expected mean squares refers to a quadratic form in parameters of the parenthesized effect.

The Type I estimates of the variance components in Output 97.1.1 result from solving the linear system of equations established by equating the observed mean squares to their expected values.

The following statements are the same as before, except that the estimation method is MIVQUE0 instead of the default TYPE1. They produce Output 97.1.2.

proc varcomp method=mivque0 data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 97.1.2 VARCOMP Procedure: Method=MIVQUE0
Variance Components Estimation Procedure

MIVQUE(0) SSQ Matrix
Source b a*b Error y
b 60.84000 20.52000 7.80000 89295.4
a*b 20.52000 20.52000 7.80000 30181.3
Error 7.80000 7.80000 13.00000 12533.5

MIVQUE(0) Estimates
Variance Component y
Var(b) 1466.1
Var(a*b) -35.49170
Var(Error) 105.73660

The MIVQUE0 estimates in Output 97.1.2 result from solving the equations established by the MIVQUE0 SSQ matrix. Note that the estimate of the variance component for the interaction effect, Var(a*b), is negative for this example.

The following statements use METHOD=ML to invoke maximum likelihood estimation. They produce Output 97.1.3.

proc varcomp method=ml data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 97.1.3 VARCOMP Procedure: Method=ML
Variance Components Estimation Procedure

Maximum Likelihood Iterations
Iteration Objective Var(b) Var(a*b) Var(Error)
0 78.3850371200 1031.49070 0 74.3909717935
1 78.2637043807 732.3606453635 0 77.4011688154
2 78.2635471161 723.6867470850 0 77.5301774839
3 78.2635471152 723.6658365289 0 77.5304926877

Convergence criteria met.

Maximum Likelihood Estimates
Variance Component Estimate
Var(b) 723.66584
Var(a*b) 0
Var(Error) 77.53049

Asymptotic Covariance Matrix of Estimates
  Var(b) Var(a*b) Var(Error)
Var(b) 537826.1 0 -107.33905
Var(a*b) 0 0 0
Var(Error) -107.33905 0 858.71104

The "Maximum Likelihood Iterations" table in Output 97.1.3 shows that the Newton-Raphson algorithm used by PROC VARCOMP requires three iterations to converge.

The ML estimate of Var(a*b) is zero for this example, and the other two estimates are smaller than their Type I and MIVQUE0 counterparts.

One benefit of using likelihood-based methods is that an approximate covariance matrix is available from the matrix of second derivatives evaluated at the ML solution. This covariance matrix is valid asymptotically and can be unreliable in small samples.

Here the variance component estimates for B and the Error are negatively correlated, and the elements for Var(a*b) are set to zero because the estimate equals zero. Also, the very large variance for Var(b) indicates a lot of uncertainty about the estimate for Var(b), and one contributing explanation is that B has only two levels in this data set.

Finally, the following statements use the restricted maximum likelihood (REML) for estimation. They produce Output 97.1.4.

proc varcomp method=reml data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 97.1.4 VARCOMP Procedure: Method=REML
Variance Components Estimation Procedure

REML Iterations
Iteration Objective Var(b) Var(a*b) Var(Error)
0 63.4134144942 1269.52701 0 91.5581191305
1 63.0446869787 1601.84199 32.7632417174 76.9355562461
2 63.0311530508 1468.82932 27.2258186561 78.7548276319
3 63.0311265148 1464.33646 26.9564053003 78.8431476502
4 63.0311265127 1464.36727 26.9588525177 78.8423898761

Convergence criteria met.

REML Estimates
Variance Component Estimate
Var(b) 1464.4
Var(a*b) 26.95885
Var(Error) 78.84239

Asymptotic Covariance Matrix of Estimates
  Var(b) Var(a*b) Var(Error)
Var(b) 4401703.8 1.29359 -273.39651
Var(a*b) 1.29359 3559.1 -502.85157
Var(Error) -273.39651 -502.85157 1249.7

The "REML Iterations" table in Output 97.1.4 shows that the REML optimization requires four iterations to converge.

The REML estimates in Output 97.1.4 are all larger than the corresponding ML estimates (adjusting for potential downward bias) and are fairly similar to the Type I estimates.

The "Asymptotic Covariance Matrix of Estimates" table in Output 97.1.4 shows that the Error variance component estimate is negatively correlated with the other two variance component estimates, and the estimated variances are all larger than their ML counterparts.