The VARCOMP Procedure

Example 121.1 Using the Four General Estimation Methods

In this example, a and b are classification variables and y is the dependent variable. a is declared fixed, and b and a*b are random. Note that this design is unbalanced because the cell sizes are not all the same. PROC VARCOMP is invoked four times, once for each of the general estimation methods. The data are from Hemmerle and Hartley (1973). The following statements produce Output 121.1.1.

data a;
   input a b y @@;
   datalines;
1 1 237   1 1 254   1 1 246   1 2 178   1 2 179
2 1 208   2 1 178   2 1 187   2 2 146   2 2 145   2 2 141
3 1 186   3 1 183   3 2 142   3 2 125   3 2 136
;

proc varcomp method=type1 data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 121.1.1: VARCOMP Procedure: Method=TYPE1

Variance Components Estimation Procedure

Class Level Information
Class	Levels	Values
a	3	1 2 3
b	2	1 2

Number of Observations Read	16
Number of Observations Used	16

Dependent Variable:	y

Type 1 Analysis of Variance
Source	DF	Sum of Squares	Mean Square	Expected Mean Square
a	2	11736	5868.218750	Var(Error) + 2.725 Var(a*b) + 0.1 Var(b) + Q(a)
b	1	11448	11448	Var(Error) + 2.6308 Var(a*b) + 7.8 Var(b)
a*b	2	299.041026	149.520513	Var(Error) + 2.5846 Var(a*b)
Error	10	786.333333	78.633333	Var(Error)
Corrected Total	15	24270

Type 1 Estimates
Variance Component	Estimate
Var(b)	1448.4
Var(a*b)	27.42659
Var(Error)	78.63333

The "Class Level Information" table in Output 121.1.1 displays the levels of each variable specified in the CLASS statement. You can check this table to make sure the data are input correctly.

The Type I analysis of variance in Output 121.1.1 consists of a sequential partition of the total sum of squares. The mean square is the sum of squares divided by the degrees of freedom, and the expected mean square is the expected value of the mean square under the mixed model. The "Q" notation in the expected mean squares refers to a quadratic form in parameters of the parenthesized effect.

The Type I estimates of the variance components in Output 121.1.1 result from solving the linear system of equations established by equating the observed mean squares to their expected values.

The following statements are the same as before, except that the estimation method is MIVQUE0 instead of the default TYPE1. They produce Output 121.1.2.

proc varcomp method=mivque0 data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 121.1.2: VARCOMP Procedure: Method=MIVQUE0

Variance Components Estimation Procedure

MIVQUE(0) SSQ Matrix
Source	b	a*b	Error	y
b	60.84000	20.52000	7.80000	89295.4
a*b	20.52000	20.52000	7.80000	30181.3
Error	7.80000	7.80000	13.00000	12533.5

MIVQUE(0) Estimates
Variance Component	y
Var(b)	1466.1
Var(a*b)	-35.49170
Var(Error)	105.73660

The MIVQUE0 estimates in Output 121.1.2 result from solving the equations established by the MIVQUE0 SSQ matrix. Note that the estimate of the variance component for the interaction effect, Var(a*b), is negative for this example.

The following statements use METHOD=ML to invoke maximum likelihood estimation. They produce Output 121.1.3.

proc varcomp method=ml data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 121.1.3: VARCOMP Procedure: Method=ML

Variance Components Estimation Procedure

Maximum Likelihood Iterations
Iteration	Objective	Var(b)	Var(a*b)	Var(Error)
0	78.3850371200	1031.49070	0	74.3909717935
1	78.2637043807	732.3606453635	0	77.4011688154
2	78.2635471161	723.6867470850	0	77.5301774839
3	78.2635471152	723.6658365289	0	77.5304926877

Convergence criteria met.

Maximum Likelihood Estimates
Variance Component	Estimate
Var(b)	723.66584
Var(a*b)	0
Var(Error)	77.53049

Asymptotic Covariance Matrix of Estimates
	Var(b)	Var(a*b)	Var(Error)
Var(b)	537826.1	0	-107.33905
Var(a*b)	0	0	0
Var(Error)	-107.33905	0	858.71104

The "Maximum Likelihood Iterations" table in Output 121.1.3 shows that the Newton-Raphson algorithm used by PROC VARCOMP requires three iterations to converge.

The ML estimate of Var(a*b) is zero for this example, and the other two estimates are smaller than their Type I and MIVQUE0 counterparts.

One benefit of using likelihood-based methods is that an approximate covariance matrix is available from the matrix of second derivatives evaluated at the ML solution. This covariance matrix is valid asymptotically and can be unreliable in small samples.

Here the variance component estimates for B and the Error are negatively correlated, and the elements for Var(a*b) are set to zero because the estimate equals zero. Also, the very large variance for Var(b) indicates a lot of uncertainty about the estimate for Var(b), and one contributing explanation is that B has only two levels in this data set.

Finally, the following statements use the restricted maximum likelihood (REML) for estimation. They produce Output 121.1.4.

proc varcomp method=reml data=a;
   class a b;
   model y=a|b / fixed=1;
run;

Output 121.1.4: VARCOMP Procedure: Method=REML

Variance Components Estimation Procedure

REML Iterations
Iteration	Objective	Var(b)	Var(a*b)	Var(Error)
0	63.4134144942	1269.52701	0	91.5581191305
1	63.0446869787	1601.84199	32.7632417174	76.9355562461
2	63.0311530508	1468.82932	27.2258186561	78.7548276319
3	63.0311265148	1464.33646	26.9564053003	78.8431476502
4	63.0311265127	1464.36727	26.9588525177	78.8423898761

Convergence criteria met.

REML Estimates
Variance Component	Estimate
Var(b)	1464.4
Var(a*b)	26.95885
Var(Error)	78.84239

Asymptotic Covariance Matrix of Estimates
	Var(b)	Var(a*b)	Var(Error)
Var(b)	4401703.8	1.29359	-273.39651
Var(a*b)	1.29359	3559.1	-502.85157
Var(Error)	-273.39651	-502.85157	1249.7

The "REML Iterations" table in Output 121.1.4 shows that the REML optimization requires four iterations to converge.

The REML estimates in Output 121.1.4 are all larger than the corresponding ML estimates (adjusting for potential downward bias) and are fairly similar to the Type I estimates.

The "Asymptotic Covariance Matrix of Estimates" table in Output 121.1.4 shows that the Error variance component estimate is negatively correlated with the other two variance component estimates, and the estimated variances are all larger than their ML counterparts.