The MIXED Procedure

REPEATED Statement

REPEATED <repeated-effect> </ options> ;

The REPEATED statement is used to specify the $\mb {R}$ matrix in the mixed model. Its syntax is different from that of the REPEATED statement in PROC GLM. If no REPEATED statement is specified, $\mb {R}$ is assumed to be equal to $\sigma ^2\mb {I}$ .

For many repeated measures models, no repeated effect is required in the REPEATED statement. Simply use the SUBJECT= option to define the blocks of $\mb {R}$ and the TYPE= option to define their covariance structure. In this case, the repeated measures data must be similarly ordered for each subject, and you must indicate all missing response variables with periods in the input data set unless they all fall at the end of a subject’s repeated response profile. These requirements are necessary in order to inform PROC MIXED of the proper location of the observed repeated responses.

Specifying a repeated effect is useful when you do not want to indicate missing values with periods in the input data set. The repeated effect must contain only classification variables. Make sure that the levels of the repeated effect are different for each observation within a subject; otherwise, PROC MIXED constructs identical rows in $\mb {R}$ corresponding to the observations with the same level. This results in a singular $\mb {R}$ and an infinite likelihood.

Whether you specify a REPEATED effect or not, the rows of $\mb {R}$ for each subject are constructed in the order in which they appear in the input data set.

Table 63.16 summarizes the options available in the REPEATED statement. All options are subsequently discussed in alphabetical order.

Table 63.16: Summary of REPEATED Statement Options

Option	Description
Construction of Covariance Structure
GROUP=	Defines an effect specifying heterogeneity in the R-side covariance structure
LDATA=	Specifies data set with coefficient matrices for TYPE=LIN
LOCAL	Requests that a diagonal matrix be added to $\mb {R}$
LOCALW	Specifies that only the local effects are weighted
NONLOCALW	Specifies that only the nonlocal effects are weighted
SUBJECT=	Identifies the subjects in the R-side model
TYPE=	Specifies the R-side covariance structure
Statistical Output
HLM	Produces a table of Hotelling-Lawley-McKeon statistics (McKeon, 1974)
HLPS	Produces a table of Hotelling-Lawley-Pillai-Samson statistics (Pillai and Samson, 1959)
R	Displays blocks of the estimated $\mb {R}$ matrix
RC	Display the Cholesky root (lower) of blocks of the estimated $\mb {R}$ matrix
RCI	Displays the inverse Cholesky root (lower) of blocks of the estimated $\mb {R}$ matrix
RCORR	Displays the correlation matrix corresponding to blocks of the estimated $\mb {R}$ matrix
RI	Displays the inverse of blocks of the estimated $\mb {R}$ matrix

You can specify the following options in the REPEATED statement after a slash (/).

GROUP=effect GRP=effect

defines an effect that specifies heterogeneity in the covariance structure of $\mb {R}$ . All observations that have the same level of the GROUP effect have the same covariance parameters. Each new level of the GROUP effect produces a new set of covariance parameters with the same structure as the original group. You should exercise caution in properly defining the GROUP effect, because strange covariance patterns can result with its misuse. Also, the GROUP effect can greatly increase the number of estimated covariance parameters, which can adversely affect the optimization process.

Continuous variables are permitted as arguments to the GROUP= option. PROC MIXED does not sort by the values of the continuous variable; rather, it considers the data to be from a new subject or group whenever the value of the continuous variable changes from the previous observation. Using a continuous variable decreases execution time for models with a large number of subjects or groups and also prevents the production of a large “Class Level Information” table.

HLM

produces a table of Hotelling-Lawley-McKeon statistics (McKeon, 1974) for all fixed effects whose levels change across data having the same level of the SUBJECT= effect (the within-subject fixed effects). This option applies only when you specify a REPEATED statement with the TYPE=UN option and no RANDOM statements. For balanced data, this model is equivalent to the multivariate model for repeated measures in PROC GLM.

The Hotelling-Lawley-McKeon statistic has a slightly better F approximation than the Hotelling-Lawley-Pillai-Samson statistic (see the description of the HLPS option, which follows). Both of the Hotelling-Lawley statistics can perform much better in small samples than the default F statistic (Wright, 1994).

Separate tables are produced for Type 1, 2, and 3 tests, according to the ones you select. The ODS table names are HLM1, HLM2, and HLM3, respectively.

HLPS

produces a table of Hotelling-Lawley-Pillai-Samson statistics (Pillai and Samson, 1959) for all fixed effects whose levels change across data having the same level of the SUBJECT= effect (the within-subject fixed effects). This option applies only when you specify a REPEATED statement with the TYPE=UN option and no RANDOM statements. For balanced data, this model is equivalent to the multivariate model for repeated measures in PROC GLM, and this statistic is the same as the Hotelling-Lawley Trace statistic produced by PROC GLM.

Separate tables are produced for Type 1, 2, and 3 tests, according to the ones you select. The ODS table names are HLPS1, HLPS2, and HLPS3, respectively.

LDATA=SAS-data-set

reads the coefficient matrices associated with the TYPE=LIN(number) option. The data set must contain the variables Parm, Row, Col1–Coln or Parm, Row, Col, Value. The Parm variable denotes which of the number coefficient matrices is currently being constructed, and the Row, Col1–Coln, or Row, Col, Value variables specify the matrix values, as they do with the RANDOM statement option GDATA=. Unspecified values of these matrices are set equal to 0.

LOCAL LOCAL=EXP(<effects>) LOCAL=POM(POM-data-set)

requests that a diagonal matrix be added to $\mb {R}$ . With just the LOCAL option, this diagonal matrix equals $\sigma ^2\mb {I}$ , and $\sigma ^2$ becomes an additional variance parameter that PROC MIXED profiles out of the likelihood provided that you do not specify the NOPROFILE option in the PROC MIXED statement. The LOCAL option is useful if you want to add an observational error to a time series structure (Jones and Boadi-Boateng, 1991) or a nugget effect to a spatial structure Cressie (1993).

The LOCAL=EXP(<effects>) option produces exponential local effects, also known as dispersion effects, in a log-linear variance model. These local effects have the form

$\sigma ^{2}\mr {diag} [\mr {exp} (\bU \bdelta )]$

where $\bU$ is the full-rank design matrix corresponding to the effects that you specify and $\bdelta$ are the parameters that PROC MIXED estimates. An intercept is not included in $\bU$ because it is accounted for by $\sigma ^{2}$ . PROC MIXED constructs the full-rank $\bU$ in terms of 1s and –1s for classification effects. Be sure to scale continuous effects in $\bU$ sensibly.

The LOCAL=POM(POM-data-set) option specifies the power-of-the-mean structure. This structure possesses a variance of the form $\sigma ^2 |\mb {x} ’_ i\bbeta ^*|^\theta$ for the ith observation, where $\mb {x} _ i$ is the ith row of $\mb {X}$ (the design matrix of the fixed effects) and $\bbeta ^*$ is an estimate of the fixed-effects parameters that you specify in POM-data-set.

The SAS data set specified by POM-data-set contains the numeric variable Estimate (in previous releases, the variable name was required to be EST), and it has at least as many observations as there are fixed-effects parameters. The first p observations of the Estimate variable in POM-data-set are taken to be the elements of $\bbeta ^*$ , where p is the number of columns of $\bX$ . You must order these observations according to the non-full-rank parameterization of the MIXED procedure. One easy way to set up POM-data-set for a $\bbeta ^*$ corresponding to ordinary least squares is illustrated by the following statements:

ods output SolutionF=sf;
proc mixed;
   class a;
   model y = a x / s;
run;

proc mixed;
   class a;
   model y = a x;
   repeated / local=pom(sf);
run;

Note that the generalized least squares estimate of the fixed-effects parameters from the second PROC MIXED step usually is not the same as your specified $\bbeta ^*$ . However, you can iterate the POM fitting until the two estimates agree. Continuing from the previous example, the statements for performing one step of this iteration are as follows:

ods output SolutionF=sf1;
proc mixed;
   class a;
   model y = a x / s;
   repeated / local=pom(sf);
run;

proc compare brief data=sf compare=sf1;
   var estimate;
run;

data sf;
   set sf1;
run;

Unfortunately, this iterative process does not always converge. For further details, see the description of pseudo-likelihood in Chapter 3 of Carroll and Ruppert (1988).

LOCALW

specifies that only the local effects and no others be weighted. By default, all effects are weighted. The LOCALW option is used in connection with the WEIGHT statement and the LOCAL option in the REPEATED statement.

NONLOCALW

specifies that only the nonlocal effects and no others be weighted. By default, all effects are weighted. The NONLOCALW option is used in connection with the WEIGHT statement and the LOCAL option in the REPEATED statement.

R<=value-list>

requests that blocks of the estimated $\mb {R}$ matrix be displayed. The first block determined by the SUBJECT= effect is the default displayed block. PROC MIXED displays blanks for value-lists that are 0.

The value-list indicates the subjects for which blocks of $\mb {R}$ are to be displayed. For example, the following statement displays block matrices for the first, third, and fifth persons:

repeated / type=cs subject=person r=1,3,5;

See the PARMS statement for the possible forms of value-list. The ODS table name is R.

RC<=value-list>

produces the Cholesky root of blocks of the estimated $\mb {R}$ matrix. The value-list specification is the same as with the R option. The ODS table name is CholR.

RCI<=value-list>

produces the inverse Cholesky root of blocks of the estimated $\mb {R}$ matrix. The value-list specification is the same as with the R option. The ODS table name is InvCholR.

RCORR<=value-list>

produces the correlation matrix corresponding to blocks of the estimated $\mb {R}$ matrix. The value-list specification is the same as with the R option. The ODS table name is RCorr.

RI<=value-list>

produces the inverse of blocks of the estimated $\mb {R}$ matrix. The value-list specification is the same as with the R option. The ODS table name is InvR.

SSCP

requests that an unstructured $\mb {R}$ matrix be estimated from the sum-of-squares-and-crossproducts matrix of the residuals. It applies only when you specify TYPE=UN and have no RANDOM statements. Also, you must have a sufficient number of subjects for the estimate to be positive definite.

This option is useful when the size of the blocks of $\mb {R}$ is large (for example, greater than 10) and you want to use or inspect an unstructured estimate that is much quicker to compute than the default REML estimate. The two estimates will agree for certain balanced data sets when you have a classification fixed effect defined across all time points within a subject.

SUBJECT=effect SUB=effect

identifies the subjects in your mixed model. Complete independence is assumed across subjects; therefore, the SUBJECT= option produces a block-diagonal structure in $\mb {R}$ with identical blocks. When the SUBJECT= effect consists entirely of classification variables, the blocks of $\mb {R}$ correspond to observations sharing the same level of that effect. These blocks are sorted according to this effect as well.

Continuous variables are permitted as arguments to the SUBJECT= option. PROC MIXED does not sort by the values of the continuous variable; rather, it considers the data to be from a new subject or group whenever the value of the continuous variable changes from the previous observation. Using a continuous variable decreases execution time for models with a large number of subjects or groups and also prevents the production of a large “Class Level Information” table.

If you want to model nonzero covariance among all of the observations in your SAS data set, specify SUBJECT=INTERCEPT to treat the data as if they are all from one subject. However, be aware that in this case PROC MIXED manipulates an $\mb {R}$ matrix with dimensions equal to the number of observations. If no SUBJECT= effect is specified, then every observation is assumed to be from a different subject and $\mb {R}$ is assumed to be diagonal. For this reason, you usually want to use the SUBJECT= option in the REPEATED statement.

TYPE=covariance-structure

specifies the covariance structure of the $\mb {R}$ matrix. The SUBJECT= option defines the blocks of $\mb {R}$ , and the TYPE= option specifies the structure of these blocks. Valid values for covariance-structure and their descriptions are provided in Table 63.17 and Table 63.18. The default structure is VC.

Table 63.17: Covariance Structures

Structure	Description	Parms	$(i,j)$ element
ANTE(1)	Antedependence	$2t-1$	$\sigma _{i}\sigma _{j}\prod _{k=i}^{j-1}\rho _ k$
AR(1)	Autoregressive(1)	2	$\sigma ^2\rho ^{\|i-j\|}$
ARH(1)	Heterogeneous AR(1)	$t+1$	$\sigma _{i}\sigma _{j}\rho ^{\|i-j\|}$
ARMA(1,1)	ARMA(1,1)	3	$\sigma ^2[\gamma \rho ^{\|i-j\|-1} 1(i\neq j) + 1(i=j)]$
CS	Compound symmetry	2	$\sigma _1 + \sigma ^2 1(i=j)$
CSH	Heterogeneous CS	$t+1$	$\sigma _{i}\sigma _{j}[\rho 1(i \neq j)+ 1(i=j)]$
FA(q)	Factor analytic	$\frac{q}{2}(2t -q + 1) + t$	$\Sigma _{k=1}^{\min (i,j,q)} \lambda _{ik}\lambda _{jk}+\sigma ^2_ i 1(i=j)$
FA0(q)	No diagonal FA	$\frac{q}{2}(2t -q + 1)$	$\Sigma _{k=1}^{\min (i,j,q)} \lambda _{ik}\lambda _{jk}$
FA1(q)	Equal diagonal FA	$\frac{q}{2}(2t -q + 1) + 1$	$\Sigma _{k=1}^{\min (i,j,q)} \lambda _{ik}\lambda _{jk}+\sigma ^2 1(i=j)$
HF	Huynh-Feldt	$t+1$	$(\sigma _{i}^{2}+\sigma _{j}^{2})/2+\lambda 1(i\neq j)$
LIN(q)	General linear	q	$\Sigma _{k=1}^ q \theta _ k \mb {A} _{ij}$
TOEP	Toeplitz	t	$\sigma _{\|i-j\|+1}$
TOEP(q)	Banded Toeplitz	q	$\sigma _{\|i-j\|+1} 1(\|i-j\|<q)$
TOEPH	Heterogeneous TOEP	$2t-1$	$\sigma _{i}\sigma _{j}\rho _{\|i-j\|}$
TOEPH(q)	Banded hetero TOEP	$t+q-1$	$\sigma _{i}\sigma _{j}\rho _{\|i-j\|} 1(\|i-j\|<q)$
UN	Unstructured	$t(t+1)/2$	$\sigma _{ij}$
UN(q)	Banded	$\frac{q}{2}(2t-q+1)$	$\sigma _{ij} 1(\|i-j\|<q)$
UNR	Unstructured corrs	$t(t+1)/2$	$\sigma _{i}\sigma _{j}\rho _{\max (i,j)\min (i,j)}$
UNR(q)	Banded correlations	$\frac{q}{2}(2t-q+1)$	$\sigma _{i}\sigma _{j}\rho _{\max (i,j)\min (i,j)}$
UN@AR(1)	Direct product AR(1)	$t_1(t_1+1)/2 + 1$	$\sigma _{{i_1}{j_1}}\rho ^{\|{i_2}-{j_2}\|}$
UN@CS	Direct product CS	$t_1(t_1+1)/2 + 1$	$\left\{ \begin{array}{ll} \sigma _{{i_1}{j_1}} & i_2 = j_2 \\ \sigma ^2\sigma _{{i_1}{j_1}} & i_2 \not= j_2 \\ 0 \le \sigma ^2 \le 1 & \end{array}\right.$
UN@UN	Direct product UN	$t_1(t_1+1)/2\ +$	$\sigma _{1,{i_1}{j_1}}\sigma _{2,{i_2}{j_2}}$
		$t_2(t_2+1)/2 - 1$
VC	Variance components	q	$\sigma ^2_ k 1(i=j)$
			and i corresponds to kth effect

In Table 63.17, “Parms” is the number of covariance parameters in the structure, t is the overall dimension of the covariance matrix, and $1(A)$ equals 1 when A is true and 0 otherwise. For example, 1 $(i=j)$ equals 1 when $i=j$ and 0 otherwise, and 1 $(|i-j|<q)$ equals 1 when $|i-j|<q$ and 0 otherwise. For the TYPE=TOEPH structures, $\rho _0 = 1$ , and for the TYPE=UNR structures, $\rho _{ii} = 1$ for all i. For the direct product structures, the subscripts “1” and “2” see the first and second structure in the direct product, respectively, and $i_1=\mr {int}((i+t_2-1)/t_2)$ , $j_1=\mr {int}((j+t_2-1)/t_2)$ , $i_2=\mr {mod}(i-1,t_2)+1$ , and $j_2=\mr {mod}(j-1,t_2)+1$ .

Table 63.18: Spatial Covariance Structures

Structure	Description	Parms	$(i,j)$ element
SP(EXP)(c-list )	Exponential	2	$\sigma ^2 \, \exp \{ -d_{ij}/\theta \}$
SP(EXPA)(c-list )	Anisotropic exponential	$2c + 1$	$\sigma ^2 \prod _{k=1}^ c \exp \{ -\theta _ k d(i,j,k)^{p_ k}\}$
SP(EXPGA)( $\Argument{c}_1 \, \Argument{c}_2$ )	2D exponential,	4	$\sigma ^2 \, \exp \{ -d_{ij}(\theta ,\lambda )/\rho \}$
	geometrically anisotropic
SP(GAU)(c-list )	Gaussian	2	$\sigma ^2 \exp \{ -d^2_{ij}/\rho ^2\}$
SP(GAUGA)( $\Argument{c}_1 \, \Argument{c}_2$ )	2D Gaussian,	4	$\sigma ^2 \exp \{ -d_{ij}(\theta ,\lambda )^2/\rho ^2\}$
	geometrically anisotropic
SP(LIN)(c-list )	Linear	2	$\sigma ^2(1 - \rho d_{ij})\ 1(\rho d_{ij} \leq 1)$
SP(LINL)(c-list )	Linear log	2	$\sigma ^2(1 - \rho \log (d_{ij}))$
			$\, \, \, \, \times 1(\rho \log (d_{ij}) \leq 1,d_{ij} > 0 )$
SP(MATERN)(c-list)	Matérn	3	$\sigma ^2 \frac{1}{\Gamma (\nu )} \left(\frac{d_{ij}}{2\rho }\right)^\nu 2 K_\nu (d_{ij}/\rho )$
SP(MATHSW)(c-list)	Matérn	3	$\sigma ^2\frac{1}{\Gamma (\nu )} \left(\frac{d_{ij}\sqrt {\nu }}{\rho }\right)^{\nu } 2 K_{\nu }\left(\frac{2d_{ij}\sqrt {\nu }}{\rho }\right)$
	(Handcock-Stein-Wallis)
SP(POW)(c-list)	Power	2	$\sigma ^2 \rho ^{d_{ij}}$
SP(POWA)(c-list)	Anisotropic power	$c+1$	$\sigma ^2 \rho ^{d(i,j,1)}_{1} \rho ^{d(i,j,2)}_{2} \ldots \rho ^{d(i,j,c)}_{c}$
SP(SPH)(c-list )	Spherical	2	$\sigma ^2[1 - (\frac{3d_{ij}}{2\rho }) + (\frac{d^3_{ij}}{2\rho ^3})]\ 1(d_{ij} \leq \rho )$
SP(SPHGA)( $\Argument{c}_1 \, \Argument{c}_2$ )	2D Spherical,	4	$\sigma ^2[1 - (\frac{3d_{ij}(\theta ,\lambda )}{2\rho }) + (\frac{d_{ij}(\theta ,\lambda )^3}{2\rho ^3})]$
	geometrically anisotropic		$\, \, \, \times 1(d_{ij}(\theta ,\lambda ) \leq \rho )$

In Table 63.18, c-list contains the names of the numeric variables used as coordinates of the location of the observation in space, and $d_{ij}$ is the Euclidean distance between the ith and jth vectors of these coordinates, which correspond to the ith and jth observations in the input data set. For SP(POWA) and SP(EXPA), c is the number of coordinates, and $d(i,j,k)$ is the absolute distance between the kth coordinate, $k = 1,\ldots ,c$ , of the ith and jth observations in the input data set. For the geometrically anisotropic structures SP(EXPGA), SP(GAUGA), and SP(SPHGA), exactly two spatial coordinate variables must be specified as $\Argument{c}_1$ and $\Argument{c}_2$ . Geometric anisotropy is corrected by applying a rotation $\theta$ and scaling $\lambda$ to the coordinate system, and $d_{ij}(\theta ,\lambda )$ represents the Euclidean distance between two points in the transformed space. SP(MATERN) and SP(MATHSW) represent covariance structures in a class defined by Matérn (see Matérn 1986; Handcock and Stein 1993; Handcock and Wallis 1994). The function $K_{\nu }$ is the modified Bessel function of the second kind of (real) order $\nu > 0$ ; the parameter $\nu$ governs the smoothness of the process (see below for more details).

Table 63.19 lists some examples of the structures in Table 63.17 and Table 63.18.

Table 63.19: Covariance Structure Examples

Description	Structure	Example
Variance components	VC (default)	$\begin{bmatrix} \sigma _{B}^{2} & 0 & 0 & 0 \\ 0 & \sigma _{B}^{2} & 0 & 0 \\ 0 & 0 & \sigma _{AB}^{2} & 0 \\ 0 & 0 & 0 & \sigma _{AB}^{2} \end{bmatrix}$
Compound symmetry	CS	$\begin{bmatrix} \sigma ^2 + \sigma _1 & \sigma _1 & \sigma _1 & \sigma _1 \\ \sigma _1 & \sigma ^2 + \sigma _1 & \sigma _1 & \sigma _1 \\ \sigma _1 & \sigma _1 & \sigma ^2 + \sigma _1 & \sigma _1 \\ \sigma _1 & \sigma _1 & \sigma _1 & \sigma ^2 + \sigma _1 \end{bmatrix}$
Unstructured	UN	$\begin{bmatrix} \sigma ^{2}_{1} & \sigma _{21} & \sigma _{31} & \sigma _{41} \\ \sigma _{21} & \sigma ^{2}_{2} & \sigma _{32} & \sigma _{42} \\ \sigma _{31} & \sigma _{32} & \sigma ^{2}_{3} & \sigma _{43} \\ \sigma _{41} & \sigma _{42} & \sigma _{43} & \sigma ^{2}_{4} \end{bmatrix}$
Banded main diagonal	UN(1)	$\begin{bmatrix} \sigma ^2_1 & 0 & 0 & 0 \\ 0 & \sigma ^2_2 & 0 & 0 \\ 0 & 0 & \sigma ^2_3 & 0 \\ 0 & 0 & 0 & \sigma ^2_4 \end{bmatrix}$
First-order autoregressive	AR(1)	$\sigma ^2\begin{bmatrix} 1 & \rho & \rho ^2 & \rho ^3 \\ \rho & 1 & \rho & \rho ^2 \\ \rho ^2 & \rho & 1 & \rho \\ \rho ^3 & \rho ^2 & \rho & 1 \end{bmatrix}$
Toeplitz	TOEP	$\begin{bmatrix} \sigma ^2 & \sigma _1 & \sigma _2 & \sigma _3 \\ \sigma _1 & \sigma ^2 & \sigma _1 & \sigma _2 \\ \sigma _2 & \sigma _1 & \sigma ^2 & \sigma _1 \\ \sigma _3 & \sigma _2 & \sigma _1 & \sigma ^2 \end{bmatrix}$
Toeplitz with two bands	TOEP(2)	$\begin{bmatrix} \sigma ^2 & \sigma _1 & 0 & 0 \\ \sigma _1 & \sigma ^2 & \sigma _1 & 0 \\ 0 & \sigma _1 & \sigma ^2 & \sigma _1 \\ 0 & 0 & \sigma _1 & \sigma ^2 \end{bmatrix}$
Spatial power	SP(POW)(c)	$\sigma ^2\begin{bmatrix} 1 & \rho ^{d_{12}} & \rho ^{d_{13}} & \rho ^{d_{14}} \\ \rho ^{d_{21}} & 1 & \rho ^{d_{23}} & \rho ^{d_{24}} \\ \rho ^{d_{31}} & \rho ^{d_{32}} & 1 & \rho ^{d_{34}} \\ \rho ^{d_{41}} & \rho ^{d_{42}} & \rho ^{d_{43}} & 1 \end{bmatrix}$
Heterogeneous AR(1)	ARH(1)	$\begin{bmatrix} \sigma _{1}^{2} & \sigma _{1}\sigma _{2}\rho & \sigma _{1}\sigma _{3}\rho ^{2} & \sigma _{1}\sigma _{4}\rho ^{3} \\ \sigma _{2}\sigma _{1}\rho & \sigma _{2}^{2} & \sigma _{2}\sigma _{3}\rho & \sigma _{2}\sigma _{4}\rho ^{2} \\ \sigma _{3}\sigma _{1}\rho ^{2} & \sigma _{3}\sigma _{2}\rho & \sigma _{3}^{2} & \sigma _{3}\sigma _{4}\rho \\ \sigma _{4}\sigma _{1}\rho ^{3} & \sigma _{4}\sigma _{2}\rho & \sigma _{4}\sigma _{3}\rho & \sigma _{4}^{2} \end{bmatrix}$
First-order autoregressive moving average	ARMA(1,1)	$\sigma ^2\begin{bmatrix} 1 & \gamma & \gamma \rho & \gamma \rho ^{2} \\ \gamma & 1 & \gamma & \gamma \rho \\ \gamma \rho & \gamma & 1 & \gamma \\ \gamma \rho ^{2} & \gamma \rho & \gamma & 1 \end{bmatrix}$
Heterogeneous CS	CSH	$\begin{bmatrix} \sigma _{1}^{2} & \sigma _{1}\sigma _{2}\rho & \sigma _{1}\sigma _{3}\rho & \sigma _{1}\sigma _{4}\rho \\ \sigma _{2}\sigma _{1}\rho & \sigma _{2}^{2} & \sigma _{2}\sigma _{3}\rho & \sigma _{2}\sigma _{4}\rho \\ \sigma _{3}\sigma _{1}\rho & \sigma _{3}\sigma _{2}\rho & \sigma _{3}^{2} & \sigma _{3}\sigma _{4}\rho \\ \sigma _{4}\sigma _{1}\rho & \sigma _{4}\sigma _{2}\rho & \sigma _{4}\sigma _{3}\rho & \sigma _{4}^{2} \end{bmatrix}$
First-order factor analytic	FA(1)	$\begin{bmatrix} \lambda _{1}^{2} + d_{1} & \lambda _{1}\lambda _{2} & \lambda _{1}\lambda _{3} & \lambda _{1}\lambda _{4} \\ \lambda _{2}\lambda _{1} & \lambda _{2}^{2} + d_{2} & \lambda _{2}\lambda _{3} & \lambda _{2}\lambda _{4} \\ \lambda _{3}\lambda _{1} & \lambda _{3}\lambda _{2} & \lambda _{3}^{2} + d_{3} & \lambda _{3}\lambda _{4} \\ \lambda _{4}\lambda _{1} & \lambda _{4}\lambda _{2} & \lambda _{4}\lambda _{3} & \lambda _{4}^{2} + d_{4} \end{bmatrix}$
Huynh-Feldt	HF	$\begin{bmatrix} \sigma _{1}^{2} & \frac{\sigma _{1}^{2}+\sigma _{2}^{2}}{2}-\lambda & \frac{\sigma _{1}^{2}+\sigma _{3}^{2}}{2}-\lambda \\ \frac{\sigma _{2}^{2}+\sigma _{1}^{2}}{2}-\lambda & \sigma _{2}^{2} & \frac{\sigma _{2}^{2}+\sigma _{3}^{2}}{2}-\lambda \\ \frac{\sigma _{3}^{2}+\sigma _{1}^{2}}{2}-\lambda & \frac{\sigma _{3}^{2}+\sigma _{2}^{2}}{2}-\lambda & \sigma _{3}^{2} \end{bmatrix}$
First-order antedependence	ANTE(1)	$\begin{bmatrix} \sigma ^2_1 & \sigma _1 \sigma _2 \rho _1 & \sigma _1 \sigma _3 \rho _1 \rho _2 \\ \sigma _2 \sigma _1 \rho _1 & \sigma ^2_2 & \sigma _2 \sigma _3 \rho _2 \\ \sigma _3 \sigma _1 \rho _2 \rho _1 & \sigma _3 \sigma _2 \rho _2 & \sigma ^2_3 \\ \end{bmatrix}$
Heterogeneous Toeplitz	TOEPH	$\begin{bmatrix} \sigma ^2_1 & \sigma _1 \sigma _2 \rho _1 & \sigma _1 \sigma _3 \rho _2 & \sigma _1 \sigma _4 \rho _3 \\ \sigma _2 \sigma _1 \rho _1 & \sigma ^2_2 & \sigma _2 \sigma _3 \rho _1 & \sigma _2 \sigma _4 \rho _2 \\ \sigma _3 \sigma _1 \rho _2 & \sigma _3 \sigma _2 \rho _1 & \sigma ^2_3 & \sigma _3 \sigma _4 \rho _1 \\ \sigma _4 \sigma _1 \rho _3 & \sigma _4 \sigma _2 \rho _2 & \sigma _4 \sigma _3 \rho _1 & \sigma ^2_4 \\ \end{bmatrix}$
Unstructured correlations	UNR	$\begin{bmatrix} \sigma ^2_1 & \sigma _1 \sigma _2 \rho _{21} & \sigma _1 \sigma _3 \rho _{31} & \sigma _1 \sigma _4 \rho _{41} \\ \sigma _2 \sigma _1 \rho _{21} & \sigma ^2_2 & \sigma _2 \sigma _3 \rho _{32} & \sigma _2 \sigma _4 \rho _{42} \\ \sigma _3 \sigma _1 \rho _{31} & \sigma _3 \sigma _2 \rho _{32} & \sigma ^2_3 & \sigma _3 \sigma _4 \rho _{43} \\ \sigma _4 \sigma _1 \rho _{41} & \sigma _4 \sigma _2 \rho _{42} & \sigma _4 \sigma _3 \rho _{43} & \sigma ^2_4 \\ \end{bmatrix}$
Direct product AR(1)	UN@AR(1)	$\begin{bmatrix} \sigma ^2_{1} & \sigma _{21} \\ \sigma _{21} & \sigma ^2_{2} \end{bmatrix} \otimes \begin{bmatrix} 1 & \rho & \rho ^2 \\ \rho & 1 & \rho \\ \rho ^2 & \rho & 1 \end{bmatrix} =$
		$\begin{bmatrix} \sigma ^2_{1} & \sigma ^2_{1}\rho & \sigma ^2_{1} \rho ^2 & \sigma _{21} & \sigma _{21}\rho & \sigma _{21} \rho ^2 \\ \sigma ^2_{1} \rho & \sigma ^2_{1} & \sigma ^2_{1} \rho & \sigma _{21} \rho & \sigma _{21} & \sigma _{21} \rho \\ \sigma ^2_{1} \rho ^2 & \sigma ^2_{1} \rho & \sigma ^2_{1} & \sigma _{21} \rho ^2 & \sigma _{21} \rho & \sigma _{21} \\ \sigma _{21} & \sigma _{21} \rho & \sigma _{21} \rho ^2 & \sigma ^2_{2} & \sigma ^2_{2} \rho & \sigma ^2_{2} \rho ^2 \\ \sigma _{21} \rho & \sigma _{21} & \sigma _{21} \rho & \sigma ^2_{2} \rho & \sigma ^2_{2} & \sigma ^2_{2} \rho \\ \sigma _{21} \rho ^2 & \sigma _{21} \rho & \sigma _{21} & \sigma ^2_{2} \rho ^2 & \sigma ^2_{2} \rho & \sigma ^2_{2} \\ \end{bmatrix}$

The following provides some further information about these covariance structures:

ANTE(1)

specifies the first-order antedependence structure (see Kenward 1987; Patel 1991; Macchiavelli and Arnold 1994). In Table 63.17, $\sigma ^2_ i$ is the ith variance parameter, and $\rho _ k$ is the kth autocorrelation parameter satisfying $|\rho _ k| < 1$ .

AR(1)

specifies a first-order autoregressive structure. PROC MIXED imposes the constraint $|\rho | < 1$ for stationarity.

ARH(1)

specifies a heterogeneous first-order autoregressive structure. As with TYPE=AR(1), PROC MIXED imposes the constraint $|\rho | < 1$ for stationarity.

ARMA(1,1)

specifies the first-order autoregressive moving-average structure. In Table 63.17, $\rho$ is the autoregressive parameter, $\gamma$ models a moving-average component, and $\sigma ^2$ is the residual variance. In the notation of Fuller (1976, p. 68), $\rho = \theta _1$ and

$\gamma = \frac{(1 + b_1\theta _1)(\theta _1 + b_1)}{1 + b^2_1 + 2 b_1 \theta _1}$

The example in Table 63.19 and $|b_1| < 1$ imply that

$b_1 = \frac{\beta - \sqrt {\beta ^2 - 4\alpha ^2}}{2\alpha }$

where $\alpha = \gamma - \rho$ and $\beta = 1 + \rho ^2 - 2\gamma \rho$ . PROC MIXED imposes the constraints $|\rho | < 1$ and $|\gamma | < 1$ for stationarity, although for some values of $\rho$ and $\gamma$ in this region the resulting covariance matrix is not positive definite. When the estimated value of $\rho$ becomes negative, the computed covariance is multiplied by $\cos (\pi d_{ij})$ to account for the negativity.

CS

specifies the compound-symmetry structure, which has constant variance and constant covariance.

CSH

specifies the heterogeneous compound-symmetry structure. This structure has a different variance parameter for each diagonal element, and it uses the square roots of these parameters in the off-diagonal entries. In Table 63.17, $\sigma ^2_ i$ is the ith variance parameter, and $\rho$ is the correlation parameter satisfying $|\rho | < 1$ .

FA(q)

specifies the factor-analytic structure with q factors (Jennrich and Schluchter, 1986). This structure is of the form $\bLambda \bLambda ’ + \bD$ , where $\bLambda$ is a $t \times q$ rectangular matrix and $\bD$ is a $t \times t$ diagonal matrix with t different parameters. When q > 1, the elements of $\bLambda$ in its upper-right corner (that is, the elements in the ith row and jth column for j > i) are set to zero to fix the rotation of the structure.

FA0(q)

is similar to the FA(q) structure except that no diagonal matrix $\bD$ is included. When q < t—that is, when the number of factors is less than the dimension of the matrix—this structure is nonnegative definite but not of full rank. In this situation, you can use it for approximating an unstructured $\bG$ matrix in the RANDOM statement or for combining with the LOCAL option in the REPEATED statement. When q = t, you can use this structure to constrain $\bG$ to be nonnegative definite in the RANDOM statement.

FA1(q)

is similar to the TYPE=FA(q) structure except that all of the elements in $\bD$ are constrained to be equal. This offers a useful and more parsimonious alternative to the full factor-analytic structure.

HF

specifies the Huynh-Feldt covariance structure (Huynh and Feldt, 1970). This structure is similar to the TYPE=CSH structure in that it has the same number of parameters and heterogeneity along the main diagonal. However, it constructs the off-diagonal elements by taking arithmetic rather than geometric means.

You can perform a likelihood ratio test of the Huynh-Feldt conditions by running PROC MIXED twice, once with TYPE=HF and once with TYPE=UN, and then subtracting their respective values of –2 times the maximized likelihood.

If PROC MIXED does not converge under your Huynh-Feldt model, you can specify your own starting values with the PARMS statement. The default MIVQUE(0) starting values can sometimes be poor for this structure. A good choice for starting values is often the parameter estimates corresponding to an initial fit that uses TYPE=CS.

LIN(q)

specifies the general linear covariance structure with q parameters. This structure consists of a linear combination of known matrices that are input with the LDATA= option. This structure is very general, and you need to make sure that the variance matrix is positive definite. By default, PROC MIXED sets the initial values of the parameters to 1. You can use the PARMS statement to specify other initial values.

LINEAR(q)

is an alias for TYPE=LIN(q).

SIMPLE

is an alias for TYPE=VC.

SP(EXPA)(c-list)

specifies the spatial anisotropic exponential structure, where c-list is a list of variables indicating the coordinates. This structure has $(i,j)$ element equal to

$\sigma ^2 \prod _{k=1}^ c \exp \{ -\theta _ k d(i,j,k)^{p_ k}\}$

where c is the number of coordinates and $d(i,j,k)$ is the absolute distance between the kth coordinate ( $k=1,\ldots ,c$ ) of the ith and jth observations in the input data set. There are 2c + 1 parameters to be estimated: $\theta _ k$ , $p_ k$ ( $k=1,\ldots ,c$ ), and $\sigma ^2$ .

You might want to constrain some of the EXPA parameters to known values. For example, suppose you have three coordinate variables C1, C2, and C3 and you want to constrain the powers $p_ k$ to equal 2, as in Sacks et al. (1989). Suppose further that you want to model covariance across the entire input data set and you suspect the $\theta _ k$ and $\sigma ^2$ estimates are close to 3, 4, 5, and 1, respectively. Then specify the following statements:

repeated / type=sp(expa)(c1 c2 c3)
   subject=intercept;
parms (3) (4) (5) (2) (2) (2) (1) /
   hold=4,5,6;

SP(EXPGA)( $\Argument{c}_1 \, \, \Argument{c}_2$ )

specify modification of the isotropic SP(EXP) covariance structure.

SP(GAUGA)( $\Argument{c}_1 \, \, \Argument{c}_2$ )

specify modification of the isotropic SP(GAU) covariance structure.

SP(SPHGA)( $\Argument{c}_1 \, \, \Argument{c}_2$ )

specify modification of the isotropic SP(SPH) covariance structure.

These are structures that allow for geometric anisotropy in two dimensions. The coordinates are specified by the variables c1 and c2.

If the spatial process is geometrically anisotropic in $\mb {c}=[c_{i1}, c_{i2}]$ , then it is isotropic in the coordinate system

$\mb {Ac} = \left[ \begin{array}{cc} 1 & 0 \\ 0 & \lambda \end{array}\right] \left[ \begin{array}{cc} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array} \right] \mb {c} = \mb {c}^*$

for a properly chosen angle $\theta$ and scaling factor $\lambda$ . Elliptical isocorrelation contours are thereby transformed to spherical contours, adding two parameters to the respective isotropic covariance structures. Euclidean distances (see Table 63.18) are expressed in terms of $\mb {c}^*$ .

The angle $\theta$ of the clockwise rotation is reported in radians, $0 \leq \theta \leq 2\pi$ . The scaling parameter $\lambda$ represents the ratio of the range parameters in the direction of the major and minor axis of the correlation contours. In other words, following a rotation of the coordinate system by angle $\theta$ , isotropy is achieved by compressing or magnifying distances in one coordinate by the factor $\lambda$ .

Fixing $\lambda =1.0$ reduces the models to isotropic ones for any angle of rotation. If the scaling parameter is held constant at 1.0, you should also hold constant the angle of rotation, as in the following statements:

repeated / type=sp(expga)(gxc gyc)
           subject=intercept;
parms (6) (1.0) (0.0) (1) / hold=2,3;

If $\lambda$ is fixed at any other value than 1.0, the angle of rotation can be estimated. Specifying a starting grid of angles and scaling factors can considerably improve the convergence properties of the optimization algorithm for these models. Only a single random effect with geometrically anisotropic structure is permitted.

SP(MATERN)(c-list ) | SP(MATHSW)(c-list )

specifies covariance structures in the Matérn class of covariance functions (Matérn, 1986). Two observations for the same subject (block of $\mb {R}$ ) that are Euclidean distance $d_{ij}$ apart have covariance

$\sigma ^2 \frac{1}{\Gamma (\nu )} \left(\frac{d_{ij}}{2\rho }\right)^\nu 2 K_\nu (d_{ij}/\rho ) \qquad \nu > 0, \, \, \rho > 0$

where $K_\nu$ is the modified Bessel function of the second kind of (real) order $\nu > 0$ . The smoothness (continuity) of a stochastic process with covariance function in this class increases with $\nu$ . The Matérn class thus enables data-driven estimation of the smoothness properties. The covariance is identical to the exponential model for $\nu = 0.5$ (TYPE=SP(EXP)(c-list)), while for $\nu =1$ the model advocated by Whittle (1954) results. As $\nu \rightarrow \infty$ the model approaches the gaussian covariance structure (TYPE=SP(GAU)(c-list)).

The MATHSW structure represents the Matérn class in the parameterization of Handcock and Stein (1993) and Handcock and Wallis (1994),

$\sigma ^2\frac{1}{\Gamma (\nu )} \left(\frac{d_{ij}\sqrt {\nu }}{\rho }\right)^{\nu } 2 K_{\nu }\left(\frac{2d_{ij}\sqrt {\nu }}{\rho }\right)$

Since computation of the function $K_{\nu }$ and its derivatives is numerically very intensive, fitting models with Matérn covariance structures can be more time-consuming than with other spatial covariance structures. Good starting values are essential.

SP(POW)(c-list) | SP(POWA)(c-list)

specifies the spatial power structures. When the estimated value of $\rho$ becomes negative, the computed covariance is multiplied by $\cos (\pi d_{ij})$ to account for the negativity.

TOEP<(q)>

specifies a banded Toeplitz structure. This can be viewed as a moving-average structure with order equal to $\Argument{q}-1$ . The TYPE=TOEP option is a full Toeplitz matrix, which can be viewed as an autoregressive structure with order equal to the dimension of the matrix. The specification TYPE=TOEP(1) is the same as $\sigma ^2 I$ , where I is an identity matrix, and it can be useful for specifying the same variance component for several effects.

TOEPH<(q)>

specifies a heterogeneous banded Toeplitz structure. In Table 63.17, $\sigma ^2_ i$ is the ith variance parameter and $\rho _ j$ is the jth correlation parameter satisfying $|\rho _ j| < 1$ . If you specify the order parameter q, then PROC MIXED estimates only the first q bands of the matrix, setting all higher bands equal to 0. The option TOEPH(1) is equivalent to both the TYPE=UN(1) and TYPE=UNR(1) options.

UN<(q)>

specifies a completely general (unstructured) covariance matrix parameterized directly in terms of variances and covariances. The variances are constrained to be nonnegative, and the covariances are unconstrained. This structure is not constrained to be nonnegative definite in order to avoid nonlinear constraints; however, you can use the TYPE=FA0 structure if you want this constraint to be imposed by a Cholesky factorization. If you specify the order parameter q, then PROC MIXED estimates only the first q bands of the matrix, setting all higher bands equal to 0.

UNR<(q)>

specifies a completely general (unstructured) covariance matrix parameterized in terms of variances and correlations. This structure fits the same model as the TYPE=UN(q) option but with a different parameterization. The ith variance parameter is $\sigma ^2_ i$ . The parameter $\rho _{jk}$ is the correlation between the jth and kth measurements; it satisfies $|\rho _{jk}| < 1$ . If you specify the order parameter r, then PROC MIXED estimates only the first q bands of the matrix, setting all higher bands equal to zero.

UN@AR(1) | UN@CS | UN@UN

specify direct (Kronecker) product structures designed for multivariate repeated measures (see Galecki 1994). These structures are constructed by taking the Kronecker product of an unstructured matrix (modeling covariance across the multivariate observations) with an additional covariance matrix (modeling covariance across time or another factor). The upper-left value in the second matrix is constrained to equal 1 to identify the model. See the SAS/IML User's Guide for more details about direct products.

To use these structures in the REPEATED statement, you must specify two distinct REPEATED effects, both of which must be included in the CLASS statement. The first effect indicates the multivariate observations, and the second identifies the levels of time or some additional factor. Note that the input data set must still be constructed in “univariate” format; that is, all dependent observations are still listed observation-wise in one single variable. Although this construction provides for general modeling possibilities, it forces you to construct variables indicating both dimensions of the Kronecker product.

For example, suppose your observed data consist of heights and weights of several children measured over several successive years. Your input data set should then contain variables similar to the following:

Y, all of the heights and weights, with a separate observation for each
Var, indicating whether the measurement is a height or a weight
Year, indicating the year of measurement
Child, indicating the child on which the measurement was taken

Your PROC MIXED statements for a Kronecker AR(1) structure across years would then be as follows:

proc mixed;
   class Var Year Child;
   model Y = Var Year Var*Year;
   repeated Var Year / type=un@ar(1)
                       subject=Child;
run;

You should nearly always want to model different means for the multivariate observations; hence the inclusion of Var in the MODEL statement. The preceding mean model consists of cell means for all combinations of VAR and YEAR.

VC

specifies standard variance components and is the default structure for both the RANDOM and REPEATED statements. In the RANDOM statement, a distinct variance component is assigned to each effect. In the REPEATED statement, this structure is usually used only with the GROUP= option to specify a heterogeneous variance model.

Jennrich and Schluchter (1986) provide general information about the use of covariance structures, and Wolfinger (1996) presents details about many of the heterogeneous structures. Modeling with spatial covariance structures is discussed in many sources (Marx and Thompson, 1987; Zimmerman and Harville, 1991; Cressie, 1993; Brownie, Bowman, and Burton, 1993; Stroup, Baenziger, and Mulitze, 1994; Brownie and Gumpertz, 1997; Gotway and Stroup, 1997; Chilès and Delfiner, 1999; Schabenberger and Gotway, 2005; Littell et al., 2006).