The GLIMMIX Procedure

Residual-Based Estimators

The GLIMMIX procedure can compute the classical sandwich estimator of the covariance matrix of the fixed effects, as well as several bias-adjusted estimators. This requires that the model be either an (overdispersed) GLM or a GLMM that can be processed by subjects (see the section Processing by Subjects).

Consider a statistical model of the form

\[ \mb{Y} = \bmu + \bepsilon , \qquad \bepsilon \sim (\mb{0},\bSigma ) \]

The general expression of a sandwich covariance estimator is then

\[ c \times \widehat{\bOmega } \left( \sum _{i=1}^ m \mb{A}_ i \widehat{\mb{D}}_ i' \widehat{\bSigma }_ i^{-1} \mb{F}_ i' \mb{e}_ i\mb{e}_ i' \mb{F}_ i \widehat{\bSigma }_ i^{-1} \widehat{\mb{D}}_ i \mb{A}_ i \right) \widehat{\bOmega } \]

where $\mb{e}_ i = \mb{y}_ i - \widehat{\bmu }_ i$ and $\bOmega = (\mb{D}'\bSigma ^{-1}\mb{D})^{-}$; the superscript $-$ denotes a generalized inverse.

For a GLMM estimated by one of the pseudo-likelihood techniques that involve linearization, you can make the following substitutions: $\mb{Y} \rightarrow \mb{P}$, $\bSigma \rightarrow \mb{V}(\btheta )$, $\mb{D} \rightarrow \mb{X}$, $\widehat{\bmu } \rightarrow \mb{X}\widehat{\bbeta }$. These matrices are defined in the section Pseudo-likelihood Estimation Based on Linearization.
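Written out, these substitutions would give the following form of the sandwich expression. This is only a direct transcription of the general formula above; the symbol $\mb{r}_ i$ for the pseudo-data residual of subject i and the subject subscripts on $\mb{X}$, $\mb{P}$, and $\mb{V}$ are notation assumed here for illustration.

\[ c \times \widehat{\bOmega } \left( \sum _{i=1}^ m \mb{A}_ i \mb{X}_ i' \mb{V}_ i(\widehat{\btheta })^{-1} \mb{F}_ i' \mb{r}_ i\mb{r}_ i' \mb{F}_ i \mb{V}_ i(\widehat{\btheta })^{-1} \mb{X}_ i \mb{A}_ i \right) \widehat{\bOmega }, \qquad \mb{r}_ i = \mb{p}_ i - \mb{X}_ i\widehat{\bbeta }, \qquad \widehat{\bOmega } = (\mb{X}'\mb{V}(\widehat{\btheta })^{-1}\mb{X})^{-} \]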

The various estimators computed by the GLIMMIX procedure differ in the choice of the constant c and the matrices $\mb{F}_ i$ and $\mb{A}_ i$. You obtain the classical estimator, for example, with c = 1 and with both $\mb{F}_ i$ and $\mb{A}_ i$ equal to the identity matrix.
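As a rough illustration of the classical case, the following Python sketch evaluates the sandwich expression for a straight-line model with independent observations, so that every observation is its own subject. It is not how PROC GLIMMIX computes the estimator. The function names are hypothetical, and the sketch assumes that $\mb{D}$ can be taken to be the design matrix (as in a linear model with identity link) and that all observations share one variance, which then cancels between the "bread" and the "meat".

```python
# Illustrative sketch only: classical sandwich estimator (c = 1, F_i = A_i = I)
# for the model y = b0 + b1 * x with one observation per subject.
# Assumptions: D_i is the row (1, x_i); Sigma_i is a common scalar variance,
# which cancels out of the sandwich and is therefore omitted.

def inv2(a):
    """Invert a 2x2 matrix stored as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def matmul2(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def classical_sandwich(x, y):
    """Return (beta, cov): least squares estimates and their sandwich covariance."""
    rows = [(1.0, float(xi)) for xi in x]

    # Bread: Omega = (D'D)^{-1}, up to the variance factor that cancels.
    dtd = [[sum(r[i] * r[j] for r in rows) for j in range(2)] for i in range(2)]
    omega = inv2(dtd)

    # Least squares estimates and residuals e_i = y_i - mu_hat_i.
    dty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(2)]
    beta = [sum(omega[i][j] * dty[j] for j in range(2)) for i in range(2)]
    resid = [yi - (beta[0] * r[0] + beta[1] * r[1]) for r, yi in zip(rows, y)]

    # Meat: sum over subjects of D_i' e_i e_i' D_i.
    meat = [[sum(e * e * r[i] * r[j] for r, e in zip(rows, resid))
             for j in range(2)] for i in range(2)]

    return beta, matmul2(matmul2(omega, meat), omega)
```

A call such as `classical_sandwich([0, 1, 2, 3], [1, 3, 2, 5])` returns the two coefficient estimates together with a symmetric 2 x 2 covariance matrix.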

The EMPIRICAL=ROOT estimator of Kauermann and Carroll (2001) is based on the approximation

\[ \mr{Var}\left[ \mb{e}_ i\mb{e}_ i' \right] \approx (\mb{I} - \mb{H}_ i)\bSigma _ i \]

where $\mb{H}_ i = \mb{D}_ i\bOmega \mb{D}_ i'\bSigma _ i^{-1}$. The EMPIRICAL=FIRORES estimator is based on the approximation

\[ \mr{Var}\left[ \mb{e}_ i\mb{e}_ i' \right] \approx (\mb{I} - \mb{H}_ i)\bSigma _ i(\mb{I} - \mb{H}_ i') \]

of Mancl and DeRouen (2001). Finally, the EMPIRICAL=FIROEEQ estimator is based on approximating an unbiased estimating equation (Fay and Graubard 2001). For this estimator, $\mb{A}_ i$ is a diagonal matrix with entries

\[ [\mb{A}_ i]_{jj} = \left(1 - \mr{min}\{ r,[\mb{Q}]_{jj} \} \right)^{-1/2} \]

where $\mb{Q} = \mb{D}_ i'\widehat{\bSigma }_ i^{-1}\mb{D}_ i \widehat{\bOmega }$ is computed separately for each subject. The optional number $0 \leq r < 1$ provides an upper bound on the correction factor. For r = 0, the classical sandwich estimator results. The default value in PROC GLIMMIX is $r=3/4$, in which case the diagonal entries of $\mb{A}_ i$ are no greater than $(1 - 3/4)^{-1/2} = 2$.
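The effect of the bound r on a single diagonal entry of $\mb{A}_ i$ can be checked with a few lines of Python. This is a minimal sketch of the formula above, not PROC GLIMMIX code; the function name `firoeeq_factor` and its argument `q_jj` (standing for $[\mb{Q}]_{jj}$) are hypothetical.

```python
def firoeeq_factor(q_jj, r=0.75):
    """Diagonal entry of A_i: (1 - min(r, q_jj)) ** (-1/2), with 0 <= r < 1.

    q_jj stands for the j-th diagonal element of Q (hypothetical argument name).
    The default r = 0.75 mirrors the default bound stated in the text.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    return (1.0 - min(r, q_jj)) ** -0.5
```

With the default bound, any `q_jj` of 0.75 or more is capped, so the factor never exceeds 2; with `r=0` and a nonnegative `q_jj`, the factor is always 1 and the classical estimator results.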

Table 45.24 summarizes the components of the computation for the GLMM based on linearization, where m denotes the number of subjects and k is the rank of $\mb{X}$.

Table 45.24: Empirical Covariance Estimators for a Linearized GLMM

| EMPIRICAL= | c | $\mb{A}_ i$ | $\mb{F}_ i$ |
|---|---|---|---|
| CLASSICAL | 1 | $\mb{I}$ | $\mb{I}$ |
| DF | $\left\{  \begin{array}{ll} \frac{m}{m-k} &  m > k \\ 1 &  \mr{otherwise} \end{array} \right.$ | $\mb{I}$ | $\mb{I}$ |
| ROOT | 1 | $\mb{I}$ | $(\mb{I} - \mb{H}_ i')^{-1/2}$ |
| FIRORES | 1 | $\mb{I}$ | $(\mb{I} - \mb{H}_ i')^{-1}$ |
| FIROEEQ(r) | 1 | $\mr{Diag}\{  ( 1 - \mr{min}\{  r,[\mb{Q}]_{jj} \} )^{-1/2} \} $ | $\mb{I}$ |
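To make the table entries concrete, the following Python sketch returns the constant c and the factor $\mb{F}_ i$ in the simplest setting, where each subject contributes a single observation and $\mb{H}_ i$ reduces to a scalar leverage. The function name and the argument `h_i` are hypothetical, and the scalar simplification is an assumption of this sketch. FIROEEQ is left out because its adjustment enters through $\mb{A}_ i$ rather than $\mb{F}_ i$.

```python
def empirical_components(kind, m, k, h_i):
    """Return (c, f_i) for a subject with one observation and scalar leverage h_i.

    kind : "CLASSICAL", "DF", "ROOT", or "FIRORES"
    m    : number of subjects
    k    : rank of X
    h_i  : scalar version of H_i (assumed to satisfy 0 <= h_i < 1)
    A_i is the identity for all four estimators handled here.
    """
    kind = kind.upper()
    if kind == "CLASSICAL":
        return 1.0, 1.0
    if kind == "DF":
        # Degrees-of-freedom inflation applies only when m exceeds k.
        return (m / (m - k) if m > k else 1.0), 1.0
    if kind == "ROOT":
        return 1.0, (1.0 - h_i) ** -0.5
    if kind == "FIRORES":
        return 1.0, 1.0 / (1.0 - h_i)
    raise ValueError("unsupported estimator: " + kind)
```

Because the residual enters the sandwich as $\mb{F}_ i'\mb{e}_ i\mb{e}_ i'\mb{F}_ i$, the squared residual is multiplied by the square of the returned factor: by $1/(1-h_i)$ for ROOT and by $1/(1-h_i)^2$ for FIRORES in this scalar setting.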


Computation of an empirical variance estimator requires that the data can be processed by independent sampling units. This is always possible in GLMs, where m equals the sum of all frequencies. In GLMMs, the empirical estimators require that the data consist of multiple subjects; m then equals the number of subjects reported in the "Dimensions" table. The following section discusses how the GLIMMIX procedure determines whether the data can be processed by subjects. The section GLM Mode or GLMM Mode explains how PROC GLIMMIX determines whether a model is fit in GLM mode or in GLMM mode.