The GLIMMIX Procedure

Generalized Linear Models Theory

A generalized linear model consists of the following:

  • a linear predictor $\eta = \mb{x}'\bbeta $

  • a monotonic mapping between the mean of the data and the linear predictor

  • a response distribution in the exponential family of distributions

A density or mass function in this family can be written as

\[ f(y) = \exp \left\{ \frac{y\theta - b(\theta )}{\phi } + c(y,f(\phi )) \right\} \]

for some functions $b(\cdot )$ and $c(\cdot )$. The parameter $\theta $ is called the natural (canonical) parameter. The parameter $\phi $ is a scale parameter, and it is not present in all exponential family distributions. See Table 45.20 for a list of distributions for which $\phi \equiv 1$. When observations are weighted, the scale parameter is replaced with $\phi /w$ in the preceding density (or mass function), where $w$ is the weight associated with the observation $y$.
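As an illustration of this form (a standard textbook example, added here for orientation and not taken from the table referenced above), the Poisson mass function with mean $\mu $ can be rewritten as

\[ f(y) = \frac{\mu ^ y e^{-\mu }}{y!} = \exp \left\{ y \log \mu - \mu - \log (y!) \right\} \]

Matching terms with the general expression gives $\theta = \log \mu $, $b(\theta ) = e^{\theta }$, $\phi \equiv 1$, and $c(y,f(\phi )) = -\log (y!)$.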

The mean and variance of the data are related to the components of the density: $\mr{E}[Y] = \mu = b'(\theta )$ and $\mr{Var}[Y] = \phi b''(\theta )$, where primes denote first and second derivatives with respect to $\theta $. If you express $\theta $ as a function of $\mu $, the relationship is known as the natural link or the canonical link function. In other words, modeling data with a canonical link assumes that $\theta = \mb{x}'\bbeta $; the effect contributions are additive on the canonical scale. The second derivative of $b(\cdot )$, expressed as a function of $\mu $, is the variance function of the generalized linear model, $a(\mu ) = b''(\theta (\mu ))$.

Because of this relationship, the distribution determines both the variance function and the canonical link function. You cannot, however, proceed in the opposite direction: a variance function alone does not identify the distribution. If you provide a user-specified variance function, the GLIMMIX procedure assumes that only the first two moments of the response distribution are known. The full distribution of the data is then unknown, and maximum likelihood estimation is not possible. Instead, the GLIMMIX procedure estimates parameters by quasi-likelihood.
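The moment relationships above can be checked numerically. The following Python sketch differentiates $b(\theta )$ by finite differences for three textbook families and compares the results with the known mean and variance function. The cumulant functions, the choice $\phi = 1$, and all helper names (`FAMILIES`, `check_family`, and so on) are assumptions made for illustration; nothing here reflects how the GLIMMIX procedure computes these quantities internally.

```python
import math


def derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


# Hypothetical lookup table: cumulant function b(theta), canonical link
# theta(mu), and the known variance function a(mu) for each family.
# The scale parameter phi is taken to be 1 throughout (an assumption).
FAMILIES = {
    "poisson": {
        "b": math.exp,
        "theta": math.log,
        "variance": lambda mu: mu,
    },
    "bernoulli": {
        "b": lambda t: math.log1p(math.exp(t)),
        "theta": lambda mu: math.log(mu / (1.0 - mu)),
        "variance": lambda mu: mu * (1.0 - mu),
    },
    "normal": {
        "b": lambda t: 0.5 * t * t,
        "theta": lambda mu: mu,
        "variance": lambda mu: 1.0,
    },
}


def check_family(name, mu):
    """Return (b'(theta), b''(theta), a(mu)) evaluated at theta = theta(mu)."""
    fam = FAMILIES[name]
    theta = fam["theta"](mu)                   # canonical link applied to mu
    mean = derivative(fam["b"], theta)         # should recover mu
    var_fn = second_derivative(fam["b"], theta)  # should match a(mu)
    return mean, var_fn, fam["variance"](mu)


if __name__ == "__main__":
    for name, mu in [("poisson", 2.5), ("bernoulli", 0.3), ("normal", 1.7)]:
        mean, var_fn, expected = check_family(name, mu)
        print(f"{name}: b'={mean:.6f} (mu={mu}), b''={var_fn:.6f}, a(mu)={expected:.6f}")
```

In each case the first derivative should reproduce $\mu $ and the second derivative should agree with $a(\mu )$ up to finite-difference error, which is the sense in which the distribution fixes both the canonical link and the variance function.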