Estimation is more difficult in the mixed model than in the general linear model. Not only do you have $\boldsymbol{\beta}$ as in the general linear model, but you also have unknown parameters in $\boldsymbol{\gamma}$, $\mathbf{G}$, and $\mathbf{R}$. Least squares is no longer the best method. Generalized least squares (GLS) is more appropriate, minimizing
$$(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})' \mathbf{V}^{-1} (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$
However, GLS requires knowledge of $\mathbf{V}$ and therefore knowledge of $\mathbf{G}$ and $\mathbf{R}$. Lacking such information, one approach is to use an estimated GLS, in which you insert some reasonable estimate for $\mathbf{V}$ into the minimization problem. The goal thus becomes finding a reasonable estimate of $\mathbf{G}$ and $\mathbf{R}$.
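The estimated GLS step can be sketched in a few lines of plain Python. This is only an illustration, not the procedure's implementation: the data, the random-intercept layout, and the plugged-in variance values `g_hat` and `r_hat` are all made up, and the helper names `solve` and `gls` are hypothetical. It assumes the usual mixed-model form in which the estimate of V is built as within-subject covariance plus residual variance on the diagonal.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gls(X, y, V):
    """Return the beta that minimizes (y - X beta)' V^{-1} (y - X beta)."""
    n, p = len(X), len(X[0])
    Vinv_y = solve(V, y)
    # Column j of V^{-1} X, obtained by solving V z = (column j of X).
    Vinv_X_cols = [solve(V, [X[i][j] for i in range(n)]) for j in range(p)]
    XtVinvX = [[sum(X[i][a] * Vinv_X_cols[b][i] for i in range(n))
                for b in range(p)] for a in range(p)]
    XtVinvy = [sum(X[i][a] * Vinv_y[i] for i in range(n)) for a in range(p)]
    return solve(XtVinvX, XtVinvy)


def same_subject(i, j):
    """Hypothetical layout: observations 0-1 belong to subject A, 2-3 to subject B."""
    return i // 2 == j // 2


# Made-up data: intercept plus one covariate, two observations per subject.
X = [[1, 0], [1, 1], [1, 0], [1, 1]]
y = [1.0, 2.1, 2.9, 4.2]

# Assumed plug-in estimates: random-intercept variance and residual variance.
g_hat, r_hat = 2.0, 1.0
V_hat = [[(g_hat if same_subject(i, j) else 0.0) + (r_hat if i == j else 0.0)
          for j in range(4)] for i in range(4)]

beta_hat = gls(X, y, V_hat)
print(beta_hat)
```

Solving linear systems with V rather than forming its inverse explicitly is a common numerical choice; the quality of the resulting estimate of β still depends entirely on how reasonable the plugged-in estimates of G and R are.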
In many situations, the best approach is to use likelihood-based methods, exploiting the assumption that $\boldsymbol{\gamma}$ and $\boldsymbol{\epsilon}$
are normally distributed (Hartley and Rao 1967; Patterson and Thompson 1971; Harville 1977; Laird and Ware 1982; Jennrich
and Schluchter 1986). PROC HPLMIXED implements two likelihood-based methods: maximum likelihood (ML) and restricted (residual)
maximum likelihood (REML). A favorable theoretical property of ML and REML is that they accommodate data that are missing
at random (Rubin 1976; Little 1995).
PROC HPLMIXED constructs an objective function associated with ML or REML and maximizes it over all unknown parameters. Using
calculus, it is possible to reduce this maximization problem to one over only the parameters in $\mathbf{G}$ and $\mathbf{R}$.
The corresponding log-likelihood functions are as follows:
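$$\text{ML:} \quad l(\mathbf{G},\mathbf{R}) = -\frac{1}{2}\log|\mathbf{V}| - \frac{1}{2}\mathbf{r}'\mathbf{V}^{-1}\mathbf{r} - \frac{n}{2}\log(2\pi)$$
$$\text{REML:} \quad l_R(\mathbf{G},\mathbf{R}) = -\frac{1}{2}\log|\mathbf{V}| - \frac{1}{2}\log|\mathbf{X}'\mathbf{V}^{-1}\mathbf{X}| - \frac{1}{2}\mathbf{r}'\mathbf{V}^{-1}\mathbf{r} - \frac{n-p}{2}\log(2\pi)$$
where $\mathbf{r} = \mathbf{y} - \mathbf{X}(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\mathbf{X}'\mathbf{V}^{-1}\mathbf{y}$ and $p$ is the rank of $\mathbf{X}$.
By default, PROC HPLMIXED actually minimizes a normalized form of $-2$
times these functions by using a ridge-stabilized Newton-Raphson algorithm. Lindstrom and Bates (1988) provide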
reasons for preferring Newton-Raphson to the expectation-maximization (EM) algorithm described in Dempster, Laird, and Rubin (1977)
and Laird, Lange, and Stram (1987), in addition to analytical details for implementing a QR-decomposition approach to the
problem. Wolfinger, Tobias, and Sall (1994) present the sweep-based algorithms that are implemented in PROC HPLMIXED. You
can change the optimization technique with the TECHNIQUE= option in the PROC HPLMIXED statement.
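To make the difference between the ML and REML objectives concrete, the following sketch evaluates −2 times each log likelihood for the simplest possible case and minimizes it numerically. Everything here is an assumption for illustration: the model is an intercept-only normal sample with a single variance parameter (so X is a column of ones and V is the variance times the identity), the data are made up, and the optimizer is a generic golden-section search rather than the ridge-stabilized Newton-Raphson algorithm that the procedure uses.

```python
import math


def neg2_loglik(sigma2, y, reml=False):
    """-2 times the ML or REML log likelihood for y_i ~ N(mu, sigma2), i.i.d.

    With X a column of ones (p = 1) and V = sigma2 * I:
    log|V| = n*log(sigma2), X'V^{-1}X = n/sigma2, and r = y - mean(y).
    """
    n, p = len(y), 1
    ybar = sum(y) / n
    quad = sum((v - ybar) ** 2 for v in y) / sigma2   # r' V^{-1} r
    log_det_v = n * math.log(sigma2)                  # log|V|
    if reml:
        log_det_xvx = math.log(n / sigma2)            # log|X' V^{-1} X|
        return log_det_v + log_det_xvx + quad + (n - p) * math.log(2 * math.pi)
    return log_det_v + quad + n * math.log(2 * math.pi)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d      # minimum lies in [a, d]
        else:
            a = c      # minimum lies in [c, b]
    return (a + b) / 2


y = [4.1, 5.3, 3.8, 6.0, 5.1]                      # made-up data
ybar = sum(y) / len(y)
ss = sum((v - ybar) ** 2 for v in y)               # residual sum of squares

ml_hat = golden_section_min(lambda s: neg2_loglik(s, y), 1e-3, 10.0)
reml_hat = golden_section_min(lambda s: neg2_loglik(s, y, reml=True), 1e-3, 10.0)

# In this special case the minimizers have closed forms to compare against.
print(ml_hat, ss / len(y))          # ML divides by n
print(reml_hat, ss / (len(y) - 1))  # REML divides by n - p
```

In this toy case the extra log-determinant term in the REML objective is what changes the divisor from n to n − p, which is why REML variance estimates are larger than their ML counterparts here.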
One advantage of using the Newton-Raphson algorithm is that the second derivative matrix of the objective function evaluated
at the optimum is available upon completion. Denoting this matrix $\mathbf{H}$, the asymptotic theory of maximum likelihood (Serfling 1980) shows that
$2\mathbf{H}^{-1}$ is an asymptotic variance-covariance matrix of the estimated parameters of $\mathbf{G}$
and $\mathbf{R}$.
Thus, tests and confidence intervals based on asymptotic normality can be obtained. However, these can be unreliable in
small samples, especially for parameters such as variance components that have sampling distributions that tend to be skewed
to the right.
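A small worked case may help show where the factor of 2 comes from. Assume, purely for illustration, an intercept-only normal sample $y_1, \ldots, y_n$ with one variance parameter $\sigma^2$, ML estimation, and the objective taken as $-2$ times the log likelihood without any normalization. The symbols $f$, $H$ (a scalar here), and $SS = \sum_i (y_i - \bar{y})^2$ are notation introduced for this sketch.

```latex
\begin{aligned}
f(\sigma^2) &= -2\,l(\sigma^2) = n\log\sigma^2 + \frac{SS}{\sigma^2} + n\log(2\pi), \\
f'(\sigma^2) &= \frac{n}{\sigma^2} - \frac{SS}{\sigma^4}
  \quad\Rightarrow\quad \hat\sigma^2 = \frac{SS}{n}, \\
H = f''(\hat\sigma^2) &= -\frac{n}{\hat\sigma^4} + \frac{2\,SS}{\hat\sigma^6}
  = \frac{n}{\hat\sigma^4}, \\
2H^{-1} &= \frac{2\hat\sigma^4}{n}.
\end{aligned}
```

The result agrees with the familiar large-sample variance $2\sigma^4/n$ of the ML variance estimator in a normal sample, evaluated at the estimate. It also illustrates the small-sample caution: the symmetric 95% interval $\hat\sigma^2\,(1 \pm 1.96\sqrt{2/n})$ has a negative lower limit whenever $n \le 7$, even though a variance cannot be negative.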
If a residual variance $\sigma^2$ is a part of your mixed model, it can usually be profiled out of the likelihood. This means solving analytically for the optimal
$\sigma^2$ and plugging this expression back into the likelihood formula (Wolfinger, Tobias, and Sall 1994). This reduces the number
of optimization parameters by 1 and can improve convergence properties. PROC HPLMIXED profiles the residual variance out of
the log likelihood.
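The profiling step can be sketched for the ML case as follows. The sketch assumes that the marginal covariance factors as $\mathbf{V} = \sigma^2 \mathbf{V}_*$, where $\mathbf{V}_*$ depends only on the remaining covariance parameters; $\mathbf{V}_*$ is notation introduced here, not taken from the procedure. Because rescaling $\mathbf{V}$ by a constant leaves the GLS residual $\mathbf{r}$ unchanged, $\mathbf{r}$ does not depend on $\sigma^2$.

```latex
\begin{aligned}
-2\,l &= n\log\sigma^2 + \log|\mathbf{V}_*|
        + \frac{\mathbf{r}'\mathbf{V}_*^{-1}\mathbf{r}}{\sigma^2} + n\log(2\pi), \\
\frac{\partial(-2\,l)}{\partial\sigma^2} = 0
  &\quad\Rightarrow\quad
  \hat\sigma^2 = \frac{\mathbf{r}'\mathbf{V}_*^{-1}\mathbf{r}}{n}, \\
-2\,l\,\big|_{\sigma^2=\hat\sigma^2}
  &= \log|\mathbf{V}_*| + n\log\!\left(\mathbf{r}'\mathbf{V}_*^{-1}\mathbf{r}\right)
     + n\left[1 + \log(2\pi/n)\right].
\end{aligned}
```

The last line no longer contains $\sigma^2$, so the optimizer works only with the parameters inside $\mathbf{V}_*$. Under REML the same steps lead to the divisor $n - p$ in place of $n$ when $\mathbf{X}$ has full column rank $p$.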