The IRT Procedure

Approximating the Marginal Likelihood

Subsections:

Gauss-Hermite (G-H) Quadrature
Adaptive Gauss-Hermite Quadrature

As discussed in the section Marginal Likelihood, the integrals in the marginal likelihood for IRT models cannot be solved analytically and must be approximated by numerical integration, most commonly Gauss-Hermite quadrature.

Gauss-Hermite (G-H) Quadrature

In general, the Gauss-Hermite (G-H) quadrature can be presented as

$\int _{-\infty }^{\infty } g(x) \, dx=\int _{-\infty }^{\infty } f(x)\phi (x) \, dx \approx \sum \limits _{g=1}^ G f(x_ g)w_ g$

where G is the number of quadrature points and $x_ g$ and $w_ g$ are the integration points and weights, respectively, which are uniquely determined by the integration domain and the weighting kernel $\phi (x)$ . Traditional G-H quadrature often uses $e^{-x^2}$ as the weighting kernel. In statistics, the density of the standard normal distribution is more widely used instead, because the Gaussian density is often a factor of the integrand when statistical models are estimated. When the Gaussian density is not a factor of the integrand, the integral is transformed into this form by dividing and multiplying the original integrand by the standard normal density.
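The rule above can be sketched in a few lines of Python. This is an illustration only, not the procedure's implementation: it assumes NumPy's `hermegauss` routine (whose kernel is $e^{-x^2/2}$) and rescales its weights so that the kernel becomes the standard normal density. The helper names `gh_standard_normal` and `gh_integrate` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gh_standard_normal(G):
    """Return G points and weights for the standard normal weighting kernel."""
    # hermegauss uses the weight function exp(-x^2 / 2); dividing the weights
    # by sqrt(2*pi) turns that kernel into the standard normal density.
    x, w = hermegauss(G)
    return x, w / np.sqrt(2.0 * np.pi)


def gh_integrate(g, G):
    """Approximate the integral of g over the real line with G points."""
    x, w = gh_standard_normal(G)
    phi = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    # f(x) = g(x) / phi(x): divide and multiply by the standard normal density.
    return np.sum(w * g(x) / phi)


# Example: the integral of x^2 * exp(-x^2 / 2) equals sqrt(2*pi).
approx = gh_integrate(lambda x: x**2 * np.exp(-0.5 * x**2), G=5)
```

Here the Gaussian factor is already part of the integrand, so $f(x)$ is a polynomial and five points reproduce the integral up to rounding error.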

Adaptive Gauss-Hermite Quadrature

The G-point G-H quadrature is exact if $f(x)$ is a polynomial in x of degree at most $2G-1$ . However, as many researchers (Lesaffre and Spiessens 2001; Rabe-Hesketh, Skrondal, and Pickles 2002) point out, integrands $f(\mb{u}_ i|\bm {\eta })\phi (\bm {\eta };\bmu ,\bSigma )$ often have sharp peaks and cannot be well approximated by low-degree polynomials in $\bm {\eta }$ . Furthermore, the peak might be far from zero or be located between adjacent quadrature points so that substantial contributions to the integral are lost.
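Both points can be checked numerically. The following sketch again assumes NumPy's `hermegauss`; the helper `normal_expectation` and the toy peaked function are hypothetical choices made for illustration, not part of the IRT model. A G-point rule integrates polynomials of degree up to $2G-1$ exactly, but it nearly misses a narrow peak that falls between two quadrature points.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def normal_expectation(f, G):
    """G-point G-H approximation of E[f(X)] for X ~ N(0, 1)."""
    x, w = hermegauss(G)
    return np.sum(w * f(x)) / np.sqrt(2.0 * np.pi)


# Polynomial exactness with G = 3 points (exact up to degree 2G - 1 = 5).
exact_deg4 = normal_expectation(lambda x: x**4, 3)    # true value is 3
inexact_deg6 = normal_expectation(lambda x: x**6, 3)  # true value is 15; the rule gives 9

# A sharp peak at m = 2 with width s = 0.1, which lies between the
# 5-point nodes near 1.36 and 2.86.
m, s = 2.0, 0.1
peaked = lambda x: np.exp(-0.5 * ((x - m) / s) ** 2)
peaked_approx = normal_expectation(peaked, 5)
# Closed-form value of E[peaked(X)] for this Gaussian-shaped function.
peaked_true = s / np.sqrt(1.0 + s**2) * np.exp(-0.5 * m**2 / (1.0 + s**2))
```

In this toy case `peaked_approx` is orders of magnitude smaller than `peaked_true`, because none of the five points lands where the function has appreciable mass.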

Note that each integrand in the marginal likelihood is the product of the prior density of $\bm {\eta }$ , $\phi (\bm {\eta };\bmu ,\bSigma )$ , and the joint probability of responses given $\bm {\eta }$ , $f(\mb{u}_ i|\bm {\eta })$ . After normalization with respect to $\bm {\eta }$ , the integrand, $f(\mb{u}_ i|\bm {\eta })\phi (\bm {\eta };\bmu ,\bSigma )$ , is just the posterior density of $\bm {\eta }$ , given the observed responses $\mb{u}_ i$ . This posterior density is approximately normal when the number of items is large. Let $\bmu _ i$ and $\bSigma _ i$ be the mean and covariance matrix, respectively, of the posterior density. Then the ratio $\frac{f(\mb{u}_ i|\bm {\eta })\phi (\bm {\eta };\bmu ,\bSigma )}{\phi (\bm {\eta };\bmu _ i,\bSigma _ i)}$ can be well approximated by a low-degree polynomial if the number of items is relatively large. This suggests that the integral should be transformed as

$\begin{equation*} \int f(\mb{u}_ i|\bm {\eta })\phi (\bm {\eta };\bmu ,\bSigma )\, d\bm {\eta } = \int \frac{f(\mb{u}_ i|\bm {\eta })\phi (\bm {\eta };\bmu ,\bSigma )}{\phi (\bm {\eta };\bmu _ i,\bSigma _ i)}\phi (\bm {\eta };\bmu _ i,\bSigma _ i)\, d\bm {\eta } \end{equation*}$

The integration points and weights that correspond to $\phi (\bm {\eta };\bmu _ i,\bSigma _ i)$ are

$z_ g = \bSigma _ i^{1/2}x_ g + \bmu _ i$

$v_ g = |\bSigma _ i|^{1/2} w_ g$

The preceding transformations shift and scale the quadrature points so that they are centered on the peak of the integrand. As a result, the integral can be approximated well with far fewer quadrature points.
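A one-dimensional sketch of the adaptive rule follows. It is an illustration under stated assumptions rather than the procedure's algorithm. The function `adaptive_gh` is hypothetical, and the toy Gaussian-shaped "likelihood" is chosen only because its posterior mean and variance are known in closed form; in an IRT model these moments would have to be estimated. The sketch also interprets the factor $|\bSigma _ i|^{1/2}$ in $v_ g$ as being paired with division by the standard normal density at $x_ g$ , which follows from $\phi (z_ g;\bmu _ i,\bSigma _ i) = |\bSigma _ i|^{-1/2}\phi (x_ g)$ .

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def std_normal_pdf(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def adaptive_gh(integrand, mu_i, sigma_i, G):
    """Integrate `integrand` over the real line with G-H points that are
    recentered at mu_i and rescaled by sigma_i (one-dimensional case)."""
    x, w = hermegauss(G)
    w = w / np.sqrt(2.0 * np.pi)  # weights for the standard normal kernel
    z = sigma_i * x + mu_i        # z_g = Sigma_i^{1/2} x_g + mu_i
    v = sigma_i * w               # v_g = |Sigma_i|^{1/2} w_g
    # Dividing by phi(x_g) completes the ratio in the transformed integral,
    # because phi(z_g; mu_i, sigma_i^2) = phi(x_g) / sigma_i.
    return np.sum(v * integrand(z) / std_normal_pdf(x))


# Toy integrand: Gaussian-shaped likelihood (peak at m, width s) times a
# standard normal prior. Both m and s are illustrative values.
m, s = 2.0, 0.1


def integrand(eta):
    return np.exp(-0.5 * ((eta - m) / s) ** 2) * std_normal_pdf(eta)


# Closed-form posterior mean and standard deviation for this toy case.
mu_i = m / (1.0 + s**2)
sigma_i = s / np.sqrt(1.0 + s**2)

true_value = s / np.sqrt(1.0 + s**2) * np.exp(-0.5 * m**2 / (1.0 + s**2))
approx = adaptive_gh(integrand, mu_i, sigma_i, G=3)
```

Because the posterior here is exactly normal, the ratio is constant and three recentered points recover the integral up to rounding error. For real response data the ratio is only approximately polynomial, so the gain is smaller but typically still substantial.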