Newton-Raphson Algorithm


Let

\begin{eqnarray*}
\mb{g} & = & \sum_j w_j f_j \frac{\partial l_j}{\partial \btheta} \\
\mb{H} & = & \sum_j -w_j f_j \frac{\partial^2 l_j}{\partial \btheta^2}
\end{eqnarray*}

be the gradient vector and the Hessian matrix, where $l_j=\log L_j$ is the log likelihood for the $j$th observation and $w_j$ and $f_j$ are its weight and frequency. With a starting value of $\btheta^{(0)}$, the pseudo-estimate $\hat{\btheta}$ of $\btheta$ is computed iteratively until convergence:

\[  \btheta ^{(i+1)}=\btheta ^{(i)} +\mb {H}^{-1}\mb {g}  \]

where $\mb{H}$ and $\mb{g}$ are evaluated at $\btheta^{(i)}$, the estimate from the $i$th iteration. If the log likelihood evaluated at $\btheta^{(i+1)}$ is less than that evaluated at $\btheta^{(i)}$, then $\btheta^{(i+1)}$ is recomputed by step-halving or ridging. The iterative scheme continues until convergence, that is, until $\btheta^{(i+1)}$ is sufficiently close to $\btheta^{(i)}$. The maximum likelihood estimate of $\btheta$ is then $\hat{\btheta}=\btheta^{(i+1)}$.
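The update and step-halving scheme above can be sketched in a short routine. This is a minimal illustration, not the documented implementation: the function names (`newton_raphson`, `loglik`, `grad`, `hess_neg`) and the weighted Poisson example below are hypothetical, chosen so that $\mb{H}$ is the negated sum of second derivatives as defined above.

```python
import numpy as np

def newton_raphson(loglik, grad, hess_neg, theta0, tol=1e-8, max_iter=50):
    """Maximize loglik by Newton-Raphson with step-halving.

    grad returns g = sum_j w_j f_j dl_j/dtheta;
    hess_neg returns H = -sum_j w_j f_j d^2 l_j/dtheta^2.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = grad(theta)
        H = hess_neg(theta)
        step = np.linalg.solve(H, g)          # H^{-1} g without forming the inverse
        new = theta + step
        # Step-halving: shrink the step while the log likelihood decreases.
        halvings = 0
        while loglik(new) < loglik(theta) and halvings < 20:
            step /= 2.0
            new = theta + step
            halvings += 1
        if np.max(np.abs(new - theta)) < tol:  # theta^{(i+1)} close to theta^{(i)}
            return new
        theta = new
    return theta

# Hypothetical usage: weighted Poisson log likelihood l_j = y_j log(lam) - lam,
# so the MLE is the weighted mean of y.
y = np.array([2.0, 3.0, 5.0])
w = np.array([1.0, 2.0, 1.0])   # weights
f = np.ones(3)                  # frequencies
loglik = lambda th: np.sum(w * f * (y * np.log(th[0]) - th[0]))
grad = lambda th: np.array([np.sum(w * f * (y / th[0] - 1.0))])
hess_neg = lambda th: np.array([[np.sum(w * f * y) / th[0] ** 2]])
est = newton_raphson(loglik, grad, hess_neg, [1.0])
print(est[0])  # weighted mean 13/4 = 3.25
```

Solving $\mb{H}\,\mb{s}=\mb{g}$ with `np.linalg.solve` rather than inverting $\mb{H}$ is the standard numerically stable way to form the update $\mb{H}^{-1}\mb{g}$.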