The HPCOUNTREG Procedure

Negative Binomial Regression

The Poisson regression model can be generalized by introducing an unobserved heterogeneity term for observation i. Thus, the individuals are assumed to differ randomly in a manner that is not fully accounted for by the observed covariates. This is formulated as

\[ E(y_{i}|\mathbf{x}_{i}, \tau _{i}) = \mu _{i} \tau _{i} = e^{\mathbf{x}_{i}'\bbeta + \epsilon _{i}} \]

where the unobserved heterogeneity term $\tau _{i} = e^{\epsilon _{i}}$ is independent of the vector of regressors $\mathbf{x}_{i}$. Then the distribution of $y_{i}$ conditional on $\mathbf{x}_{i}$ and $\tau _{i}$ is Poisson with conditional mean and conditional variance $\mu _{i} \tau _{i}$:

\[ f(y_{i}|\mathbf{x}_{i},\tau _{i}) = \frac{\exp (-\mu _{i}\tau _{i}) (\mu _{i}\tau _{i})^{y_{i}}}{y_{i}!} \]

Let $g(\tau _{i})$ be the probability density function of $\tau _{i}$. Then, the distribution $f(y_{i}|\mathbf{x}_{i})$ (no longer conditional on $\tau _{i}$) is obtained by integrating $f(y_{i}|\mathbf{x}_{i},\tau _{i})$ with respect to $\tau _{i}$:

\[ f(y_{i}|\mathbf{x}_{i}) = \int _{0}^{\infty } f(y_{i}|\mathbf{x}_{i},\tau _{i}) g(\tau _{i}) d\tau _{i} \]

An analytical solution to this integral exists when $\tau _{i}$ is assumed to follow a gamma distribution. This solution is the negative binomial distribution. If the model contains a constant term, then in order to identify the mean of the distribution, it is necessary to assume that $E(e^{\epsilon _{i}}) = E(\tau _{i}) =1$. Thus, it is assumed that $\tau _{i}$ follows a gamma($\theta , \theta $) distribution with $E(\tau _{i}) =1$ and $V(\tau _{i}) = 1/\theta $,

\[ g(\tau _{i}) = \frac{\theta ^{\theta }}{\Gamma (\theta )} \tau _{i}^{\theta -1}\exp (-\theta \tau _{i}) \]

where $\Gamma (x)=\int _{0}^{\infty }z^{x-1}\exp (-z)dz$ is the gamma function and $\theta $ is a positive parameter. Then, the density of $y_{i}$ given $\mathbf{x}_{i}$ is derived as

\begin{eqnarray*} f(y_{i}|\mathbf{x}_{i}) & = & \int _{0}^{\infty }f(y_{i}|\mathbf{x}_{i},\tau _{i}) g(\tau _{i})d\tau _{i} \\ & = & \frac{\theta ^{\theta } \mu _{i}^{y_{i}}}{y_{i}!\Gamma (\theta )} \int _{0}^{\infty } e^{-(\mu _{i}+\theta )\tau _{i}}\tau _{i}^{\theta +y_{i}-1}d\tau _{i} \\ & = & \frac{\theta ^{\theta }\mu _{i}^{y_{i}}\Gamma (y_{i}+\theta )}{y_{i}!\Gamma (\theta )(\theta +\mu _{i})^{\theta +y_{i}}} \\ & = & \frac{\Gamma (y_{i}+\theta )}{y_{i}!\Gamma (\theta )} \left(\frac{\theta }{\theta +\mu _{i}}\right)^{\theta } \left(\frac{\mu _{i}}{\theta +\mu _{i}}\right)^{y_{i}} \end{eqnarray*}

If you make the substitution $\alpha =\frac{1}{\theta }$ ($\alpha > 0$), the negative binomial distribution can then be rewritten as

\[ f(y_{i}|\mathbf{x}_{i}) = \frac{\Gamma (y_{i}+\alpha ^{-1})}{y_{i}!\Gamma (\alpha ^{-1})} \left(\frac{\alpha ^{-1}}{\alpha ^{-1}+\mu _{i}}\right)^{\alpha ^{-1}} \left(\frac{\mu _{i}}{\alpha ^{-1}+\mu _{i}}\right)^{y_{i}}, \quad y_{i} = 0,1,2,\ldots \]
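The gamma-mixture derivation can be checked numerically. The following sketch (in Python with SciPy, not part of the HPCOUNTREG documentation; the parameter values are arbitrary illustrations) integrates the Poisson probability against the gamma($\theta ,\theta $) density and compares the result to the closed-form negative binomial probability:

```python
import numpy as np
from scipy import stats, integrate

# Arbitrary illustrative values
mu, theta, y = 2.0, 1.5, 3

# g(tau): gamma(theta, theta) density, so E(tau) = 1 and V(tau) = 1/theta
g = lambda tau: stats.gamma.pdf(tau, a=theta, scale=1.0 / theta)

# f(y|x) by numerical integration of the Poisson(mu*tau) pmf against g(tau)
mixture, _ = integrate.quad(lambda t: stats.poisson.pmf(y, mu * t) * g(t),
                            0.0, np.inf)

# Closed-form negative binomial: size theta, success probability theta/(theta+mu)
closed = stats.nbinom.pmf(y, theta, theta / (theta + mu))

print(abs(mixture - closed))  # agreement up to quadrature error
```

The identification `size = theta`, `p = theta/(theta+mu)` maps the density above onto SciPy's `nbinom` parameterization.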

Thus, the negative binomial distribution is derived as a gamma mixture of Poisson random variables. It has the conditional mean

\[ E(y_{i}|\mathbf{x}_{i})=\mu _{i} = e^{\mathbf{x}_{i}'\bbeta } \]

and the conditional variance

\[ V(y_{i}|\mathbf{x}_{i}) = \mu _{i} [1+\frac{1}{\theta }\mu _{i}] = \mu _{i}[1+\alpha \mu _{i}] > E(y_{i}|\mathbf{x}_{i}) \]

The conditional variance of the negative binomial distribution exceeds the conditional mean; this overdispersion results from neglected unobserved heterogeneity. The negative binomial model with variance function $V(y_{i}|\mathbf{x}_{i}) = \mu _{i}+\alpha \mu _{i}^2$, which is quadratic in the mean, is referred to as the NEGBIN2 model (Cameron and Trivedi 1986). To estimate this model, specify DIST=NEGBIN(P=2) in the MODEL statement. The Poisson distribution is a special case of the negative binomial distribution where $\alpha =0$. A test of the Poisson distribution can be carried out by testing the hypothesis that $\alpha =\frac{1}{\theta }=0$. A Wald test of this hypothesis is provided (it is the reported t statistic for the estimated $\alpha $ in the negative binomial model).
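These conditional moments can be verified numerically. The sketch below (Python with SciPy; the values of $\mu $ and $\alpha $ are arbitrary illustrations, not from the documentation) compares the mean and variance of the corresponding `nbinom` distribution to $\mu $ and $\mu (1+\alpha \mu )$:

```python
import numpy as np
from scipy import stats

# Arbitrary illustrative values
mu, alpha = 3.0, 0.4
theta = 1.0 / alpha

# NEGBIN2 as scipy nbinom: size theta, success probability theta/(theta+mu)
mean, var = stats.nbinom.stats(theta, theta / (theta + mu), moments="mv")

print(float(mean))  # equals mu
print(float(var))   # equals mu * (1 + alpha * mu), which exceeds the mean
```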

The log-likelihood function of the negative binomial regression model (NEGBIN2) is given by

\begin{eqnarray*} \mathcal{L} & = & \sum _{i=1}^{N} \Bigg\{ \sum _{j=0}^{y_{i}-1}\ln (j+\alpha ^{-1}) -\ln (y_{i}!) \\ & & - (y_{i}+\alpha ^{-1}) \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) +y_{i}\ln (\alpha ) + y_{i}\mathbf{x}_{i}^{\prime }\bbeta \Bigg\} \end{eqnarray*}
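This log-likelihood can be implemented term by term and cross-checked against the closed-form negative binomial log pmf. The following Python/SciPy sketch is illustrative only: the simulated data, parameter values, and function name are assumptions, not part of the documentation.

```python
import numpy as np
from scipy import special, stats

rng = np.random.default_rng(42)

# Illustrative design matrix, coefficients, and dispersion parameter
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, -0.3])
alpha = 0.8
mu = np.exp(X @ beta)

# Draw y as a gamma mixture of Poissons: tau ~ gamma(1/alpha, scale=alpha)
tau = rng.gamma(1.0 / alpha, alpha, size=n)
y = rng.poisson(mu * tau)

def negbin2_loglik(beta, alpha, X, y):
    """NEGBIN2 log likelihood, written term by term as in the formula."""
    mu = np.exp(X @ beta)
    inv_a = 1.0 / alpha
    ll = 0.0
    for y_i, mu_i in zip(y, mu):
        ll += (np.log(np.arange(y_i) + inv_a).sum()   # sum_{j<y_i} ln(j + 1/alpha)
               - special.gammaln(y_i + 1)             # -ln(y_i!)
               - (y_i + inv_a) * np.log1p(alpha * mu_i)
               + y_i * np.log(alpha)
               + y_i * np.log(mu_i))                  # y_i * x_i' beta
    return ll

# Cross-check against scipy's negative binomial log pmf
theta = 1.0 / alpha
check = stats.nbinom.logpmf(y, theta, theta / (theta + mu)).sum()
print(negbin2_loglik(beta, alpha, X, y) - check)  # numerically zero
```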

where the following fact about the gamma function is used when $y$ is an integer:

\[ \Gamma (y+a)/\Gamma (a)= \prod _{j=0}^{y-1} (j+a) \]
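This identity is straightforward to verify numerically. A short Python/SciPy sketch (the values of $y$ and $a$ are arbitrary; `gammaln` is the log gamma function, used to keep the ratio numerically stable):

```python
import numpy as np
from scipy import special

y, a = 7, 2.3  # arbitrary integer y and positive a

# Gamma(y + a) / Gamma(a), computed on the log scale for stability
lhs = np.exp(special.gammaln(y + a) - special.gammaln(a))

# prod_{j=0}^{y-1} (j + a)
rhs = np.prod(np.arange(y) + a)

print(lhs, rhs)  # equal up to rounding
```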

Cameron and Trivedi (1986) consider a general class of negative binomial models that have mean $\mu _{i}$ and variance function $\mu _{i}+\alpha \mu _{i}^ p$. The NEGBIN2 model, with $p =2$, is the standard formulation of the negative binomial model. Models that have other values of p, $- \infty < p < \infty $, have the same density $f(y_{i}|\mathbf{x}_{i})$, except that $\alpha ^{-1}$ is replaced everywhere by $\alpha ^{-1} \mu ^{2-p}$. The negative binomial model NEGBIN1, which sets $p=1$, has the variance function $V(y_{i}|\mathbf{x}_{i}) = \mu _{i} + \alpha \mu _{i}$, which is linear in the mean. To estimate this model, specify DIST=NEGBIN(P=1) in the MODEL statement.

The log-likelihood function of the NEGBIN1 regression model is given by

\begin{eqnarray*} \mathcal{L} & = & \sum _{i=1}^{N} \Bigg\{ \sum _{j=0}^{y_{i}-1} \ln \left(j+\alpha ^{-1}\exp (\mathbf{x}_{i}^{\prime }\bbeta ) \right) \\ & & -\ln (y_{i}!) -\left(y_{i}+\alpha ^{-1}\exp (\mathbf{x}_{i}^{\prime }\bbeta )\right)\ln (1+\alpha ) + y_{i}\ln (\alpha ) \Bigg\} \end{eqnarray*}
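As with NEGBIN2, this likelihood can be cross-checked numerically: with $p=1$, the observation-specific shape parameter is $\alpha ^{-1}\mu _{i}$ and the success probability is $1/(1+\alpha )$. The Python/SciPy sketch below is illustrative only (simulated data, parameter values, and function name are assumptions):

```python
import numpy as np
from scipy import special, stats

rng = np.random.default_rng(7)

# Illustrative design matrix, coefficients, and dispersion parameter
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.2, 0.4])
alpha = 0.6
mu = np.exp(X @ beta)

# Draw from NEGBIN1: size mu_i/alpha, success probability 1/(1+alpha)
y = rng.negative_binomial(mu / alpha, 1.0 / (1.0 + alpha))

def negbin1_loglik(beta, alpha, X, y):
    """NEGBIN1 log likelihood, written term by term as in the formula."""
    mu = np.exp(X @ beta)
    ll = 0.0
    for y_i, mu_i in zip(y, mu):
        th_i = mu_i / alpha                           # alpha^{-1} exp(x_i' beta)
        ll += (np.log(np.arange(y_i) + th_i).sum()    # sum_{j<y_i} ln(j + th_i)
               - special.gammaln(y_i + 1)             # -ln(y_i!)
               - (y_i + th_i) * np.log1p(alpha)
               + y_i * np.log(alpha))
    return ll

# Cross-check against scipy's negative binomial log pmf
check = stats.nbinom.logpmf(y, mu / alpha, 1.0 / (1.0 + alpha)).sum()
print(negbin1_loglik(beta, alpha, X, y) - check)  # numerically zero
```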
