The COUNTREG Procedure

Zero-Inflated Negative Binomial Regression

The zero-inflated negative binomial (ZINB) model in PROC COUNTREG is based on the negative binomial model with quadratic variance function (p=2). The ZINB model is obtained by specifying a negative binomial distribution for the data generation process referred to earlier as Process 2:

\[  g(y_{i}) = \frac{\Gamma (y_{i}+\alpha ^{-1})}{y_{i}! \Gamma (\alpha ^{-1})}\left(\frac{\alpha ^{-1}}{\alpha ^{-1}+\mu _{i}} \right)^{\alpha ^{-1}}\left(\frac{\mu _{i}}{\alpha ^{-1}+\mu _{i}} \right)^{y_{i}}  \]

Thus the ZINB model is defined to be

\[
\begin{aligned}
P(y_{i}=0|\mathbf{x}_{i},\mathbf{z}_{i}) &= F_{i} + \left(1 - F_{i}\right)(1+\alpha \mu _{i})^{-\alpha ^{-1}} \\
P(y_{i}|\mathbf{x}_{i},\mathbf{z}_{i}) &= \left(1- F_{i} \right) \frac{\Gamma (y_{i}+\alpha ^{-1})}{y_{i}! \Gamma (\alpha ^{-1})}\left(\frac{\alpha ^{-1}}{\alpha ^{-1}+\mu _{i}} \right)^{\alpha ^{-1}} \left(\frac{\mu _{i}}{\alpha ^{-1}+\mu _{i}} \right)^{y_{i}} , \quad y_{i}>0
\end{aligned}
\]
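The definition above can be checked numerically. The following is a minimal Python sketch, not part of PROC COUNTREG: the function names `nb_pmf` and `zinb_pmf` and the parameter values are illustrative assumptions. It evaluates the negative binomial probability $g(y_{i})$ on the log scale and confirms that the ZINB probabilities sum to approximately 1.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial (p=2) probability g(y) with mean mu and dispersion alpha."""
    inv = 1.0 / alpha
    # Work on the log scale so that large counts do not overflow the gamma function.
    log_g = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_g)

def zinb_pmf(y, mu, alpha, F):
    """ZINB probability: extra mass F at zero, (1 - F) * g(y) for positive counts."""
    if y == 0:
        return F + (1.0 - F) * (1.0 + alpha * mu) ** (-1.0 / alpha)
    return (1.0 - F) * nb_pmf(y, mu, alpha)

# Illustrative values only (hypothetical, not taken from any data set).
mu, alpha, F = 2.5, 0.8, 0.3

# The tail decays geometrically, so truncating the support at 2000 is ample here.
total = sum(zinb_pmf(y, mu, alpha, F) for y in range(2000))
print(total)
```

Note that $g(0)$ reduces to $(1+\alpha \mu _{i})^{-\alpha ^{-1}}$, which is why that factor appears in the expression for $P(y_{i}=0|\mathbf{x}_{i},\mathbf{z}_{i})$.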

In this case, the conditional expectation and conditional variance of $y_{i}$ are

\[  E(y_{i}|\mathbf{x}_{i},\mathbf{z}_{i}) = \mu _{i}(1 -F_{i})  \]
\[  V(y_{i}|\mathbf{x}_{i},\mathbf{z}_{i}) = E(y_{i}|\mathbf{x}_{i},\mathbf{z}_{i})\left[1+\mu _{i} (F_{i}+\alpha ) \right]  \]

As with the ZIP model, the ZINB model exhibits overdispersion: because $1+\mu _{i}(F_{i}+\alpha ) > 1$ whenever $\mu _{i}>0$, the conditional variance exceeds the conditional mean.
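The moment formulas can be compared with moments computed directly from the probability function. This is a sketch under assumed, purely illustrative parameter values; `zinb_pmf` is a hypothetical helper, not a COUNTREG function.

```python
import math

def zinb_pmf(y, mu, alpha, F):
    """ZINB probability with NB mean mu, dispersion alpha, zero-inflation probability F."""
    inv = 1.0 / alpha
    log_g = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    p = (1.0 - F) * math.exp(log_g)
    return p + F if y == 0 else p

# Illustrative values only (hypothetical).
mu, alpha, F = 2.5, 0.8, 0.3

# Truncated support; the neglected tail mass is negligible for these values.
support = range(3000)
probs = [zinb_pmf(y, mu, alpha, F) for y in support]

# Moments computed by direct summation over the support.
mean_num = sum(y * p for y, p in zip(support, probs))
var_num = sum((y - mean_num) ** 2 * p for y, p in zip(support, probs))

# Moments from the closed-form expressions.
mean_formula = mu * (1.0 - F)
var_formula = mean_formula * (1.0 + mu * (F + alpha))

print(mean_num, mean_formula)
print(var_num, var_formula)
```

The ratio `var_formula / mean_formula` equals $1+\mu _{i}(F_{i}+\alpha )$, which makes the size of the overdispersion explicit.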

ZINB Model with Logistic Link Function

In this model, the zero-inflation probability $\varphi _{i}$ (written $F_{i}$ in the model definition above) is given by the logistic function:

\[  \varphi _{i}=\frac{\exp (\mathbf{z}_{i}^{\prime }\bgamma )}{1+\exp (\mathbf{z}_{i}^{\prime }\bgamma )}  \]

The log-likelihood function is

\[
\begin{aligned}
\mathcal{L} &= \sum _{\{ i: y_{i}=0\} } w_{i}\ln \left[\exp (\mathbf{z}_{i}^{\prime }\bgamma )+(1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}} \right] \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\sum _{j=0}^{y_{i}-1}\ln (j+\alpha ^{-1}) \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\left\{ -\ln (y_{i}!) - (y_{i}+\alpha ^{-1}) \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) +y_{i}\ln (\alpha ) + y_{i}\mathbf{x}_{i}^{\prime }\bbeta \right\} \\
&\quad - \sum _{i=1}^{N}w_{i}\ln \left[ 1 + \exp (\mathbf{z}_{i}^{\prime }\bgamma ) \right]
\end{aligned}
\]

See Poisson Regression for the definition of $w_ i$.
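As a consistency check, the closed-form log likelihood for the logistic link can be compared with the weighted sum of $\ln P(y_{i}|\mathbf{x}_{i},\mathbf{z}_{i})$ taken directly from the model definition. The sketch below is illustrative only: the function names, the data, and the weights are made up, and the linear predictors $\mathbf{x}_{i}^{\prime }\bbeta$ and $\mathbf{z}_{i}^{\prime }\bgamma$ are supplied as precomputed numbers (`xb`, `zg`).

```python
import math

def loglik_logistic(y, xb, zg, w, alpha):
    """Closed-form ZINB log likelihood with logistic link, accumulated term by term."""
    inv = 1.0 / alpha
    total = 0.0
    for yi, xbi, zgi, wi in zip(y, xb, zg, w):
        mu = math.exp(xbi)
        if yi == 0:
            total += wi * math.log(math.exp(zgi) + (1.0 + alpha * mu) ** (-inv))
        else:
            # Sum of ln(j + 1/alpha) for j = 0, ..., y_i - 1.
            total += wi * sum(math.log(j + inv) for j in range(yi))
            total += wi * (-math.lgamma(yi + 1)  # lgamma(y + 1) = ln(y!)
                           - (yi + inv) * math.log(1.0 + alpha * mu)
                           + yi * math.log(alpha) + yi * xbi)
        # Final term of the log likelihood, summed over every observation.
        total -= wi * math.log(1.0 + math.exp(zgi))
    return total

def loglik_direct(y, xb, zg, w, alpha):
    """Reference value: sum of w_i * ln P(y_i) computed from the ZINB definition."""
    inv = 1.0 / alpha
    total = 0.0
    for yi, xbi, zgi, wi in zip(y, xb, zg, w):
        mu = math.exp(xbi)
        phi = math.exp(zgi) / (1.0 + math.exp(zgi))
        log_g = (math.lgamma(yi + inv) - math.lgamma(yi + 1) - math.lgamma(inv)
                 + inv * math.log(inv / (inv + mu))
                 + yi * math.log(mu / (inv + mu)))
        p = (1.0 - phi) * math.exp(log_g) + (phi if yi == 0 else 0.0)
        total += wi * math.log(p)
    return total

# Made-up observations, linear predictors, and weights (hypothetical).
y = [0, 3, 0, 1, 7]
xb = [0.2, 1.1, -0.4, 0.0, 1.5]
zg = [-0.5, 0.3, 1.2, -1.0, 0.0]
w = [1.0, 1.0, 2.0, 1.0, 0.5]
alpha = 0.8

closed = loglik_logistic(y, xb, zg, w, alpha)
direct = loglik_direct(y, xb, zg, w, alpha)
print(closed, direct)
```

The two values agree up to rounding because $\ln \varphi _{i}$ and $\ln (1-\varphi _{i})$ share the term $-\ln [1+\exp (\mathbf{z}_{i}^{\prime }\bgamma )]$, which the closed form collects into a single sum over all observations.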

The gradient for this model is given by

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \bgamma } &= \sum _{\{ i: y_{i}=0\} } w_{i}\left[\frac{\exp (\mathbf{z}_{i}^{\prime }\bgamma )}{\exp (\mathbf{z}_{i}^{\prime }\bgamma ) + (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}}}\right] \mathbf{z}_{i} \\
&\quad - \sum _{i=1}^{N} w_{i}\left[\frac{\exp (\mathbf{z}_{i}^{\prime }\bgamma )}{1 + \exp (\mathbf{z}_{i}^{\prime }\bgamma )} \right] \mathbf{z}_{i}
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \bbeta } &= \sum _{\{ i: y_{i}=0\} } w_{i}\left[\frac{-\exp (\mathbf{x}_{i}^{\prime }\bbeta ) (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}-1}}{\exp (\mathbf{z}_{i}^{\prime }\bgamma ) + (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}}}\right] \mathbf{x}_{i} \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\left[ \frac{y_{i} - \exp (\mathbf{x}_{i}^{\prime }\bbeta )}{1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )} \right] \mathbf{x}_{i}
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \alpha } &= \sum _{\{ i: y_{i}=0\} } w_{i}\frac{ \alpha ^{-2} \left[(1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) - \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )\right]}{\exp (\mathbf{z}_{i}^{\prime }\bgamma ) (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{(1+\alpha )/\alpha } + (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))} \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\left\{ - \alpha ^{-2} \sum _{j=0}^{y_{i}-1} \frac{1}{(j + \alpha ^{-1})} + \alpha ^{-2} \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) + \frac{y_{i}-\exp (\mathbf{x}_{i}^{\prime }\bbeta )}{\alpha (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))}\right\}
\end{aligned}
\]
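One way to gain confidence in the analytic gradient for the logistic link is to compare it with central finite differences of the log likelihood. The following Python sketch does so on made-up data. All names (`loglik`, `gradient`, `numeric_gradient`) and all values are illustrative assumptions; this is not how PROC COUNTREG computes derivatives internally.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loglik(y, X, Z, w, beta, gamma, alpha):
    """ZINB log likelihood, logistic link."""
    inv = 1.0 / alpha
    total = 0.0
    for yi, xi, zi, wi in zip(y, X, Z, w):
        xb, zg = dot(xi, beta), dot(zi, gamma)
        t = 1.0 + alpha * math.exp(xb)
        if yi == 0:
            total += wi * math.log(math.exp(zg) + t ** (-inv))
        else:
            # lgamma(y + 1/alpha) - lgamma(1/alpha) equals the sum of ln(j + 1/alpha).
            total += wi * (math.lgamma(yi + inv) - math.lgamma(inv)
                           - math.lgamma(yi + 1) - (yi + inv) * math.log(t)
                           + yi * math.log(alpha) + yi * xb)
        total -= wi * math.log(1.0 + math.exp(zg))
    return total

def gradient(y, X, Z, w, beta, gamma, alpha):
    """Analytic gradient, returned as (d/dgamma, d/dbeta, d/dalpha)."""
    inv = 1.0 / alpha
    g_gamma, g_beta, g_alpha = [0.0] * len(gamma), [0.0] * len(beta), 0.0
    for yi, xi, zi, wi in zip(y, X, Z, w):
        ez = math.exp(dot(zi, gamma))
        mu = math.exp(dot(xi, beta))
        t = 1.0 + alpha * mu
        cz = -ez / (1.0 + ez)  # contribution from the sum over all observations
        if yi == 0:
            denom = ez + t ** (-inv)
            cz += ez / denom
            cx = -mu * t ** (-inv - 1.0) / denom
            g_alpha += wi * (inv ** 2 * (t * math.log(t) - alpha * mu)
                             / (ez * t ** ((1.0 + alpha) / alpha) + t))
        else:
            cx = (yi - mu) / t
            g_alpha += wi * (-inv ** 2 * sum(1.0 / (j + inv) for j in range(yi))
                             + inv ** 2 * math.log(t) + (yi - mu) / (alpha * t))
        for k, zk in enumerate(zi):
            g_gamma[k] += wi * cz * zk
        for k, xk in enumerate(xi):
            g_beta[k] += wi * cx * xk
    return g_gamma, g_beta, g_alpha

def numeric_gradient(f, theta, h=1e-6):
    """Central finite differences of f at theta."""
    grad = []
    for k in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[k] += h
        dn[k] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

# Made-up data: an intercept plus one regressor in each part of the model.
y = [0, 3, 0, 1, 7, 0, 2]
X = [[1.0, 0.2], [1.0, 1.1], [1.0, -0.4], [1.0, 0.0], [1.0, 1.5], [1.0, 0.7], [1.0, -1.0]]
Z = [[1.0, -0.5], [1.0, 0.3], [1.0, 1.2], [1.0, -1.0], [1.0, 0.0], [1.0, 0.8], [1.0, 0.1]]
w = [1.0, 1.0, 2.0, 1.0, 0.5, 1.0, 1.0]
beta, gamma, alpha = [0.3, 0.5], [-0.2, 0.4], 0.8

# Parameter vector ordered as (gamma, beta, alpha).
objective = lambda th: loglik(y, X, Z, w, th[2:4], th[0:2], th[4])
g_gamma, g_beta, g_alpha = gradient(y, X, Z, w, beta, gamma, alpha)
analytic = g_gamma + g_beta + [g_alpha]
numeric = numeric_gradient(objective, gamma + beta + [alpha])
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_err)
```

A small `max_err` indicates that the three gradient expressions agree with the log likelihood they are derived from, at least at the chosen parameter point.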

ZINB Model with Standard Normal Link Function

For this model, the probability $\varphi _{i}$ is specified with the standard normal distribution function (probit function): $\varphi _{i}= \Phi (\mathbf{z}_{i}^{\prime }\bgamma )$. The log-likelihood function is

\[
\begin{aligned}
\mathcal{L} &= \sum _{\{ i: y_{i}=0\} } w_{i}\ln \left\{ \Phi (\mathbf{z}_{i}^{\prime }\bgamma ) + \left[ 1 - \Phi (\mathbf{z}_{i}^{\prime }\bgamma ) \right] (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}} \right\} \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\ln \left[ 1 - \Phi (\mathbf{z}_{i}^{\prime }\bgamma ) \right] \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\sum _{j=0}^{y_{i}-1} \left\{ \ln (j+\alpha ^{-1})\right\} \\
&\quad - \sum _{\{ i: y_{i}>0\} } w_{i}\ln (y_{i}!) \\
&\quad - \sum _{\{ i: y_{i}>0\} } w_{i}(y_{i}+\alpha ^{-1}) \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}y_{i}\ln (\alpha ) \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}y_{i} \mathbf{x}_{i}^{\prime }\bbeta
\end{aligned}
\]

See Poisson Regression for the definition of $w_ i$.

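The gradient for this model is given by the following expressions, in which $\varphi (\cdot )$ evaluated at $\mathbf{z}_{i}^{\prime }\bgamma $ denotes the standard normal density function (the derivative of $\Phi $), not the probability $\varphi _{i}$: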

\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \bgamma } &= \sum _{\{ i: y_{i}=0\} } w_{i}\left[\frac{\varphi (\mathbf{z}_{i}^{\prime }\bgamma ) \left[1-(1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}} \right]}{ \Phi (\mathbf{z}_{i}^{\prime }\bgamma ) + \left[1- \Phi (\mathbf{z}_{i}^{\prime }\bgamma )\right] (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}}} \right] \mathbf{z}_{i} \\
&\quad - \sum _{\{ i: y_{i}>0\} } w_{i}\left[\frac{\varphi (\mathbf{z}_{i}^{\prime }\bgamma )}{1 - \Phi (\mathbf{z}_{i}^{\prime }\bgamma )} \right] \mathbf{z}_{i}
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \bbeta } &= \sum _{\{ i: y_{i}=0\} } w_{i}\frac{-\left[1-\Phi (\mathbf{z}_{i}^{\prime }\bgamma )\right] \exp (\mathbf{x}_{i}^{\prime }\bbeta ) (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-(1+\alpha )/\alpha }}{\Phi (\mathbf{z}_{i}^{\prime }\bgamma ) + \left[ 1 - \Phi (\mathbf{z}_{i}^{\prime }\bgamma ) \right] (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{-\alpha ^{-1}} } \mathbf{x}_{i} \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\left[ \frac{y_{i} - \exp (\mathbf{x}_{i}^{\prime }\bbeta )}{1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )} \right] \mathbf{x}_{i}
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \alpha } &= \sum _{\{ i: y_{i}=0\} } w_{i}\frac{\left[ 1-\Phi (\mathbf{z}_{i}^{\prime }\bgamma ) \right]\alpha ^{-2} \left[(1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) \ln (1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))-\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )\right]}{\Phi (\mathbf{z}_{i}^{\prime }\bgamma ) (1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))^{(1+\alpha )/\alpha } + \left[1 -\Phi (\mathbf{z}_{i}^{\prime }\bgamma ) \right] (1 + \alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))} \\
&\quad + \sum _{\{ i: y_{i}>0\} } w_{i}\left\{ - \alpha ^{-2} \sum _{j=0}^{y_{i}-1} \frac{1}{(j + \alpha ^{-1})} + \alpha ^{-2} \ln (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta )) + \frac{y_{i}-\exp (\mathbf{x}_{i}^{\prime }\bbeta )}{\alpha (1+\alpha \exp (\mathbf{x}_{i}^{\prime }\bbeta ))}\right\}
\end{aligned}
\]
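The probit-link gradient can be checked with the same finite-difference approach. In this sketch the standard normal distribution function is computed from `math.erf`, and the density is written out explicitly. As before, the function names and the data are hypothetical, and the code only illustrates the formulas; it makes no claim about the procedure's internal implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def norm_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def loglik(y, X, Z, w, beta, gamma, alpha):
    """ZINB log likelihood, standard normal (probit) link."""
    inv = 1.0 / alpha
    total = 0.0
    for yi, xi, zi, wi in zip(y, X, Z, w):
        xb, P = dot(xi, beta), norm_cdf(dot(zi, gamma))
        t = 1.0 + alpha * math.exp(xb)
        if yi == 0:
            total += wi * math.log(P + (1.0 - P) * t ** (-inv))
        else:
            # lgamma(y + 1/alpha) - lgamma(1/alpha) equals the sum of ln(j + 1/alpha).
            total += wi * (math.log(1.0 - P)
                           + math.lgamma(yi + inv) - math.lgamma(inv)
                           - math.lgamma(yi + 1) - (yi + inv) * math.log(t)
                           + yi * math.log(alpha) + yi * xb)
    return total

def gradient(y, X, Z, w, beta, gamma, alpha):
    """Analytic gradient, returned as (d/dgamma, d/dbeta, d/dalpha)."""
    inv = 1.0 / alpha
    g_gamma, g_beta, g_alpha = [0.0] * len(gamma), [0.0] * len(beta), 0.0
    for yi, xi, zi, wi in zip(y, X, Z, w):
        zg = dot(zi, gamma)
        P, d = norm_cdf(zg), norm_pdf(zg)
        mu = math.exp(dot(xi, beta))
        t = 1.0 + alpha * mu
        if yi == 0:
            denom = P + (1.0 - P) * t ** (-inv)
            cz = d * (1.0 - t ** (-inv)) / denom
            cx = -(1.0 - P) * mu * t ** (-(1.0 + alpha) / alpha) / denom
            g_alpha += wi * ((1.0 - P) * inv ** 2 * (t * math.log(t) - alpha * mu)
                             / (P * t ** ((1.0 + alpha) / alpha) + (1.0 - P) * t))
        else:
            cz = -d / (1.0 - P)
            cx = (yi - mu) / t
            g_alpha += wi * (-inv ** 2 * sum(1.0 / (j + inv) for j in range(yi))
                             + inv ** 2 * math.log(t) + (yi - mu) / (alpha * t))
        for k, zk in enumerate(zi):
            g_gamma[k] += wi * cz * zk
        for k, xk in enumerate(xi):
            g_beta[k] += wi * cx * xk
    return g_gamma, g_beta, g_alpha

def numeric_gradient(f, theta, h=1e-6):
    """Central finite differences of f at theta."""
    grad = []
    for k in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[k] += h
        dn[k] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

# Made-up data: an intercept plus one regressor in each part of the model.
y = [0, 3, 0, 1, 7, 0, 2]
X = [[1.0, 0.2], [1.0, 1.1], [1.0, -0.4], [1.0, 0.0], [1.0, 1.5], [1.0, 0.7], [1.0, -1.0]]
Z = [[1.0, -0.5], [1.0, 0.3], [1.0, 1.2], [1.0, -1.0], [1.0, 0.0], [1.0, 0.8], [1.0, 0.1]]
w = [1.0, 1.0, 2.0, 1.0, 0.5, 1.0, 1.0]
beta, gamma, alpha = [0.3, 0.5], [-0.2, 0.4], 0.8

# Parameter vector ordered as (gamma, beta, alpha).
objective = lambda th: loglik(y, X, Z, w, th[2:4], th[0:2], th[4])
g_gamma, g_beta, g_alpha = gradient(y, X, Z, w, beta, gamma, alpha)
analytic = g_gamma + g_beta + [g_alpha]
numeric = numeric_gradient(objective, gamma + beta + [alpha])
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_err)
```

For linear predictors far in the upper tail, `1.0 - norm_cdf(zg)` loses precision; a production implementation would likely use a complementary error function instead, which this sketch omits for brevity.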