

The most widely used model for count data analysis is Poisson regression. This model assumes that \(y_{i}\), given the vector of covariates \(\mathbf{x}_{i}\), is independently Poisson-distributed with
            
![\[ P(Y_{i}=y_{i}|\mathbf{x}_{i}) = \frac{e^{-\mu _{i}}\mu _{i}^{y_{i}}}{y_{i}!}, \quad y_ i = 0,1,2,\ldots \]](images/etsug_countreg0079.png)
and the mean parameter (that is, the mean number of events per period) is given by
![\[ \mu _{i} = \exp (\mathbf{x}_{i}^{\prime } \bbeta ) \]](images/etsug_countreg0080.png)
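where \(\bbeta\) is a \((k+1) \times 1\) parameter vector. (The intercept is \(\beta_{0}\); the coefficients for the k regressors are \(\beta_{1}, \ldots, \beta_{k}\).) Taking the exponential of \(\mathbf{x}_{i}^{\prime}\bbeta\) ensures that the mean parameter \(\mu_{i}\) is nonnegative. It can be shown that the conditional mean is given by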
            
![\[ E(y_{i}|\mathbf{x}_{i}) = \mu _{i} = \exp (\mathbf{x}_{i}^{\prime } \bbeta ) \]](images/etsug_countreg0086.png)
The name log-linear model is also used for the Poisson regression model because the logarithm of the conditional mean is linear in the parameters:
![\[ \ln [E(y_{i}|\mathbf{x}_{i})] = \ln (\mu _{i}) = \mathbf{x}_{i}^{\prime } \bbeta \]](images/etsug_countreg0087.png)
Note that the conditional variance of the count random variable is equal to the conditional mean in the Poisson regression model:
![\[ V(y_{i}|\mathbf{x}_{i}) = E(y_{i}|\mathbf{x}_{i}) = \mu _{i} \]](images/etsug_countreg0088.png)
The equality of the conditional mean and variance of \(y_{i}\) is known as equidispersion.
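
As a quick numerical check of the probability function, the conditional mean, and equidispersion, the following sketch evaluates the Poisson probabilities for one observation and confirms that the mean and variance both equal \(\mu_{i}\). The covariate values, the coefficients, and the function names `poisson_mean` and `poisson_pmf` are hypothetical and chosen only for illustration; the infinite sums are truncated at 200 terms, which is an assumption that holds well for moderate means.

```python
import math

def poisson_mean(x, beta):
    """Conditional mean mu = exp(x' beta); x carries a leading 1 for the intercept."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def poisson_pmf(y, mu):
    """P(Y = y | x) = exp(-mu) * mu**y / y!, evaluated on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

# Hypothetical covariates (intercept term first) and coefficients.
x = [1.0, 0.5, 2.0]
beta = [0.2, -0.4, 0.3]
mu = poisson_mean(x, beta)

# Truncated sums over y = 0..199 stand in for the infinite sums.
support = range(200)
total = sum(poisson_pmf(y, mu) for y in support)
mean = sum(y * poisson_pmf(y, mu) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, mu) for y in support)

print(f"mu={mu:.6f} total={total:.6f} mean={mean:.6f} var={var:.6f}")
```

The probabilities sum to 1, and the computed mean and variance agree with `mu` up to floating-point error.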
            
The marginal effect of a regressor is given by
![\[ \frac{\partial E(y_{i}|\mathbf{x}_{i})}{\partial x_{ji}} = \exp (\mathbf{x}_{i}^{\prime } \bbeta ) \bbeta _{j} = E(y_{i}|\mathbf{x}_{i}) \beta _{j} \]](images/etsug_countreg0089.png)
Thus, a one-unit change in the jth regressor leads to a proportional change in the conditional mean \(E(y_{i}|\mathbf{x}_{i})\) of \(\beta_{j}\).
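
The marginal effect formula can be checked against a finite-difference derivative. The sketch below uses hypothetical covariates and coefficients, and the helper names `cond_mean` and `marginal_effect` are illustrative. It also computes the exact ratio of conditional means for a full one-unit increase, which is \(\exp(\beta_{j})\); the proportional change \(\exp(\beta_{j}) - 1\) is close to \(\beta_{j}\) when \(\beta_{j}\) is small.

```python
import math

def cond_mean(x, beta):
    """E(y | x) = exp(x' beta)."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def marginal_effect(x, beta, j):
    """Analytic derivative dE(y|x)/dx_j = exp(x' beta) * beta_j."""
    return cond_mean(x, beta) * beta[j]

# Hypothetical values; index 0 is the intercept term.
x = [1.0, 0.5, 2.0]
beta = [0.2, -0.4, 0.3]
j = 2

# Central finite difference in the jth regressor.
h = 1e-6
x_up = list(x); x_up[j] += h
x_dn = list(x); x_dn[j] -= h
numeric = (cond_mean(x_up, beta) - cond_mean(x_dn, beta)) / (2 * h)
analytic = marginal_effect(x, beta, j)

# Exact effect of a full one-unit increase: the mean is multiplied by exp(beta_j).
x_plus = list(x); x_plus[j] += 1.0
ratio = cond_mean(x_plus, beta) / cond_mean(x, beta)

print(numeric, analytic, ratio, math.exp(beta[j]))
```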
            
The standard estimator for the Poisson model is the maximum likelihood estimator (MLE). Because the observations are independent, the log-likelihood function is written as
![\[ \mathcal{L} = \sum _{i=1}^{N}w_ i(-\mu _{i} + y_{i} \ln \mu _{i} - \ln y_{i}!) = \sum _{i=1}^{N}w_ i(-e^{\mathbf{x}_{i}^{\prime } \bbeta } + y_{i}\mathbf{x}_{i}^{\prime } \bbeta - \ln y_{i}!) \]](images/etsug_countreg0091.png)
where \(w_{i}\) is defined as follows:

\(w_{i} = 1\) if neither the WEIGHT nor FREQ statement is used.

\(w_{i} = W_{i}\), where \(W_{i}\) are the nonnormalized values of the variable that is specified in the WEIGHT statement in which the NONORMALIZE option is specified.
                     

\(w_{i}\) is the normalized value of \(W_{i}\), where \(W_{i}\) are the nonnormalized values of the variable that is specified in the WEIGHT statement (without the NONORMALIZE option).
                     

\(w_{i} = F_{i}\), where \(F_{i}\) are the values of the variable that is specified in the FREQ statement.
                     

if both the WEIGHT statement, without the NONORMALIZE option, and the FREQ statement are specified.

if both the FREQ and WEIGHT statements are specified.
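
The weighted log-likelihood can be evaluated directly from its definition. The following is a minimal sketch, not the procedure's implementation: the function name `poisson_loglik`, the toy data, and the convention that each row of `X` starts with 1 for the intercept are assumptions. The per-observation factors \(w_{i}\) are taken as given inputs, so any normalization is assumed to have been applied beforehand.

```python
import math

def poisson_loglik(beta, X, y, w=None):
    """Sum of w_i * (-exp(x_i' beta) + y_i * x_i' beta - ln(y_i!))."""
    if w is None:
        w = [1.0] * len(y)  # no weighting: every w_i equals 1
    total = 0.0
    for xi, yi, wi in zip(X, y, w):
        eta = sum(a * b for a, b in zip(xi, beta))  # linear predictor x_i' beta
        total += wi * (-math.exp(eta) + yi * eta - math.lgamma(yi + 1))
    return total

# Hypothetical data: intercept column plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1, 0, 3]
beta = [0.0, 0.0]  # with beta = 0, every mu_i equals 1

ll_unweighted = poisson_loglik(beta, X, y)

# An integer factor of 2 on the first observation gives the same value
# as listing that observation twice.
ll_weighted = poisson_loglik(beta, X, y, w=[2.0, 1.0, 1.0])
ll_duplicated = poisson_loglik(beta, [X[0]] + X, [y[0]] + y)

print(ll_unweighted, ll_weighted, ll_duplicated)
```

With all coefficients set to zero, the unweighted value reduces to \(-3 - \ln 6\), which makes the result easy to verify by hand.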
The gradient and the Hessian are, respectively,
![\[ \frac{\partial \mathcal{L}}{\partial \bbeta } = \sum _{i=1}^{N}w_ i(y_{i}-\mu _{i})\mathbf{x}_{i} = \sum _{i=1}^{N}w_ i(y_{i}-e^{\mathbf{x}_{i}^{\prime }\bbeta })\mathbf{x}_{i} \]](images/etsug_countreg0097.png)
![\[ \frac{\partial ^2 \mathcal{L}}{\partial \bbeta \partial \bbeta ^{\prime }} = - \sum _{i=1}^{N}w_ i\mu _{i}\mathbf{x}_{i}{\mathbf{x}_{i}}^{\prime } = - \sum _{i=1}^{N}w_ ie^{\mathbf{x}_{i}^{\prime } \bbeta } \mathbf{x}_{i} \mathbf{x}_{i}^{\prime } \]](images/etsug_countreg0098.png)
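
Because the Hessian is negative definite whenever the regressors have full column rank, the log-likelihood is concave and a Newton-Raphson iteration built from the two expressions above is one natural way to compute the MLE. The sketch below is an illustration under stated assumptions and does not describe the optimizer that the procedure uses: the function names, the toy data, the starting value (intercept set to the log of the weighted mean of the response), and the small Gaussian-elimination solver are all hypothetical choices.

```python
import math

def gradient(beta, X, y, w):
    """Sum of w_i * (y_i - mu_i) * x_i."""
    g = [0.0] * len(beta)
    for xi, yi, wi in zip(X, y, w):
        mu = math.exp(sum(a * b for a, b in zip(xi, beta)))
        for j, xij in enumerate(xi):
            g[j] += wi * (yi - mu) * xij
    return g

def hessian(beta, X, w):
    """Negative sum of w_i * mu_i * x_i x_i'."""
    p = len(beta)
    H = [[0.0] * p for _ in range(p)]
    for xi, wi in zip(X, w):
        mu = math.exp(sum(a * b for a, b in zip(xi, beta)))
        for j in range(p):
            for k in range(p):
                H[j][k] -= wi * mu * xi[j] * xi[k]
    return H

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fit_newton(X, y, w, iters=50, tol=1e-10):
    """Newton-Raphson: beta_new = beta - H^{-1} g. Assumes column 0 is the intercept."""
    beta = [0.0] * len(X[0])
    # Assumed starting value: intercept = log of the weighted mean response.
    beta[0] = math.log(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    for _ in range(iters):
        g = gradient(beta, X, y, w)
        if max(abs(v) for v in g) < tol:
            break
        step = solve(hessian(beta, X, w), g)
        beta = [b - s for b, s in zip(beta, step)]
    return beta

# Hypothetical data: intercept column plus one regressor, unit weights.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1, 2, 4, 7]
w = [1.0, 1.0, 1.0, 1.0]

beta_hat = fit_newton(X, y, w)
fitted = [math.exp(sum(a * b for a, b in zip(xi, beta_hat))) for xi in X]
print(beta_hat, fitted)
```

At the solution the gradient vanishes, so with an intercept in the model the fitted means sum to the observed total count.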
The Poisson model has been criticized for its restrictive property that the conditional variance must equal the conditional mean. Real-life data are often characterized by overdispersion (that is, the variance exceeds the mean). Allowing for overdispersion can improve model predictions because the Poisson restriction of equal mean and variance results in the underprediction of zeros when overdispersion exists. The most commonly used model that accounts for overdispersion is the negative binomial model. Conway-Maxwell-Poisson regression enables you to model both overdispersion and underdispersion.
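
The underprediction of zeros can be seen in a small constructed example. The sketch below assumes a hypothetical population in which half of the observations are Poisson with mean 1 and half are Poisson with mean 5; this two-point mixture is an illustrative choice and is not taken from any data set. The mixture has mean 3 but variance 7, and a single Poisson distribution with the same mean assigns far less probability to zero.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) = exp(-mu) * mu**y / y!, evaluated on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

# Hypothetical overdispersed population: a 50/50 mixture of two Poisson means.
means = [1.0, 5.0]
probs = [0.5, 0.5]

mix_mean = sum(p * m for p, m in zip(probs, means))
# Law of total variance: E[Var(Y | group)] + Var(E[Y | group]).
# Within each group the variance equals the mean, so the first term is mix_mean.
mix_var = mix_mean + sum(p * (m - mix_mean) ** 2 for p, m in zip(probs, means))

# Probability of a zero count under the mixture versus a Poisson
# distribution that matches only the overall mean.
p_zero_mixture = sum(p * poisson_pmf(0, m) for p, m in zip(probs, means))
p_zero_poisson = poisson_pmf(0, mix_mean)

print(mix_mean, mix_var)               # mean 3, variance 7
print(p_zero_mixture, p_zero_poisson)  # roughly 0.187 versus 0.050
```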