PROC COUNTREG: Zero-Inflated Count Regression Overview

The main motivation for zero-inflated count models is that real-life data frequently display overdispersion and excess zeros. Zero-inflated count models provide a way of modeling the excess zeros in addition to allowing for overdispersion. In particular, for each observation, there are two possible data generation processes. The result of a Bernoulli trial is used to determine which of the two processes is used. For observation $i$, Process 1 is chosen with probability $\varphi_i$ and Process 2 with probability $1 - \varphi_i$. Process 1 generates only zero counts. Process 2 generates counts from either a Poisson or a negative binomial model. In general,

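$y_i \sim \begin{cases} 0 & \text{with probability } \varphi_i \\ g(y_i) & \text{with probability } 1 - \varphi_i \end{cases}$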
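The two-process mechanism described above can be sketched as a small simulation. This is a minimal illustration in Python, not part of PROC COUNTREG: the function names `draw_poisson` and `draw_zero_inflated`, the Process 1 probability `phi = 0.3`, and the Poisson mean `lam = 2.0` are all assumptions chosen for the example, and Process 2 is taken to be Poisson.

```python
import math
import random


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_zero_inflated(phi, lam, rng):
    """Draw one count from the two-process mixture (hypothetical helper)."""
    # Bernoulli trial: Process 1 (always zero) is chosen with probability phi.
    if rng.random() < phi:
        return 0
    # Process 2: an ordinary Poisson count, which can itself be zero.
    return draw_poisson(lam, rng)


rng = random.Random(12345)
sample = [draw_zero_inflated(0.3, 2.0, rng) for _ in range(10_000)]
zero_share = sum(1 for y in sample if y == 0) / len(sample)

# Zeros come from both processes: phi + (1 - phi) * exp(-lam).
expected_zero_share = 0.3 + 0.7 * math.exp(-2.0)
print(f"observed zero share {zero_share:.3f}, expected {expected_zero_share:.3f}")
```

The observed share of zeros should be close to the mixture value and noticeably larger than the share that a plain Poisson model with the same mean parameter would produce, which is the excess-zeros pattern that motivates these models.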

Therefore, the probability of $\{Y_i = y_i\}$ can be described as

	$P(Y_i = 0) = \varphi_i + (1 - \varphi_i)\, g(0)$
	$P(Y_i = y_i) = (1 - \varphi_i)\, g(y_i), \quad y_i > 0$

where $g(\cdot)$ is the probability mass function of either the Poisson or the negative binomial distribution. To write the estimated probability $\varphi_i$ to an output data set, use the PROBZERO= option in the OUTPUT statement.
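The zero-inflated probabilities above can be evaluated directly. The following Python sketch assumes that Process 2 is Poisson with mean `lam`; the names `poisson_pmf` and `zip_pmf` and the parameter values are illustrative assumptions, not COUNTREG features.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of the count y for mean lam > 0."""
    # Work on the log scale so that large counts do not overflow.
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def zip_pmf(y, phi, lam):
    """Zero-inflated Poisson probability of the count y (hypothetical helper)."""
    base = (1.0 - phi) * poisson_pmf(y, lam)
    # A zero can come from Process 1 (probability phi) or from Process 2.
    return phi + base if y == 0 else base


phi, lam = 0.3, 2.0
for y in range(4):
    print(y, round(zip_pmf(y, phi, lam), 4), round(poisson_pmf(y, lam), 4))
```

Setting `phi` to 0 recovers the ordinary Poisson probabilities, so the zero-inflated model nests the standard count model as a special case.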

When the probability $\varphi_i$ depends on the characteristics of observation $i$, $\varphi_i$ is written as a function of $\mathbf{z}_i'\boldsymbol{\gamma}$, where $\mathbf{z}_i'$ is the $1 \times (q+1)$ vector of zero-inflation covariates and $\boldsymbol{\gamma}$ is the $(q+1) \times 1$ vector of zero-inflation coefficients to be estimated. (The zero-inflation intercept is $\gamma_0$; the coefficients for the $q$ zero-inflation covariates are $\gamma_1, \ldots, \gamma_q$.) The function $F$ that relates the product $\mathbf{z}_i'\boldsymbol{\gamma}$ (which is a scalar) to the probability $\varphi_i$ is called the zero-inflation link function,

$\varphi_i = F_i = F(\mathbf{z}_i'\boldsymbol{\gamma})$

In the COUNTREG procedure, the zero-inflation covariates are indicated in the ZEROMODEL statement. Furthermore, the zero-inflation link function $F$ can be specified as either the logistic function,

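$F(\mathbf{z}_i'\boldsymbol{\gamma}) = \Lambda(\mathbf{z}_i'\boldsymbol{\gamma}) = \frac{\exp(\mathbf{z}_i'\boldsymbol{\gamma})}{1 + \exp(\mathbf{z}_i'\boldsymbol{\gamma})}$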

or the standard normal cumulative distribution function (which gives a probit model for the zero-inflation probability),

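$F(\mathbf{z}_i'\boldsymbol{\gamma}) = \Phi(\mathbf{z}_i'\boldsymbol{\gamma}) = \int_{-\infty}^{\mathbf{z}_i'\boldsymbol{\gamma}} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right) du$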

The zero-inflation link function is specified with the LINK= option in the ZEROMODEL statement (LINK=LOGISTIC or LINK=NORMAL). The default zero-inflation link function is the logistic function.
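To make the two link choices concrete, the following Python sketch maps a scalar linear predictor to a zero-inflation probability under each link. The function names and the covariate and coefficient values are hypothetical and chosen only for illustration; the first element of the covariate vector is assumed to be 1 so that the first coefficient acts as the intercept.

```python
import math


def logistic_link(eta):
    """Logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def normal_link(eta):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def zero_inflation_probability(z, gamma, link=logistic_link):
    """Apply a link to the scalar product of covariates and coefficients."""
    eta = sum(z_j * g_j for z_j, g_j in zip(z, gamma))
    return link(eta)


z = [1.0, 0.8, -1.5]      # assumed covariates; leading 1.0 is the intercept term
gamma = [-0.5, 1.2, 0.4]  # assumed coefficients, not estimates from any data set
print(zero_inflation_probability(z, gamma))                    # logistic (default)
print(zero_inflation_probability(z, gamma, link=normal_link))  # normal
```

Both links return values strictly between 0 and 1 for moderate values of the linear predictor and agree at 0, where each gives a probability of 0.5. They differ mainly in the tails, where the normal link approaches 0 and 1 faster than the logistic link.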
