The COUNTREG Procedure
Zero-Inflated Count Regression Overview
The main motivation for zero-inflated count models is that real-life data frequently display overdispersion and excess zeros. Zero-inflated count models provide a way of modeling the excess zeros in addition to allowing for overdispersion. In particular, for each observation, there are two possible data generation processes. The result of a Bernoulli trial determines which of the two processes is used. For observation $i$, Process 1 is chosen with probability $\varphi_i$ and Process 2 with probability $1 - \varphi_i$. Process 1 generates only zero counts. Process 2 generates counts from either a Poisson or a negative binomial model. In general,
$$
y_i \sim \begin{cases} 0 & \text{with probability } \varphi_i \\ g(y_i) & \text{with probability } 1 - \varphi_i \end{cases}
$$
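Therefore, the probability of $\{Y_i = y_i\}$ can be described as
$$
\begin{aligned}
P(Y_i = 0) &= \varphi_i + (1 - \varphi_i)\, g(0) \\
P(Y_i = y_i) &= (1 - \varphi_i)\, g(y_i), \quad y_i > 0
\end{aligned}
$$
where $g(y_i)$ follows either the Poisson or the negative binomial distribution. You can write the estimated probability $\varphi_i$ to the output data set by using the PROBZERO= option in the OUTPUT statement.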
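The two-process mixture can be checked numerically. The following is a minimal Python sketch, not COUNTREG code: it assumes Process 2 is Poisson, and the function names (`poisson_pmf`, `zip_pmf`, `draw_zero_inflated`) and the values 0.3 and 2.0 for the zero-inflation probability and the Poisson mean are hypothetical, chosen only for illustration.

```python
import math
import random

def poisson_pmf(y, lam):
    """Poisson probability g(y) with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zip_pmf(y, phi, lam):
    """Zero-inflated Poisson probability of the count y.

    phi is the probability of Process 1 (always zero);
    Process 2 is assumed to be Poisson with mean lam.
    """
    base = (1.0 - phi) * poisson_pmf(y, lam)
    return phi + base if y == 0 else base

def draw_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def draw_zero_inflated(phi, lam, rng):
    """Bernoulli trial picks the process, then the count is drawn."""
    if rng.random() < phi:
        return 0                   # Process 1: structural zero
    return draw_poisson(lam, rng)  # Process 2: Poisson count

phi, lam = 0.3, 2.0                # hypothetical illustration values
total = sum(zip_pmf(y, phi, lam) for y in range(60))

rng = random.Random(0)
sample = [draw_zero_inflated(phi, lam, rng) for _ in range(20000)]
zero_share = sample.count(0) / len(sample)
print(total, zip_pmf(0, phi, lam), zero_share)
```

The probabilities sum to 1 up to a negligible truncated tail, and the simulated share of zeros should lie close to `zip_pmf(0, phi, lam)`, which exceeds the plain Poisson probability of zero because Process 1 contributes additional zeros.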
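When the probability $\varphi_i$ depends on the characteristics of observation $i$, $\varphi_i$ is written as a function of $\mathbf{z}_i'\boldsymbol{\gamma}$, where $\mathbf{z}_i'$ is the $1 \times (q+1)$ vector of zero-inflation covariates (including a leading 1 for the intercept) and $\boldsymbol{\gamma}$ is the $(q+1) \times 1$ vector of zero-inflation coefficients to be estimated. (The zero-inflation intercept is $\gamma_0$; the coefficients for the $q$ zero-inflation covariates are $\gamma_1, \ldots, \gamma_q$.) The function $F$ that relates the product $\mathbf{z}_i'\boldsymbol{\gamma}$ (which is a scalar) to the probability $\varphi_i$ is called the zero-inflation link function,
$$
\varphi_i = F_i = F(\mathbf{z}_i'\boldsymbol{\gamma})
$$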
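In the COUNTREG procedure, the zero-inflation covariates are specified in the ZEROMODEL statement. Furthermore, the zero-inflation link function $F$ can be specified as either the logistic function,
$$
F(\mathbf{z}_i'\boldsymbol{\gamma}) = \Lambda(\mathbf{z}_i'\boldsymbol{\gamma}) = \frac{\exp(\mathbf{z}_i'\boldsymbol{\gamma})}{1 + \exp(\mathbf{z}_i'\boldsymbol{\gamma})}
$$
or the standard normal cumulative distribution function (also called the probit function),
$$
F(\mathbf{z}_i'\boldsymbol{\gamma}) = \Phi(\mathbf{z}_i'\boldsymbol{\gamma}) = \int_{-\infty}^{\mathbf{z}_i'\boldsymbol{\gamma}} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right) du
$$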
The zero-inflation link function is specified with the LINK= option in the ZEROMODEL statement. The default zero-inflation link function is the logistic function.
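To see how the two link functions map the same scalar product of covariates and coefficients to a zero-inflation probability, consider the following minimal Python sketch. It is an illustration only, not the procedure's implementation: the helper names (`linear_predictor`, `logistic_link`, `probit_link`) and the coefficient and covariate values are hypothetical.

```python
import math

def linear_predictor(z, gamma):
    """Scalar product of covariates and coefficients.

    gamma[0] is the intercept; gamma[1:] pairs with the q covariates in z
    (z here excludes the constant 1 that multiplies the intercept).
    """
    return gamma[0] + sum(g * zj for g, zj in zip(gamma[1:], z))

def logistic_link(eta):
    """Logistic function; algebraically equal to exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def probit_link(eta):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

gamma = [-1.0, 0.8]   # hypothetical intercept and one slope
z = [2.0]             # hypothetical value of the single covariate
eta = linear_predictor(z, gamma)

phi_logistic = logistic_link(eta)
phi_probit = probit_link(eta)
print(eta, phi_logistic, phi_probit)
```

Both links return a value strictly between 0 and 1 and agree at a product of zero, where each gives 0.5. Away from zero they differ, so estimated coefficients from a logistic and a probit zero model are not directly comparable.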
Copyright © SAS Institute, Inc. All Rights Reserved.