The example titled "Modeling Zero-Inflation: Is it Better to Fish Poorly or Not to Have Fished At All?" in the FMM procedure documentation discusses zero-inflated and hurdle models for count data that contain an excess of zeros. As noted there, the hurdle model supposes two processes at work: one that generates zeros with some probability, and another that generates events. For instance, the Poisson hurdle model is a mixture of a degenerate distribution at zero and a truncated Poisson distribution. The zero-inflated Poisson (ZIP) model also uses a degenerate zero distribution, but its second process is a regular Poisson distribution, which can generate both zeros and events. For the ZIP model, the first process therefore generates only the extra zeros beyond those of the regular Poisson distribution. For the hurdle model, the first process generates all of the zeros. Truncated Poisson and negative binomial distributions are discussed and illustrated in more detail in this note.
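The difference between the two models can be written out explicitly. The following is a sketch in notation introduced here for illustration (it is not taken from the FMM documentation): $\pi_0$ is the probability assigned to the degenerate zero component and $\lambda$ is the mean of the untruncated Poisson distribution.

```latex
% ZIP: zeros come from both the degenerate component and the Poisson component
\[
\begin{aligned}
P_{\mathrm{ZIP}}(Y=0) &= \pi_0 + (1-\pi_0)\,e^{-\lambda}, \\
P_{\mathrm{ZIP}}(Y=y) &= (1-\pi_0)\,\frac{e^{-\lambda}\lambda^{y}}{y!}, \qquad y = 1, 2, \ldots
\end{aligned}
\]

% Hurdle: all zeros come from the degenerate component;
% positive counts follow a zero-truncated Poisson distribution
\[
\begin{aligned}
P_{\mathrm{H}}(Y=0) &= \pi_0, \\
P_{\mathrm{H}}(Y=y) &= (1-\pi_0)\,\frac{e^{-\lambda}\lambda^{y}}{y!\,\bigl(1-e^{-\lambda}\bigr)}, \qquad y = 1, 2, \ldots
\end{aligned}
\]
```

In the ZIP form the probability of a zero can never fall below the Poisson value $e^{-\lambda}$, whereas in the hurdle form $\pi_0$ is unrestricted and can be smaller or larger than $e^{-\lambda}$.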
The hurdle model can also be used in cases of underdispersion in which there is less variability in the data than expected under the Poisson distribution.
The example in the FMM documentation illustrates the ZIP model. The Poisson hurdle model is fit just as easily by specifying the DIST=TRUNCPOISSON option instead of the DIST=POISSON option in the first MODEL statement.
   proc fmm data=catch;
      class gender;
      model count = gender*age / dist=TruncPoisson;
      model + / dist=Constant;
   run;
Notice that the results are similar to the ZIP model shown in the FMM documentation.
The above Poisson hurdle model can also be fit using PROC NLMIXED. The following statements fit the model and may help to clarify how it is specified. A logistic model containing only an intercept is used for the zeros process, as can be seen in the statements defining its linear predictor (LINPZERO) and the resulting probability (PI). Note that, as the log likelihood is written, PI is the probability of a nonzero count and 1-PI is the probability of a zero. The events process is defined by its linear predictor (LINPNOZERO) and mean (MUNOZERO) statements; MUNOZERO is the mean of the untruncated Poisson distribution. The LOGPNOZERO assignment and the IF-THEN/ELSE statements that follow define the log likelihood for the Poisson hurdle model.
   proc nlmixed data=catch;
      parameters a0=0 a1=0 a2=0 b0=0;
      /* zeros process: intercept-only logistic model */
      linpzero = b0;
      pi = 1/(1+exp(-linpzero));
      /* events process: log-linear model for the Poisson mean */
      linpnozero = a0 + a1*(gender='F')*age + a2*(gender='M')*age;
      munozero = exp(linpnozero);
      /* log of pi times the zero-truncated Poisson probability */
      logpnozero = log(pi) - log(1-exp(-munozero)) - munozero
                   - lgamma(count+1) + count*log(munozero);
      if count=0 then ll = log(1-pi);
      else ll = logpnozero;
      model count ~ general(ll);
   run;
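For reference, the log likelihood contribution that the NLMIXED statements compute for observation $i$ can be written as follows. This is a transcription of the code into mathematical notation, not an additional model. The symbols $y_i$ (COUNT), $\pi$ (PI), and $\mu_i$ (MUNOZERO) are shorthand introduced here, and the indicator terms stand for the comparisons `gender='F'` and `gender='M'`. Because $\pi$ multiplies the truncated Poisson term, it is the probability of a positive count.

```latex
\[
\pi = \frac{1}{1+e^{-b_0}}, \qquad
\mu_i = \exp\bigl(a_0 + a_1\,\mathbb{1}[\mathrm{gender}_i = \mathrm{F}]\,\mathrm{age}_i
                      + a_2\,\mathbb{1}[\mathrm{gender}_i = \mathrm{M}]\,\mathrm{age}_i\bigr)
\]

\[
\ell_i =
\begin{cases}
\log(1-\pi), & y_i = 0, \\[4pt]
\log\pi \;-\; \log\bigl(1-e^{-\mu_i}\bigr) \;-\; \mu_i \;-\; \log(y_i!) \;+\; y_i\log\mu_i, & y_i > 0.
\end{cases}
\]
```

The term $\log(y_i!)$ corresponds to `lgamma(count+1)` in the code, and $-\log(1-e^{-\mu_i})$ is the adjustment that renormalizes the Poisson distribution after the zero outcome is excluded.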
Beginning in SAS® 9.3 TS1M2, hurdle models using the negative binomial distribution can also be fit using the DIST=TRUNCNEGBIN option to specify the truncated version of the distribution.
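As a sketch of what such a fit might look like, the statements below reuse the Poisson hurdle specification shown earlier and change only the distribution option. They assume the same CATCH data set and predictors as the earlier example. They have not been run for this note, so treat them as an illustration rather than a verified program.

```sas
/* Negative binomial hurdle model (illustrative sketch).       */
/* Assumes the CATCH data set from the FMM documentation.      */
proc fmm data=catch;
   class gender;
   /* events process: zero-truncated negative binomial */
   model count = gender*age / dist=TruncNegBin;
   /* zeros process: degenerate distribution at zero */
   model + / dist=Constant;
run;
```

The negative binomial distribution adds a dispersion parameter to the events process, which may be useful when the positive counts are more variable than the truncated Poisson distribution allows.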
Product Family | Product | System | SAS Release Reported | SAS Release Fixed* |
SAS System | SAS/STAT | z/OS | ||
OpenVMS VAX | ||||
Microsoft® Windows® for 64-Bit Itanium-based Systems | ||||
Microsoft Windows Server 2003 Datacenter 64-bit Edition | ||||
Microsoft Windows Server 2003 Enterprise 64-bit Edition | ||||
Microsoft Windows XP 64-bit Edition | ||||
Microsoft® Windows® for x64 | ||||
OS/2 | ||||
Microsoft Windows 8 Pro | ||||
Microsoft Windows 95/98 | ||||
Microsoft Windows 2000 Advanced Server | ||||
Microsoft Windows 2000 Datacenter Server | ||||
Microsoft Windows 2000 Server | ||||
Microsoft Windows 2000 Professional | ||||
Microsoft Windows NT Workstation | ||||
Microsoft Windows Server 2003 Datacenter Edition | ||||
Microsoft Windows Server 2003 Enterprise Edition | ||||
Microsoft Windows Server 2003 Standard Edition | ||||
Microsoft Windows Server 2003 for x64 | ||||
Microsoft Windows Server 2008 | ||||
Microsoft Windows Server 2008 for x64 | ||||
Microsoft Windows Server 2012 | ||||
Microsoft Windows XP Professional | ||||
Windows 7 Enterprise 32 bit | ||||
Windows 7 Enterprise x64 | ||||
Windows 7 Home Premium 32 bit | ||||
Windows 7 Home Premium x64 | ||||
Windows 7 Professional 32 bit | ||||
Windows 7 Professional x64 | ||||
Windows 7 Ultimate 32 bit | ||||
Windows 7 Ultimate x64 | ||||
Windows Millennium Edition (Me) | ||||
Windows Vista | ||||
Windows Vista for x64 | ||||
64-bit Enabled AIX | ||||
64-bit Enabled HP-UX | ||||
64-bit Enabled Solaris | ||||
ABI+ for Intel Architecture | ||||
AIX | ||||
HP-UX | ||||
HP-UX IPF | ||||
IRIX | ||||
Linux | ||||
Linux for x64 | ||||
Linux on Itanium | ||||
OpenVMS Alpha | ||||
OpenVMS on HP Integrity | ||||
Solaris | ||||
Solaris for x64 | ||||
Tru64 UNIX |
Type: | Usage Note |
Topic: | Analytics ==> Categorical Data Analysis; Analytics ==> Regression; SAS Reference ==> Procedures ==> FMM; SAS Reference ==> Procedures ==> NLMIXED |
Date Modified: | 2019-05-03 16:02:38 |
Date Created: | 2012-11-19 17:11:59 |