Contents:  Purpose / History / Requirements / Usage / Details / Limitations Missing Values / References 
Version  Update Notes 
1.1  Added STDP= and STDI= parameters to save the standard errors of the mean and of an individual value to the OUT= data set. 
1.0  Initial coding 
%GLMPI(version, data=out, response=y, pred=p, leverage=h, stdreschi=srp)
The GLMPI macro attempts to check for a later version of itself. If it is unable to do this (such as if there is no active internet connection available), the macro will issue the following message:
GLMPI: Unable to check for newer version
The computations performed by the macro are not affected by the appearance of this message.
Follow the instructions in the Downloads tab of this sample to save the GLMPI macro definition. Replace the text within quotation marks in the following statement with the location of the GLMPI macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the GLMPI macro and make it available for use:
%inc "<location of your file containing the GLMPI macro>";
Following this statement, you can call the GLMPI macro. See the Results tab for examples.
The following parameters are required when using the GLMPI macro:
The following parameters are optional:
Myers and Montgomery (1997) ("M&M") use the delta method to show that the variance of the mean in a generalized linear model is V(y_{i})^{2}[x_{i}'(X'VX)^{-1}x_{i}/r(φ)^{2}], where V(y_{i}) is the variance of y_{i}, x_{i} is the vector of predictor values associated with y_{i}, X is the data matrix, V is a diagonal matrix with elements v_{i}=V(y_{i}), and r(φ) is the function of the scale parameter when the response distribution is written in exponential family form as shown in Appendix A of M&M. The variance of the mean can be written in terms of the leverage as v_{i}h_{i}, where h_{i} is the leverage of y_{i}. The leverage values are directly available from PROC GENMOD using the LEVERAGE= option in the OUTPUT statement. v_{i} can be obtained from the standardized Pearson residuals, r_{Pi}, as v_{i} = ((y_{i}-μ_{i})/r_{Pi})^{2}/(1-h_{i}). An asymptotic 100(1-α)% confidence interval for the mean is then ŷ ± z_{1-α/2}(v_{i}h_{i})^{½}. Alternatively, a t quantile can be used by specifying the degrees of freedom in the TDF= parameter.
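The expression for v_{i} can be read off the definition of the standardized Pearson residual. The short derivation below is a sketch that assumes the usual definition of that residual with v_{i} = V(y_{i}) and prior weights equal to 1; it is not taken from the macro source.

```latex
r_{Pi} = \frac{y_i - \mu_i}{\sqrt{v_i\,(1 - h_i)}}
\quad\Longrightarrow\quad
v_i = \frac{\bigl((y_i - \mu_i)/r_{Pi}\bigr)^{2}}{1 - h_i},
\qquad
\widehat{\mathrm{Var}}(\hat{y}_i) = v_i\,h_i .
```

Note that this recovery of v_{i} is undefined when r_{Pi} = 0 (an observation that equals its predicted mean) or when h_{i} = 1.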
To develop a prediction interval for a future response, M&M further derive the variance of the raw residual, which can be written in terms of the leverage as v_{i}(1+h_{i}). An asymptotic 100(1-α)% prediction interval is then ŷ ± z_{1-α/2}(v_{i}(1+h_{i}))^{½}. Again, a t quantile can be used by specifying the degrees of freedom in the TDF= parameter.
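The two interval formulas can be sketched in a single DATA step. This is an illustration only, not the macro's actual code: the data set names TOY_OUT and GLMPI_SKETCH, the made-up input values, and the guard against a zero residual are assumptions. The input variables y, p, h, and srp stand for the response, predicted mean, leverage, and standardized Pearson residual that the OUTPUT statement in PROC GENMOD would save.

```sas
/* Made-up values for illustration only: response, predicted mean, */
/* leverage, and standardized Pearson residual.                    */
data toy_out;
  input y p h srp;
  datalines;
10 9 0.2 0.5
4 6 0.1 -1.0
;
run;

/* Hypothetical sketch of the interval computations described above. */
data glmpi_sketch;
  set toy_out;
  alpha = 0.05;
  z = quantile('NORMAL', 1 - alpha/2);     /* z_{1-alpha/2} */
  /* v_i cannot be recovered when srp is 0 or h is 1, so skip those rows */
  if nmiss(y, p, h, srp) = 0 and srp ne 0 and h < 1 then do;
    v    = ((y - p)/srp)**2 / (1 - h);     /* v_i = ((y-mu)/r_P)^2/(1-h) */
    stdp = sqrt(v*h);                      /* std. error of the mean     */
    stdi = sqrt(v*(1 + h));                /* std. error, individual     */
    lclm = p - z*stdp;  uclm = p + z*stdp; /* confidence limits          */
    lcli = p - z*stdi;  ucli = p + z*stdi; /* prediction limits          */
  end;
run;
```

For the first toy row, v works out to 5, so the standard error of the mean is 1 and the standard error of an individual value is the square root of 6. How the macro itself treats observations with a zero residual is not stated here; the guard above simply leaves their limits missing.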
Note that the computed limits are not truncated if they fall outside of the valid range of values for the response distribution. For example, a computed confidence or prediction limit may be negative for a Poisson or gamma distributed response.
Prediction intervals are not very useful when each observation in the data is a single Bernoulli (binary) response, coded 0 or 1. In such a case, a prediction interval for population i with event probability p_{i} will capture 100% of future observations if the limits contain the entire [0,1] range, or 0% of the observations if the limits are both within the [0,1] range, or 100p_{i}% or 100(1-p_{i})% of the observations if one limit of the interval falls in the [0,1] range. Prediction intervals might be useful for binomial data in which each observation represents a set of independent Bernoulli trials. Such data are modeled using events/trials syntax in PROC GENMOD. In this case, future observations can have observed event proportions across the [0,1] range and a 100(1-α)% prediction interval may obtain its nominal coverage when the limits fall in the [0,1] range.
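The Bernoulli cases can be written out explicitly. In the sketch below, L_{i} and U_{i} denote the lower and upper prediction limits for population i; these symbols are introduced here for illustration and are not part of the macro's notation.

```latex
\Pr\bigl(L_i \le y_{\text{new}} \le U_i\bigr) =
\begin{cases}
1       & \text{if } L_i \le 0 \text{ and } U_i \ge 1, \\
1 - p_i & \text{if } L_i \le 0 < U_i < 1 \quad (\text{only } y_{\text{new}} = 0 \text{ is captured}), \\
p_i     & \text{if } 0 < L_i < 1 \le U_i \quad (\text{only } y_{\text{new}} = 1 \text{ is captured}), \\
0       & \text{if } 0 < L_i \le U_i < 1 .
\end{cases}
```

In none of these cases is the coverage tied to the nominal 100(1-α)% level, which is why the intervals are of little use for single binary responses.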
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
proc genmod data=autompg;
  where cylinders in (4,6,8);
  class cylinders;
  model mpg = cylinders|weight;
  output out=out p=p leverage=h stdreschi=srp;
run;
The following %INC statement defines the GLMPI macro (necessary only once per SAS session) and then the macro is called to compute 95% confidence and prediction intervals for each observation (vehicle) in the data set. The contents of the input data set (OUT) and the interval limits are saved in the output data set (named GLMPIOUT by default).
%inc "<location of your file containing the GLMPI macro>";
%glmpi(data=out, response=mpg, pred=p, leverage=h, stdreschi=srp)
These statements plot the data points and show the confidence intervals for the mean (shaded regions) and the prediction intervals for future values (dashed lines).
proc sort data=glmpiout;
  by weight;
run;
proc sgpanel data=glmpiout noautolegend;
  panelby cylinders / columns=3;
  band upper=ucli lower=lcli x=weight / nofill lineattrs=(pattern=2 color=black);
  band upper=uclm lower=lclm x=weight;
  reg y=p x=weight / nomarkers;
  scatter y=mpg x=weight;
  colaxis grid;
  rowaxis grid;
  title "GENMOD fit to MPG data";
  title2 "With 95% Confidence and Prediction Limits";
run;
The following statements show that 95.1% of the observations are contained in the prediction intervals.
data glmpiout;
  set glmpiout;
  inPI=0;
  if lcli <= mpg <= ucli then inPI=1;
run;
proc means n mean;
  var inPI;
  title "Prediction interval coverage";
run;
For this normal, identity-linked model, confidence and prediction intervals can be computed by the GLM procedure. The intervals produced by the GLMPI macro are essentially the same as those from PROC GLM. However, since GLM uses least squares estimation rather than maximum likelihood estimation as in PROC GENMOD, their variance estimates differ slightly. Also, the intervals in GLM use a quantile from the t distribution rather than from the standard normal distribution as in GENMOD. In the following steps, the GLM variance estimate and t quantile are used, resulting in the same intervals from the GLMPI macro and PROC GLM.
These statements fit the model in PROC GLM and save the 95% confidence and prediction limits. The ODS OUTPUT statement saves the table containing the square root of the GLM variance estimate (root MSE).
proc glm data=autompg plots=none;
  where cylinders in (4,6,8);
  class cylinders;
  model mpg = cylinders|weight;
  output out=glmout p=p lcl=lcli ucl=ucli lclm=lclm uclm=uclm;
  ods output fitstatistics=fs;
run;
quit;
These statements save the GLM root MSE estimate in macro variable s. The model is then refit in GENMOD using the root MSE from GLM. The confidence intervals for the mean are also saved for comparison.
data _null_; set fs; call symput("s",rootmse); run;
proc genmod data=autompg;
  where cylinders in (4,6,8);
  class cylinders;
  model mpg = cylinders|weight / scale=&s noscale;
  output out=out p=p leverage=h stdreschi=srp l=lclmgen u=uclmgen;
run;
The GLMPI macro is called and the TDF= parameter is used to specify the error degrees of freedom (385) and request intervals using a t quantile instead of a standard normal quantile.
%glmpi(data=out, response=mpg, pred=p, leverage=h, stdreschi=srp, tdf=385)
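The effect of TDF= can be illustrated by comparing the two quantiles directly. The step below is a minimal sketch, assuming that the macro multiplies the standard errors by a two-sided quantile at level 1-α/2; the data set name QUANTILES and its variable names are hypothetical.

```sas
/* Hypothetical illustration: normal versus t quantile for 95% limits. */
data quantiles;
  alpha = 0.05;
  tdf   = 385;                              /* error df from the GLM fit */
  z = quantile('NORMAL', 1 - alpha/2);      /* used when TDF= is omitted */
  t = quantile('T', 1 - alpha/2, tdf);      /* used when TDF=385         */
  ratio = t / z;                            /* widening factor of limits */
  put z= 8.5 t= 8.5 ratio= 8.5;
run;
```

With this many degrees of freedom the t quantile is only slightly larger than the normal quantile, so the resulting intervals are only slightly wider.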
These statements arrange and display the confidence limits for the mean (lclm, uclm) and prediction limits (lcli, ucli) from GENMOD, GLM, and the GLMPI macro for the first three observations.
data intervals;
  set glmpiout;
  source="GLMPI ";
  output;
  lclm=lclmgen; uclm=uclmgen; lcli=.; ucli=.;
  source="GENMOD";
  output;
  set glmout;
  source="GLM ";
  output;
run;
proc sort data=intervals;
  by name source;
run;
proc print data=intervals(obs=9);
  format _numeric_ 10.7;
  by name;
  id source;
  var lclm uclm lcli ucli;
  title "Comparison of intervals";
run;
Notice that the interval limits from GLM and the GLMPI macro agree exactly since the same variance estimate and quantile were used. Also notice that the confidence limits from GENMOD differ slightly since GENMOD uses a standard normal quantile rather than a t quantile. The GLMPI limits would match those from GENMOD if the TDF= parameter were omitted from the macro call above. For an identity-linked model like this, the confidence intervals produced by GENMOD are symmetric around the predicted mean, but they will be asymmetric for models using other link functions.
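The asymmetry arises because GENMOD forms its confidence limits on the linear predictor scale and then applies the inverse link, whereas the delta-method limits described in the Details section are formed directly on the mean scale. A sketch of the two constructions, where g is the link function, η̂_{i} = x_{i}'β̂ is the linear predictor, and σ̂_{η_i} is its standard error (notation introduced here for illustration):

```latex
\text{GENMOD:}\quad
\Bigl[\, g^{-1}\bigl(\hat{\eta}_i - z_{1-\alpha/2}\,\hat{\sigma}_{\eta_i}\bigr),\;
        g^{-1}\bigl(\hat{\eta}_i + z_{1-\alpha/2}\,\hat{\sigma}_{\eta_i}\bigr) \Bigr]
\qquad
\text{GLMPI:}\quad
\hat{y}_i \pm z_{1-\alpha/2}\,\sqrt{v_i h_i}
```

For the identity link, g^{-1} leaves the limits unchanged and the two constructions coincide, which is consistent with the agreement noted above. For a nonlinear link such as the log, the GENMOD limits are no longer centered on the predicted mean, while the delta-method limits remain symmetric by construction.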
proc genmod data=long97data;
  where kid5 ne 3;
  model art = fem mar kid5 ment / dist=poisson scale=p;
  output out=out p=p leverage=h stdreschi=srp;
run;
%glmpi(data=out, response=art, pred=p, leverage=h, stdreschi=srp, alpha=.1)
These statements plot the confidence and prediction limits in the various populations defined by the levels of fem, mar, and kid5.
proc sort data=glmpiout;
  by ment;
run;
proc sgpanel data=glmpiout noautolegend;
  panelby fem mar kid5 / columns=4;
  band upper=ucli lower=lcli x=ment;
  band upper=uclm lower=lclm x=ment / nofill lineattrs=(color=black);
  scatter y=art x=ment;
  title "Poisson fit to article data";
  /* ALPHA=.1 in the macro call requests 90% limits */
  title2 "With 90% Confidence and Prediction Limits";
  rowaxis label="Number articles";
  colaxis label="Number mentor articles";
run;
These statements show that 93.2% of the response values were captured by the prediction intervals.
data glmpiout;
  set glmpiout;
  inPI=0;
  if lcli <= art <= ucli then inPI=1;
run;
proc means n mean;
  var inPI;
  title "Prediction interval coverage";
run;
proc genmod data=lifdat;
  class mfg;
  model lifetime = mfg / dist=gamma link=log;
  output out=out p=p leverage=h stdreschi=srp;
run;
%glmpi(data=out, response=lifetime, pred=p, leverage=h, stdreschi=srp)
These statements plot the data and show the confidence and prediction intervals for each manufacturer.
proc sgplot data=glmpiout;
  highlow y=mfg low=lcli high=ucli / type=bar name="PI" legendlabel="95% PI" fillattrs=(color=lightgrey);
  highlow y=mfg low=lclm high=uclm / type=bar name="CI" legendlabel="95% CI" fillattrs=(color=darkgrey);
  scatter y=mfg x=lifetime;
  title "Gamma fit to lifetime data";
  title2 "With 95% Confidence and Prediction Limits";
  xaxis label="Lifetime";
  keylegend "CI" "PI";
run;
The following statements show that 94.5% of the observations were contained in the prediction intervals.
data glmpiout;
  set glmpiout;
  inPI=0;
  if lcli <= lifetime <= ucli then inPI=1;
run;
proc means n mean;
  var inPI;
  title "Prediction interval coverage";
run;
Right-click on the link below and select Save to save the GLMPI macro definition to a file. It is recommended that you name the file glmpi.sas.
Type:  Sample 
Topic:  Analytics ==> Regression SAS Reference ==> Procedures ==> GENMOD 
Date Modified:  2016-03-16 15:23:32 
Date Created:  2015-04-01 15:06:56 
Product Family:  SAS System 
Product:  SAS/STAT 
Hosts (no starting or ending SAS release is listed): 
z/OS 
Z64 
OpenVMS VAX 
Microsoft® Windows® for 64-Bit Itanium-based Systems 
Microsoft Windows Server 2003 Datacenter 64-bit Edition 
Microsoft Windows Server 2003 Enterprise 64-bit Edition 
Microsoft Windows XP 64-bit Edition 
Microsoft® Windows® for x64 
OS/2 
Microsoft Windows 8 Enterprise 32-bit 
Microsoft Windows 8 Enterprise x64 
Microsoft Windows 8 Pro 32-bit 
Microsoft Windows 8 Pro x64 
Microsoft Windows 8.1 Enterprise 32-bit 
Microsoft Windows 8.1 Enterprise x64 
Microsoft Windows 8.1 Pro 
Microsoft Windows 8.1 Pro 32-bit 
Microsoft Windows 95/98 
Microsoft Windows 2000 Advanced Server 
Microsoft Windows 2000 Datacenter Server 
Microsoft Windows 2000 Server 
Microsoft Windows 2000 Professional 
Microsoft Windows NT Workstation 
Microsoft Windows Server 2003 Datacenter Edition 
Microsoft Windows Server 2003 Enterprise Edition 
Microsoft Windows Server 2003 Standard Edition 
Microsoft Windows Server 2003 for x64 
Microsoft Windows Server 2008 
Microsoft Windows Server 2008 R2 
Microsoft Windows Server 2008 for x64 
Microsoft Windows Server 2012 Datacenter 
Microsoft Windows Server 2012 R2 Datacenter 
Microsoft Windows Server 2012 R2 Std 
Microsoft Windows Server 2012 Std 
Microsoft Windows XP Professional 
Windows 7 Enterprise 32-bit 
Windows 7 Enterprise x64 
Windows 7 Home Premium 32-bit 
Windows 7 Home Premium x64 
Windows 7 Professional 32-bit 
Windows 7 Professional x64 
Windows 7 Ultimate 32-bit 
Windows 7 Ultimate x64 
Windows Millennium Edition (Me) 
Windows Vista 
Windows Vista for x64 
64-bit Enabled AIX 
64-bit Enabled HP-UX 
64-bit Enabled Solaris 
ABI+ for Intel Architecture 
AIX 
HP-UX 
HP-UX IPF 
IRIX 
Linux 
Linux for x64 
Linux on Itanium 
OpenVMS Alpha 
OpenVMS on HP Integrity 
Solaris 
Solaris for x64 
Tru64 UNIX 