In this example, the numbers of maintenance repairs on a complex system are modeled as realizations of Poisson random variables. The system under investigation has a large number of components, which occasionally break down and are replaced or repaired. During a four-year period, the system was observed to be in a state of steady operation, meaning that the rate of operation remained approximately constant. A monthly maintenance record is available for that period, which tracks the number of components removed for maintenance each month. The data are listed in the following statements, which create a SAS data set:
title 'Analysis of Component Reliability';
data equip;
   input year month removals @@;
   datalines;
1987  1  2  1987  2  4  1987  3  3  1987  4  3
1987  5  3  1987  6  8  1987  7  2  1987  8  6
1987  9  3  1987 10  9  1987 11  4  1987 12 10
1988  1  4  1988  2  6  1988  3  4  1988  4  4
1988  5  3  1988  6  5  1988  7  3  1988  8  4
1988  9  5  1988 10  3  1988 11  6  1988 12  3
1989  1  2  1989  2  6  1989  3  1  1989  4  5
1989  5  5  1989  6  4  1989  7  2  1989  8  2
1989  9  2  1989 10  5  1989 11  1  1989 12 10
1990  1  3  1990  2  8  1990  3 12  1990  4  7
1990  5  3  1990  6  2  1990  7  4  1990  8  3
1990  9  0  1990 10  6  1990 11  6  1990 12  6
;
For planning purposes, it is of interest to understand the long- and short-term trends in the maintenance needs of the system. Over the long term, it is suspected that the quality of new components and repair work improves over time, so the number of component removals would tend to decrease from year to year. It is not known whether the robustness of the system is affected by seasonal variations in the operating environment, but this possibility is also of interest.
Because the maintenance record is in the form of counts, the numbers of removals are modeled as realizations of Poisson random variables. Denote by $\lambda_{ij}$ the unobserved component removal rate for year $i$ and month $j$. Since the data were recorded at regular intervals (from a system operating at a constant rate), each $\lambda_{ij}$ is assumed to be a function of year and month only.
A preliminary two-way analysis is performed by using PROC GENMOD to make broad inferences on repair trends. A log link is specified for the model

$$\log(\lambda_{ij}) = \mu + \alpha^{Y}_{i} + \alpha^{M}_{j}$$

where $\mu$ is a grand mean, $\alpha^{Y}_{i}$ is the effect of the $i$th year, and $\alpha^{M}_{j}$ is the effect of the $j$th month.
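As a brief restatement under the standard Poisson log-linear assumptions, the model can be written out in full as follows. The count symbol $y_{ij}$ (removals observed in year $i$, month $j$) is introduced here only for illustration and does not appear in the original text; $\mu$ denotes the grand mean, and $\alpha^{Y}_{i}$ and $\alpha^{M}_{j}$ denote the year and month effects.

```latex
y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad
\Pr(y_{ij} = k) = \frac{e^{-\lambda_{ij}}\,\lambda_{ij}^{k}}{k!}, \qquad
\lambda_{ij} = \exp\!\left(\mu + \alpha^{Y}_{i} + \alpha^{M}_{j}\right)
```

Because the link is logarithmic, the year and month effects act multiplicatively on the removal rate: each effect scales the baseline rate $e^{\mu}$ by the factor $e^{\alpha}$.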
In the following statements, the CLASS statement declares the variables year and month as categorical. The TYPE3 option requests Type III likelihood ratio statistics to test whether there is an overall effect of year, of month, or of both.
title2 'Two-way model';
proc genmod data=equip;
   class year month;
   model removals=year month / dist=Poisson link=log type3;
run;
Output 39.2.1 displays the Type III statistics for the fitted model. The test for year effects yields a p-value of 0.4527, so there is no evidence of a long-term trend in maintenance rates. Apparently, the quality of new or repaired components did not change between 1987 and 1990. However, the test for monthly trends yields a small p-value of 0.0321, indicating that seasonal trends are significant at the $\alpha = 0.05$ level.
Output 39.2.1: PROC GENMOD Listing for Type III Analysis
Analysis of Component Reliability
Two-way model

LR Statistics For Type 3 Analysis

Source   DF   Chi-Square   Pr > ChiSq
year      3         2.63       0.4527
month    11        21.12       0.0321
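As a rough check on how the p-values in the table follow from the chi-square statistics, the upper-tail probabilities can be recomputed with the PROBCHI function in a DATA step. This is only a sketch: the variable names p_year and p_month are illustrative, and because the statistics are entered as rounded in the listing, the results should be close to, but need not exactly match, the reported values.

```sas
data _null_;
   /* Upper-tail chi-square probabilities, using the rounded statistics from the listing */
   p_year  = 1 - probchi(2.63, 3);     /* year:  3 degrees of freedom */
   p_month = 1 - probchi(21.12, 11);   /* month: 11 degrees of freedom */
   put p_year= 6.4 p_month= 6.4;       /* write both values to the SAS log */
run;
```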
If year is dropped from the model, the focus of the analysis shifts to identifying the form of the underlying seasonal trend, which is a task that PROC GAM is especially suited for. PROC GAM will be used to fit both a reduced categorical model, with year eliminated, and a nonparametric spline model. Although PROC GENMOD also has the capability to fit categorical models, as demonstrated earlier, PROC GAM will be used here to fit both models for a better comparison.
The following PROC GAM statements specify the reduced categorical model and write predicted values to a data set. For this part of the analysis, a CLASS statement is again used to specify that month is a categorical variable. In the follow-up, the seasonal effect will be treated as a nonparametric function of month.
title2 'One-way model';
proc gam data=equip;
   class month;
   model removals=param(month) / dist=Poisson;
   output out=est p;
run;
The following statements use the SGPLOT procedure to generate a plot of the estimated seasonal trend. The plot is displayed in Output 39.2.2.
proc sort data=est;
   by month;
run;

proc sgplot data=est;
   title "Predicted Seasonal Trend";
   yaxis label="Number of Removals";
   xaxis integer values=(1 to 12);
   scatter x=Month y=Removals / name="points" legendLabel="Removals";
   series x=Month y=p_Removals / name="line" legendLabel="Predicted Removals" lineattrs=GRAPHFIT;
   discretelegend "points" "line";
run;
Output 39.2.2: Predicted Seasonal Trend from a Parametric Model Fit Using a CLASS Statement
The predicted repair rates shown in Output 39.2.2 form a jagged seasonal pattern. Ignoring the month-to-month fluctuations, which are difficult to explain and can be artifacts of random noise, the general removal rate is high in winter and low in summer.
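One way to see where the jagged pattern comes from is to compare the predicted values with the raw monthly averages. For a Poisson model with a log link and a single categorical predictor, the maximum likelihood fitted means coincide with the group sample means, so, assuming the fit has converged, the predicted removals should track the averages produced by a sketch such as the following.

```sas
proc means data=equip mean maxdec=2;
   class month;      /* one summary row per calendar month */
   var removals;     /* average removals across the four years */
run;
```

Each monthly average is based on only four observations, which is consistent with the noisy appearance of the categorical fit.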
One advantage of nonparametric regression is its ability to highlight general trends in the data, such as those described earlier, and to attribute local fluctuations to unexplained random noise. The nonparametric regression model used by PROC GAM specifies that the underlying removal rates are of the form

$$\log(\lambda_{ij}) = \beta_0 + \beta_1 \mathrm{Month}_j + s(\mathrm{Month}_j)$$

where $\beta_1$ is a linear coefficient and $s(\cdot)$ is a nonparametric regression function. $\beta_1$ and $s(\cdot)$ define the linear and nonparametric parts, respectively, of the seasonal trend.
The following statements request that PROC GAM fit a cubic spline model to the monthly repair data. The output listing is displayed in Output 39.2.3 and Output 39.2.4.
title 'Analysis of Component Reliability';
title2 'Spline model';
proc gam data=equip;
   model removals=spline(month) / dist=Poisson method=gcv;
run;
The METHOD=GCV option requests that the smoothing parameter be chosen by generalized cross validation (GCV), which determines an appropriate level of smoothing from the data.
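For comparison, the same model can be fit without the METHOD=GCV option, in which case PROC GAM uses its default degrees of freedom for the spline term instead of selecting the smoothing level from the data. The following is a sketch of that alternative and is not part of the original analysis; the title text is illustrative.

```sas
title2 'Spline model with default smoothing';
proc gam data=equip;
   /* No METHOD=GCV: the spline uses the procedure's default degrees of freedom */
   model removals=spline(month) / dist=Poisson;
run;
```

Comparing the DF column of the two fit summaries shows how much smoothing the GCV criterion adds or removes relative to the default.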
Output 39.2.3: PROC GAM Listing for Cubic Spline Regression Using the METHOD=GCV Option
Analysis of Component Reliability 
Spline model 
Summary of Input Data Set

Number of Observations           48
Number of Missing Observations   0
Distribution                     Poisson
Link Function                    Log
Output 39.2.4: Model Fit Statistics
Regression Model Analysis
Parameter Estimates

Parameter       Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept                  1.34594          0.14509      9.28     <.0001
Linear(month)              0.02274          0.01893      1.20     0.2362

Smoothing Model Analysis
Fit Summary for Smoothing Components

Component       Smoothing Parameter   DF         GCV        Num Unique Obs
Spline(month)   0.901512              1.879980   0.115848   12

Smoothing Model Analysis
Analysis of Deviance

Source          DF        Sum of Squares   Chi-Square   Pr > ChiSq
Spline(month)   1.87998   8.877764         8.8778       0.0103
Notice in the listing in Output 39.2.4 that the DF value chosen for the nonlinear portion of the spline by minimizing GCV is about 1.88, which is smaller than the default value of 3. This indicates that the spline model of the seasonal trend is relatively simple. As indicated by the "Analysis of Deviance" table, the nonlinear seasonal component is a significant feature of the data: the table lists a p-value of 0.0103 for the hypothesis of no seasonal trend. Note also that the "Parameter Estimates" table lists a p-value of 0.2362 for the hypothesis of no linear factor in the seasonal trend, indicating no significant linear trend.
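The p-value for the spline component is based on a chi-square reference distribution with fractional degrees of freedom. As a rough check, and assuming the rounded values from the listing, it can be recomputed with the PROBCHI function, which accepts noninteger degrees of freedom. The variable name p_spline is illustrative.

```sas
data _null_;
   /* Upper-tail probability for the spline term, with fractional degrees of freedom */
   p_spline = 1 - probchi(8.8778, 1.87998);
   put p_spline= 6.4;   /* write the value to the SAS log */
run;
```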
The following statements use ODS Graphics to plot the smoothing component for the effect of Month on predicted repair rates. The CLM suboption for the PLOTS=COMPONENTS option adds a 95% confidence band to the fit.
ods graphics on;
proc gam data=equip plots=components(clm);
   model removals=spline(month) / dist=Poisson method=gcv;
run;
ods graphics off;
For general information about ODS graphics, see Chapter 21: Statistical Graphics Using ODS. For specific information about the graphics available in the GAM procedure, see the section ODS Graphics. The smoothing component plot is displayed in Output 39.2.5.
In Output 39.2.5, it is apparent that the pattern of repair rates follows the general pattern observed in Output 39.2.2. However, the plot in Output 39.2.5 is much cleaner because the month-to-month fluctuations are smoothed out to reveal the broader seasonal trend.
Output 39.2.5: Estimated Nonparametric Factor of Seasonal Trend, Along with 95% Confidence Bounds
In Output 39.2.1, the small p-value (p = 0.0321) for the hypothesis of no seasonal trend indicates that the data exhibit significant seasonal structure. Output 39.2.5 is a graphical illustration of the seasonality of the number of removals.