To illustrate the different types of plot statements, consider the
following template. In this template, named MODELFIT, a SCATTERPLOT
is overlaid with a REGRESSIONPLOT. The REGRESSIONPLOT is a computed
plot because it takes the input columns (HEIGHT and WEIGHT) and
transforms them into two new columns that correspond to points on
the requested fit line. By default, a linear regression (DEGREE=1)
is performed, and the other statistical options also take their
default values. The model in this case is WEIGHT=HEIGHT, which the
plot statement specifies with X=HEIGHT (independent variable) and
Y=WEIGHT (dependent variable). By default, about 200 points are
generated for the fit line.
Note: Plot statements must be used within layout statements. To keep
the discussion simple, we continue to use the most basic layout
statement, LAYOUT OVERLAY. This layout statement acts as a single
container for all plot statements placed within it. Plots are drawn
in the order in which their statements are specified, each on top of
the previous one, so the last plot specified is drawn on top.
proc template;
  define statgraph modelfit;
    begingraph;
      entrytitle "Regression Fit Plot";
      layout overlay;
        scatterplot x=height y=weight / primary=true;
        regressionplot x=height y=weight;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=modelfit;
run;
The REGRESSIONPLOT statement can also generate sets of points for the
upper and lower confidence limits of the mean (CLM), and for the
upper and lower confidence limits of individual predicted values
(CLI) for each observation. The CLM="name" and CLI="name" options
request these extra computations. However, the regression plot itself
does not display the confidence limits. Instead, you must use the
dependent plot statement MODELBAND, with the unique name as its
required argument. Notice in the following layout block that the
MODELBAND statement appears first, which ensures that the band is
drawn behind the scatter points and fit line. A MODELBAND statement
must be used in conjunction with a REGRESSIONPLOT, LOESSPLOT, or
PBSPLINEPLOT statement.
layout overlay;
  modelband "myclm";
  scatterplot x=height y=weight / primary=true;
  regressionplot x=height y=weight /
    alpha=.01 clm="myclm";
endlayout;
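The CLI= option can be combined with CLM= in the same way. The following is a minimal sketch, not taken from the original example: the template name MODELFIT2 and the band names "mycli" and "myclm" are arbitrary choices made for illustration, and drawing the CLI band as an outline only is one possible styling decision. The CLI band is listed first on the assumption that it is the wider of the two and should sit at the back.

```sas
/* Sketch only: MODELFIT2 and the band names are hypothetical. */
proc template;
  define statgraph modelfit2;
    begingraph;
      entrytitle "Regression Fit Plot with CLM and CLI Bands";
      layout overlay;
        /* Bands come first so that they are drawn behind the data. */
        /* CLI band shown as an outline only (a styling assumption). */
        modelband "mycli" / display=(outline);
        modelband "myclm";
        scatterplot x=height y=weight / primary=true;
        /* Each band name must match a MODELBAND argument above. */
        regressionplot x=height y=weight /
          alpha=.01 clm="myclm" cli="mycli";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=modelfit2;
run;
```

Because each MODELBAND refers to its limits by name, the order of the two MODELBAND statements controls only the drawing order, not which limits are computed.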
This is certainly the easiest way to construct this type of plot.
However, you might want to construct a similar plot from an analysis
by a statistical procedure that offers many more options for
controlling the fit. Most procedures create output data sets that can
be used directly to create the plot you want. The following example
uses non-computed, stand-alone plots to build the fit plot. First,
choose a procedure to perform the analysis.
proc reg data=sashelp.class noprint;
  model weight=height / alpha=.01;
  output out=predict predicted=p lclm=lclm uclm=uclm;
run;
quit;
The output data set, PREDICT, contains all the variables and
observations in SASHELP.CLASS plus, for each observation, the
computed variables P, LCLM, and UCLM.
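Before plotting, it can help to inspect PREDICT and put it in X order. The following is a minimal sketch under two assumptions that are not part of the original example: that sorting by HEIGHT is a sensible precaution for plots that join points in data order (for a straight-line fit the unsorted result may well look the same), and that five observations are enough for a quick check.

```sas
/* Sketch only: the sort and the preview are optional additions. */
/* Sort in place so that the fit line and band are traced        */
/* from left to right along the X axis.                          */
proc sort data=predict;
  by height;
run;

/* Preview the first few observations and the computed columns. */
proc print data=predict(obs=5);
  var name height weight p lclm uclm;
run;
```

The data set keeps the name PREDICT after sorting, so any later step that reads it needs no change.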
Now the template can use simple, non-computed SERIESPLOT and BANDPLOT
statements to present the fit line and confidence band.
proc template;
  define statgraph fit;
    begingraph;
      entrytitle "Regression Fit Plot";
      layout overlay;
        bandplot x=height limitupper=uclm limitlower=lclm /
          fillattrs=GraphConfidence;
        scatterplot x=height y=weight / primary=true;
        seriesplot x=height y=p / lineattrs=GraphFit;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=predict template=fit;
run;