Tutorial: A Module for Linear Regression
You can create some simple plots by using the PGRAF subroutine. The PGRAF subroutine produces scatter plots suitable for printing on a line printer. If you want to produce better-quality graphics that include color, you can use the graphics capabilities of IML (see Chapter 12 for more information).
Here is how you can plot the residuals against x.
First, create a matrix containing the pairs of points by
concatenating X1 with RESID,
using the horizontal concatenation operator (||):
> xy=x1||resid;
XY 5 rows 2 cols (numeric)
1 -0.2
2 1
3 -1.8
4 1.4
5 -0.4
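Next, use a CALL statement to call the PGRAF
subroutine to produce the desired plot.
The arguments to PGRAF are, in order: the matrix of point pairs,
the plotting symbol, the label for the x axis, the label for the y axis,
and the title of the plot: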
> call pgraf(xy,'r','x','Residuals','Plot of Residuals');
[Line-printer plot "Plot of Residuals": vertical axis "Residuals" from -2 to 2, horizontal axis "x" from 1.0 to 5.0, points marked "r".]
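The argument order can be made explicit by assigning each argument to a variable before the call. The following is a minimal sketch, assuming XY from the preceding step is still defined in the session. The names SYM, XLAB, YLAB, and TTL are illustrative only and do not appear elsewhere in this tutorial.

```sas
/* Assumes XY (the 5 x 2 matrix built above) already exists.           */
/* SYM, XLAB, YLAB, and TTL are hypothetical names used for this sketch. */
sym  = 'r';                   /* plotting symbol                 */
xlab = 'x';                   /* label for the horizontal axis   */
ylab = 'Residuals';           /* label for the vertical axis     */
ttl  = 'Plot of Residuals';   /* title printed above the plot    */
call pgraf(xy, sym, xlab, ylab, ttl);
```

This call should produce the same plot as the call with literal arguments.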
You can also plot the predicted values (YHAT) against x.
You must first create a matrix, say XYH, containing the points.
Do this by concatenating X1 with YHAT.
Next, call the PGRAF subroutine to plot the points. You can
perform these operations by using the following statements:
> xyh=x1||yhat;
XYH 5 rows 2 cols (numeric)
1 1.2
2 4
3 10.8
4 21.6
5 36.4
> call pgraf(xyh,'*','x','Predicted','Plot of Predicted Values');
[Line-printer plot "Plot of Predicted Values": vertical axis "Predicted" from 0 to 40, horizontal axis "x" from 1.0 to 5.0, points marked "*".]
You can get a more detailed plot, denoting the observed
values with a "y" and the predicted values
with a "p", by using the following statements.
Create a matrix NEWXY containing
the pairs of points to overlay.
You need to use both the horizontal concatenation
operator (||) and the vertical concatenation operator (//).
The NROW function returns the number of observations,
that is, the number of rows of X1.
The matrix LABEL contains the character label
for each point, plotting a "y" for each observed
point and a "p" for each predicted point.
> newxy=(x1//x1)||(y//yhat);
NEWXY 10 rows 2 cols (numeric)
1 1
2 5
3 9
4 23
5 36
1 1.2
2 4
3 10.8
4 21.6
5 36.4
> n=nrow(x1);
N 1 row 1 col (numeric)
5
> label=repeat('y',n,1)//repeat('p',n,1);
LABEL 10 rows 1 col (character, size 1)
y
y
y
y
y
p
p
p
p
p
> call pgraf(newxy,label,'x','y','Scatter Plot with Regression Line' );
[Line-printer plot "Scatter Plot with Regression Line": vertical axis "y" from 0 to 40, horizontal axis "x" from 1.0 to 5.0, observed points marked "y" and predicted points marked "p".]
As you can see, the observed and predicted
values are too close together to be distinguished
at every value of x.
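Because each print position in a line-printer plot holds a single character, nearby points can hide one another. One way to compare the values directly is to print them side by side. The following is a minimal sketch, assuming X1, Y, YHAT, and RESID from the earlier steps are still defined in the session:

```sas
/* Assumes X1, Y, YHAT, and RESID were created earlier in this session. */
/* Lists x, the observed and predicted values, and the residuals        */
/* in adjacent columns, one row per observation.                        */
print x1 y yhat resid;
```

A listing like this complements the overlay plot when the plotted points coincide.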
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.