 The REG Procedure
 Line Printer Scatter Plot Features

This section discusses the special options available with line printer scatter plots. Detailed examples of traditional graphics and options are given in the section Traditional Graphics.

### Producing Scatter Plots

The interactive PLOT statement available in PROC REG enables you to look at scatter plots of data and diagnostic statistics. These plots can help you to evaluate the model and detect outliers in your data. Several options enable you to place multiple plots on a single page, superimpose plots, and collect plots to be overlaid by later plots. The PAINT statement can be used to highlight points on a plot. See the section Painting Scatter Plots for more information about painting.

The Class data set introduced in the section Simple Linear Regression is used in the following examples.

You can superimpose several plots with the OVERLAY option. With the following statements, a plot of Weight against Height is overlaid with plots of the predicted values and the 95% prediction intervals. The model on which the statistics are based is the full model including Height and Age. These statements produce the plot in Figure 74.34:

```
proc reg data=Class lineprinter;
   model Weight=Height Age / noprint;
   plot (ucl. lcl. p.)*Height='-' Weight*Height
        / overlay symbol='o';
run;
```

Figure 74.34 Scatter Plot Showing Data, Predicted Values, and Confidence Limits
The REG Procedure
Model: MODEL1
Dependent Variable: Weight

[Line printer plot: Weight (symbol 'o') overlaid with the predicted values and the upper and lower 95% limits (symbol '-'), plotted against Height. Horizontal axis: Height, 50 to 72. Vertical axis: 0 to 200, labeled "Upper Bound of 95% C.I." (U95).]

In this plot, the data values are marked with the symbol 'o' and the predicted values and prediction interval limits are labeled with the symbol '-'. The plot is scaled to accommodate the points from all plots. This is an important difference from the COLLECT option, which does not rescale plots after the first plot or plots are collected. You could separate the overlaid plots by using the following statements:

```
plot;
run;
```

This places each of the four plots on a separate page, while the statements

```
plot / overlay;
run;
```

repeat the previous overlaid plot. In general, the statement

```
plot;
```

is equivalent to respecifying the most recent PLOT statement without any options. However, the COLLECT, HPLOTS=, SYMBOL=, and VPLOTS= options apply across PLOT statements and remain in effect.
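The persistence of these options can be illustrated with a minimal sketch. It is written as a standalone session, separate from the running example, and assumes the Class data set is available; the particular plot requests and the symbol 'x' are chosen only for illustration.

```sas
proc reg data=Class lineprinter;
   model Weight=Height Age / noprint;

   /* HPLOTS=2 and SYMBOL='x' are specified once here */
   plot r.*p. student.*p. / hplots=2 symbol='x';
   run;

   /* No options are given in this statement. Because HPLOTS= and    */
   /* SYMBOL= remain in effect across PLOT statements, these two     */
   /* plots should again appear side by side and use the symbol 'x'. */
   plot r.*Height r.*Age;
   run;
quit;
```

To change a persistent setting, specify the option again with a new value in a later PLOT statement.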

The next example shows how you can overlay plots of statistics before and after a change in the model. For the full model involving Height and Age, the ordinary residuals and the studentized residuals are plotted against the predicted values. The COLLECT option causes these plots to be collected or retained for redisplay later. The option HPLOTS=2 enables the two plots to appear side by side on one page. The symbol 'f' is used on these plots to identify them as resulting from the full model. These statements produce Figure 74.35:

```
plot r.*p. student.*p. / collect hplots=2 symbol='f';
run;
```

Figure 74.35 Collecting Residual Plots for the Full Model
The REG Procedure
Model: MODEL1

[Line printer plots, side by side, all points marked 'f'. Left: RESIDUAL (-20 to 20) against PRED (40 to 140). Right: STUDENT (-2 to 3) against PRED (40 to 140).]

Note that these plots are not overlaid. The COLLECT option does not overlay the plots in one PLOT statement but retains them so that they can be overlaid by later plots. When the COLLECT option appears in a PLOT statement, the plots in that statement become the first plots in the collection.

Next, the model is reduced by deleting the Age variable. The PLOT statement requests the same plots as before but labels the points with the symbol 'r' denoting the reduced model. The following statements produce Figure 74.36:

```
delete Age;
plot r.*p. student.*p. / symbol='r';
run;
```

Figure 74.36 Overlaid Residual Plots for Full and Reduced Models
The REG Procedure
Model: MODEL1.1

[Line printer plots, side by side, with points marked 'f', 'r', or '?'. Left: RESIDUAL (-20 to 20) against PRED (40 to 140). Right: STUDENT (-2 to 3) against PRED (40 to 140).]

Notice that the COLLECT option causes the corresponding plots to be overlaid. Also notice that the DELETE statement causes the model label to be changed from MODEL1 to MODEL1.1. The points labeled 'f' are from the full model, and the points labeled 'r' are from the reduced model. Positions labeled '?' contain at least one point from each model. In this example, the OVERLAY option cannot be used because the plots to be overlaid cannot all be specified in one PLOT statement. With the COLLECT option, any changes to the model or the data used to fit the model do not affect plots collected before the changes. Collected plots are always reproduced exactly as they first appear. (Similarly, a PAINT statement does not affect plots collected before the PAINT statement is issued.)

The previous example overlays the residual plots for two different models. You might prefer to see them side by side on the same page. This can also be done with the COLLECT option by using a blank plot. Continuing from the last example, the COLLECT, HPLOTS=2, and SYMBOL='r' options are still in effect. In the following PLOT statement, the CLEAR option deletes the collected plots and enables the specified plot to begin a new collection. The plot created is the residual plot for the reduced model. These statements produce Figure 74.37:

```
plot r.*p. / clear;
run;
```

Figure 74.37 Residual Plot for Reduced Model Only
The REG Procedure
Model: MODEL1.1

[Line printer plot, all points marked 'r': RESIDUAL (-20 to 20) against PRED (40 to 140).]

The next statements add the variable Age to the model and place the residual plot for the full model next to the plot for the reduced model. Notice that a blank plot is created in the first plot request by placing nothing between the quotation marks. Because the COLLECT option is in effect, this blank plot is superimposed on the residual plot for the reduced model, which leaves that plot unchanged. The residual plot for the full model is created by the second request. The result is the desired side-by-side plots. The NOCOLLECT option turns off the collection process after the specified plots are added and displayed. Any PLOT statements that follow show only the newly specified plots. These statements produce Figure 74.38:

```
add Age;
plot r.*p.='' r.*p.='f' / nocollect;
run;
```

Figure 74.38 Side-by-Side Residual Plots for the Full and Reduced Models

Frequently, when the COLLECT option is in effect, you want the current and following PLOT statements to show only the specified plots. To do this, use both the CLEAR and NOCOLLECT options in the current PLOT statement.
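A minimal sketch of that combination follows. It is a standalone session that assumes the Class data set is available; the specific plot requests are chosen only for illustration.

```sas
proc reg data=Class lineprinter;
   model Weight=Height Age / noprint;

   /* Start a collection with one residual plot */
   plot r.*p. / collect symbol='f';
   run;

   /* CLEAR discards the collected plot, so only this plot is shown. */
   /* NOCOLLECT ends the collection process after this statement.    */
   plot student.*p. / clear nocollect;
   run;

   /* With collection turned off, this plot should appear alone */
   plot r.*Height;
   run;
quit;
```

Note that SYMBOL='f' is not reset by CLEAR or NOCOLLECT, so it should still apply to the later plots unless a new SYMBOL= value is specified.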

### Painting Scatter Plots

Painting scatter plots is a useful interactive tool that enables you to mark points of interest in scatter plots. Painting can be used to identify extreme points in scatter plots or to reveal the relationship between two scatter plots. The Class data (from the section Simple Linear Regression) is used to illustrate some of these applications.

The following statements produce the scatter plot of the studentized residuals against the predicted values in Figure 74.39.

```
proc reg data=Class lineprinter;
   model Weight=Age Height / noprint;
   plot student.*p.;
run;
```

Figure 74.39 Plotting Studentized Residuals against Predicted Values
The REG Procedure
Model: MODEL1
Dependent Variable: Weight

[Line printer plot: Studentized Residual (STUDENT, -2 to 4) against Predicted Value of Weight (PRED, 50 to 140). Each position shows the number of observations at that position ('1' or '2').]

Then, the following statements identify the observation 'Henry' in the scatter plot and produce the plot in Figure 74.40:

```
paint Name='Henry' / symbol = 'H';
plot;
run;
```

Figure 74.40 Painting One Observation
The REG Procedure
Model: MODEL1
Dependent Variable: Weight

[Line printer plot: Studentized Residual (STUDENT, -2 to 4) against Predicted Value of Weight (PRED, 50 to 140). Same plot as Figure 74.39, with one observation now marked 'H'.]

Next, the following statements identify observations whose studentized residuals are at least 2 in absolute value:

```
paint student.>=2 or student.<=-2 / symbol='s';
plot;
run;
```

The log shows the observation numbers found with these conditions and gives the painting symbol and the number of observations found. Note that the previous PAINT statement remains in effect, so the observation painted 'H' still appears in the new plot. Figure 74.41 shows the scatter plot produced by the preceding statements.

Figure 74.41 Painting Several Observations
The REG Procedure
Model: MODEL1
Dependent Variable: Weight

[Line printer plot: Studentized Residual (STUDENT, -2 to 4) against Predicted Value of Weight (PRED, 50 to 140). Same plot as Figure 74.40, with the observation above 2 now marked 's' and one observation still marked 'H'.]
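Painted symbols accumulate across PAINT statements, so it can help to know how to inspect or remove them. The following is a hedged sketch that uses the STATUS, UNDO, ALLOBS, and RESET keywords of the PAINT statement, which are not demonstrated elsewhere in this section; check the PAINT statement syntax in your SAS/STAT release before relying on them. It is written as a standalone session that assumes the Class data set is available.

```sas
proc reg data=Class lineprinter;
   model Weight=Age Height / noprint;

   paint Name='Henry' / symbol='H';
   paint student.>=2 or student.<=-2 / symbol='s';

   /* Undo the most recent PAINT statement (the 's' painting) */
   paint undo;

   /* List the currently painted observations and symbols in the log */
   paint status;

   plot student.*p.;
   run;

   /* Return every observation to the default plotting symbol */
   paint allobs / reset;
   plot;
   run;
quit;
```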

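The following statements use painting to relate two different scatter plots. Observations are painted according to the value of their studentized residuals, and the same symbols then appear in both the plot of studentized residuals against predicted values and the plot of Cook's D against leverage. These statements produce the plot in Figure 74.42.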

```
paint student.>=1 / symbol='p';
paint student.<1 and student.>-1 / symbol='s';
paint student.<=-1 / symbol='n';
plot student. * p. cookd. * h. / hplots=2;
run;
```

Figure 74.42 Painting Observations on More Than One Plot
The REG Procedure
Model: MODEL1

[Line printer plots, side by side, with points marked 'p', 's', or 'n'. Left: STUDENT (-2 to 3) against PRED (40 to 140). Right: COOKD (0.0 to 0.8) against H (0.05 to 0.35).]
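When reading the two panels together, it can help to recall how the plotted statistics are related. The following summary uses the standard regression definitions, with notation introduced here rather than taken from this section: $e_i$ is the ordinary residual, $h_i$ is the leverage (the statistic H), $s^2$ is the mean squared error, and $p$ is the number of parameters in the model, including the intercept.

$$
\mathrm{STUDENT}_i = \frac{e_i}{s\sqrt{1-h_i}},
\qquad
\mathrm{COOKD}_i = \frac{1}{p}\,\mathrm{STUDENT}_i^{2}\,\frac{h_i}{1-h_i}
$$

Under these definitions, an observation needs both a sizable studentized residual and high leverage to have a large Cook's D. Painting by studentized residual therefore shows, in the COOKD-by-H panel, which of the large residuals also belong to influential observations.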