The REG Procedure

Reweighting Observations in an Analysis

Reweighting observations is an interactive feature of PROC REG that enables you to change the weights of observations used in computing the regression equation. Observations can also be deleted from the analysis (not from the data set) by changing their weights to zero. In the following statements, the Sashelp.Class data are used to illustrate some of the features of the REWEIGHT statement. First, the full model is fit, and the residuals are displayed in Figure 97.43.

proc reg data=sashelp.Class;
   model Weight=Age Height / p;
   id Name;
run;

Figure 97.43: Full Model for Sashelp.Class Data, Residuals Shown

The REG Procedure
Model: MODEL1
Dependent Variable: Weight

Output Statistics
Obs  Name  Dependent Variable  Predicted Value  Residual
1 Alfred 112.5 124.8686 -12.3686
2 Alice 84.0 78.6273 5.3727
3 Barbara 98.0 110.2812 -12.2812
4 Carol 102.5 102.5670 -0.0670
5 Henry 102.5 105.0849 -2.5849
6 James 83.0 80.2266 2.7734
7 Jane 84.5 89.2191 -4.7191
8 Janet 112.5 102.7663 9.7337
9 Jeffrey 84.0 100.2095 -16.2095
10 John 99.5 86.3415 13.1585
11 Joyce 50.5 57.3660 -6.8660
12 Judy 90.0 107.9625 -17.9625
13 Louise 77.0 76.6295 0.3705
14 Mary 112.0 117.1544 -5.1544
15 Philip 150.0 138.2164 11.7836
16 Robert 128.0 107.2043 20.7957
17 Ronald 133.0 118.9529 14.0471
18 Thomas 85.0 79.6676 5.3324
19 William 112.0 117.1544 -5.1544

Sum of Residuals 0
Sum of Squared Residuals 2120.09974
Predicted Residual SS (PRESS) 3272.72186



Upon examining the data and residuals, you realize that observation 17 (Ronald) was mistakenly included in the analysis. Also, you would like to examine the effect of reweighting to 0.5 those observations with residuals that have absolute values greater than or equal to 17. The following statements show how you request this reweighting:

reweight obs.=17;
reweight r. le -17 or r. ge 17 / weight=0.5;
print p;
run;

At this point, a message appears (in the log) that tells you which observations have been reweighted and what the new weights are. Figure 97.44 is produced.

Figure 97.44: Model with Reweighted Observations

The REG Procedure
Model: MODEL1.2
Dependent Variable: Weight

Output Statistics
Obs  Name  Weight  Dependent Variable  Predicted Value  Residual
1 Alfred 1.0 112.5 121.6250 -9.1250
2 Alice 1.0 84.0 79.9296 4.0704
3 Barbara 1.0 98.0 107.5484 -9.5484
4 Carol 1.0 102.5 102.1663 0.3337
5 Henry 1.0 102.5 104.3632 -1.8632
6 James 1.0 83.0 79.9762 3.0238
7 Jane 1.0 84.5 87.8225 -3.3225
8 Janet 1.0 112.5 103.6889 8.8111
9 Jeffrey 1.0 84.0 98.7606 -14.7606
10 John 1.0 99.5 85.3117 14.1883
11 Joyce 1.0 50.5 58.6811 -8.1811
12 Judy 0.5 90.0 106.8740 -16.8740
13 Louise 1.0 77.0 76.8377 0.1623
14 Mary 1.0 112.0 116.2429 -4.2429
15 Philip 1.0 150.0 135.9688 14.0312
16 Robert 0.5 128.0 103.5150 24.4850
17 Ronald 0.0 133.0 117.8121 15.1879
18 Thomas 1.0 85.0 78.1398 6.8602
19 William 1.0 112.0 116.2429 -4.2429

Sum of Residuals 0
Sum of Squared Residuals 1500.61194
Predicted Residual SS (PRESS) 2287.57621



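The first REWEIGHT statement excludes observation 17 by setting its weight to zero, and the second REWEIGHT statement reweights observations 12 and 16 to 0.5. An important feature to note from this example is that REWEIGHT statements do not cause the model to be refit; the model is not refit until the PRINT statement is processed. This is so that multiple REWEIGHT statements can be accumulated and applied together to a subsequent model. Consequently, both REWEIGHT statements here are evaluated against the residuals of the full model in Figure 97.43, in which only observations 12 (Judy, -17.9625) and 16 (Robert, 20.7957) have residuals with absolute values of at least 17.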
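As a check on how the weights enter the fit, the following is a short worked calculation. It assumes the usual weighted least squares criterion; the symbols w_i (weight), y_i (observed value), and \hat{y}_i (predicted value) are introduced here for illustration and do not appear in the original text. The figure of 1058.488 is the sum of squared printed residuals for the 16 observations in Figure 97.44 that keep a weight of 1.

```latex
% Weighted sum of squared residuals over the 19 observations
\[
  \mathrm{SSE}_w \;=\; \sum_{i=1}^{19} w_i \,\bigl(y_i - \hat{y}_i\bigr)^2
\]

% Plugging in the weights and residuals printed in Figure 97.44
\[
  \underbrace{1058.488}_{\text{16 observations, } w_i = 1}
  \;+\; \underbrace{0.5\,(16.8740)^2}_{\text{Judy: } 142.366}
  \;+\; \underbrace{0.5\,(24.4850)^2}_{\text{Robert: } 299.758}
  \;+\; \underbrace{0\,(15.1879)^2}_{\text{Ronald: } 0}
  \;\approx\; 1500.61
\]
```

This agrees with the Sum of Squared Residuals of 1500.61194 reported in Figure 97.44, up to rounding of the printed residuals. It also shows why a weight of zero removes an observation from the analysis: Ronald still receives a predicted value and a residual, but contributes nothing to the criterion.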

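Note that the model label in Figure 97.44 has changed from MODEL1 to MODEL1.2, since two REWEIGHT statements have been used. In this example, however, the intent is to reweight observations that have large residuals once the mistakenly included observation is out of the model. That observation should therefore be deleted first; then the model should be refit to the remaining observations, and only then should the observations with large residuals be reweighted. To accomplish this, use the REFIT statement. In the following statements, the first REWEIGHT statement sets all weights back to 1, and the REFIT statement causes the model to be refit after observation 17 is excluded, so the last REWEIGHT statement is evaluated against the residuals of the refitted model. The following statements produce Figure 97.45: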

reweight allobs / weight=1.0;
reweight obs.=17;
refit;
reweight r. le -17 or r. ge 17 / weight=.5;
print;
run;

Figure 97.45: Observations Excluded from Analysis, Model Refitted, and Observations Reweighted

The REG Procedure
Model: MODEL1.5
Dependent Variable: Weight

Output Statistics
Obs  Name  Weight  Dependent Variable  Predicted Value  Residual
1 Alfred 1.0 112.5 120.9716 -8.4716
2 Alice 1.0 84.0 79.5342 4.4658
3 Barbara 1.0 98.0 107.0746 -9.0746
4 Carol 1.0 102.5 101.5681 0.9319
5 Henry 1.0 102.5 103.7588 -1.2588
6 James 1.0 83.0 79.7204 3.2796
7 Jane 1.0 84.5 87.5443 -3.0443
8 Janet 1.0 112.5 102.9467 9.5533
9 Jeffrey 1.0 84.0 98.3117 -14.3117
10 John 1.0 99.5 85.0407 14.4593
11 Joyce 1.0 50.5 58.6253 -8.1253
12 Judy 1.0 90.0 106.2625 -16.2625
13 Louise 1.0 77.0 76.5908 0.4092
14 Mary 1.0 112.0 115.4651 -3.4651
15 Philip 1.0 150.0 134.9953 15.0047
16 Robert 0.5 128.0 103.1923 24.8077
17 Ronald 0.0 133.0 117.0299 15.9701
18 Thomas 1.0 85.0 78.0288 6.9712
19 William 1.0 112.0 115.4651 -3.4651

Sum of Residuals 0
Sum of Squared Residuals 1637.81879
Predicted Residual SS (PRESS) 2473.87984



Notice that this results in a slightly different model from the one produced by the previous set of statements: because the residuals are recomputed after observation 17 is excluded, only observation 16 meets the condition and is reweighted to 0.5. Also note that the model label is now MODEL1.5, since five REWEIGHT statements have been used for this model.
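One way to see which observations the residual condition selects after observation 17 is removed is to fit that model in a separate, non-interactive step and list the large residuals. The following is a minimal sketch, not part of the original example; the data set name class_res and the variable names pred and resid are hypothetical, and a WHERE statement is used here in place of a zero weight to drop Ronald from the fit.

```sas
/* Sketch: fit the model without Ronald and save the residuals.       */
/* class_res, pred, and resid are hypothetical names for this sketch. */
proc reg data=sashelp.class noprint;
   where Name ne 'Ronald';               /* exclude observation 17    */
   model Weight = Age Height;
   output out=class_res p=pred r=resid;  /* predicted and residual    */
run;
quit;

/* List the observations that meet the REWEIGHT condition |r| >= 17   */
proc print data=class_res;
   where abs(resid) >= 17;
   var Name Weight pred resid;
run;
```

If the fit behaves as described in this section, only Robert should be listed. Selecting by Name rather than by observation number avoids confusion, because the observations are renumbered in the output data set once Ronald is excluded.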

Another important feature of the REWEIGHT statement is the ability to nullify the effect of the most recent REWEIGHT statement or of all previous REWEIGHT statements. First, assume that you have several REWEIGHT statements in effect and you want to restore the original weights of all the observations. The following REWEIGHT statement accomplishes this and produces Figure 97.46:

reweight allobs / reset;
print;
run;

Figure 97.46: Restoring Weights of All Observations

The REG Procedure
Model: MODEL1.6
Dependent Variable: Weight

Output Statistics
Obs  Name  Dependent Variable  Predicted Value  Residual
1 Alfred 112.5 124.8686 -12.3686
2 Alice 84.0 78.6273 5.3727
3 Barbara 98.0 110.2812 -12.2812
4 Carol 102.5 102.5670 -0.0670
5 Henry 102.5 105.0849 -2.5849
6 James 83.0 80.2266 2.7734
7 Jane 84.5 89.2191 -4.7191
8 Janet 112.5 102.7663 9.7337
9 Jeffrey 84.0 100.2095 -16.2095
10 John 99.5 86.3415 13.1585
11 Joyce 50.5 57.3660 -6.8660
12 Judy 90.0 107.9625 -17.9625
13 Louise 77.0 76.6295 0.3705
14 Mary 112.0 117.1544 -5.1544
15 Philip 150.0 138.2164 11.7836
16 Robert 128.0 107.2043 20.7957
17 Ronald 133.0 118.9529 14.0471
18 Thomas 85.0 79.6676 5.3324
19 William 112.0 117.1544 -5.1544

Sum of Residuals 0
Sum of Squared Residuals 2120.09974
Predicted Residual SS (PRESS) 3272.72186



The resulting model is identical to the original model specified at the beginning of this section. Notice that the model label is now MODEL1.6. Note that the Weight column does not appear, since all observations have been reweighted to have weight=1.

Now suppose you want only to undo the changes made by the most recent REWEIGHT statement. Use REWEIGHT UNDO for this. The following statements produce Figure 97.47:

reweight r. le -12 or r. ge 12 / weight=.75;
reweight r. le -17 or r. ge 17 / weight=.5;
reweight undo;
print;
run;

Figure 97.47: Example of UNDO in REWEIGHT Statement

The REG Procedure
Model: MODEL1.9
Dependent Variable: Weight

Output Statistics
Obs  Name  Weight  Dependent Variable  Predicted Value  Residual
1 Alfred 0.75 112.5 125.1152 -12.6152
2 Alice 1.00 84.0 78.7691 5.2309
3 Barbara 0.75 98.0 110.3236 -12.3236
4 Carol 1.00 102.5 102.8836 -0.3836
5 Henry 1.00 102.5 105.3936 -2.8936
6 James 1.00 83.0 80.1133 2.8867
7 Jane 1.00 84.5 89.0776 -4.5776
8 Janet 1.00 112.5 103.3322 9.1678
9 Jeffrey 0.75 84.0 100.2835 -16.2835
10 John 0.75 99.5 86.2090 13.2910
11 Joyce 1.00 50.5 57.0745 -6.5745
12 Judy 0.75 90.0 108.2622 -18.2622
13 Louise 1.00 77.0 76.5275 0.4725
14 Mary 1.00 112.0 117.6752 -5.6752
15 Philip 1.00 150.0 138.9211 11.0789
16 Robert 0.75 128.0 107.0063 20.9937
17 Ronald 0.75 133.0 119.4681 13.5319
18 Thomas 1.00 85.0 79.3061 5.6939
19 William 1.00 112.0 117.6752 -5.6752

Sum of Residuals 0
Sum of Squared Residuals 1694.87114
Predicted Residual SS (PRESS) 2547.22751



The resulting model reflects changes made only by the first REWEIGHT statement since the third REWEIGHT statement negates the effect of the second REWEIGHT statement. Observations 1, 3, 9, 10, 12, 16, and 17 have their weights changed to 0.75. Note that the label MODEL1.9 reflects the use of nine REWEIGHT statements for the current model.
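The same final weights can also be supplied non-interactively through a weight variable and the WEIGHT statement. The following is a minimal sketch under that assumption; the data set name class_w and the variable name w are hypothetical, and the list of names is taken from the observations shown with a weight of 0.75 in Figure 97.47.

```sas
/* Sketch: build a weight variable that matches Figure 97.47.  */
/* class_w and w are hypothetical names for this sketch.       */
data class_w;
   set sashelp.class;
   /* 0.75 for the seven observations reweighted above, else 1 */
   if Name in ('Alfred', 'Barbara', 'Jeffrey', 'John',
               'Judy', 'Robert', 'Ronald')
      then w = 0.75;
   else w = 1;
run;

/* Fit the weighted model in a single pass */
proc reg data=class_w;
   model Weight = Age Height / p;
   weight w;
   id Name;
run;
quit;
```

This fit should correspond to the model in Figure 97.47, since the weights are the same, but the model label and output layout can differ from the interactive session. The interactive REWEIGHT approach remains more convenient when the weights depend on residuals from an earlier fit.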

Now suppose you want to reset the observations selected by the most recent REWEIGHT statement to their original weights. Use the REWEIGHT statement with the RESET option to do this. The following statements produce Figure 97.48:

reweight r. le -12 or r. ge 12 / weight=.75;
reweight r. le -17 or r. ge 17 / weight=.5;
reweight / reset;
print;
run;

Figure 97.48: REWEIGHT Statement with RESET option

The REG Procedure
Model: MODEL1.12
Dependent Variable: Weight

Output Statistics
Obs  Name  Weight  Dependent Variable  Predicted Value  Residual
1 Alfred 0.75 112.5 126.0076 -13.5076
2 Alice 1.00 84.0 77.8727 6.1273
3 Barbara 0.75 98.0 111.2805 -13.2805
4 Carol 1.00 102.5 102.4703 0.0297
5 Henry 1.00 102.5 105.1278 -2.6278
6 James 1.00 83.0 80.2290 2.7710
7 Jane 1.00 84.5 89.7199 -5.2199
8 Janet 1.00 112.5 102.0122 10.4878
9 Jeffrey 0.75 84.0 100.6507 -16.6507
10 John 0.75 99.5 86.6828 12.8172
11 Joyce 1.00 50.5 56.7703 -6.2703
12 Judy 1.00 90.0 108.1649 -18.1649
13 Louise 1.00 77.0 76.4327 0.5673
14 Mary 1.00 112.0 117.1975 -5.1975
15 Philip 1.00 150.0 138.7581 11.2419
16 Robert 1.00 128.0 108.7016 19.2984
17 Ronald 0.75 133.0 119.0957 13.9043
18 Thomas 1.00 85.0 80.3076 4.6924
19 William 1.00 112.0 117.1975 -5.1975

Sum of Residuals 0
Sum of Squared Residuals 1879.08980
Predicted Residual SS (PRESS) 2959.57279



Note that observations that meet the condition of the second REWEIGHT statement (residuals with an absolute value greater than or equal to 17) now have their weights reset to the original value of 1. Observations 1, 3, 9, 10, and 17 keep their weights of 0.75, whereas observations 12 and 16 return to a weight of 1.

Notice how the last three examples show three ways to change weights back to a previous value. In the first example, ALLOBS and the RESET option are used to change the weights of all observations back to their original values. In the second example, the UNDO option is used to negate the effect of the most recent REWEIGHT statement, so the observations that it selected return to the weights they had before that statement (here, the 0.75 assigned by the earlier REWEIGHT statement). In the third example, the RESET option is used to change the weights of the observations selected by the most recent REWEIGHT statement back to their original values. Finally, note that the label MODEL1.12 indicates that 12 REWEIGHT statements have been applied to the original model.
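For reference, the statements in this section can be read as one interactive session. The following sketch gathers them in order, with comments that tie each RUN group to its figure and model label. The closing QUIT statement is an addition that ends the interactive procedure; it is not shown in the individual steps above.

```sas
/* Consolidated sketch of the interactive session in this section */
proc reg data=sashelp.class;
   model Weight = Age Height / p;
   id Name;
run;                                   /* MODEL1, Figure 97.43    */

   /* Exclude obs 17; halve weights where |residual| >= 17        */
   reweight obs.=17;
   reweight r. le -17 or r. ge 17 / weight=0.5;
   print p;
run;                                   /* MODEL1.2, Figure 97.44  */

   /* Restore unit weights, exclude obs 17, refit, then reweight  */
   /* against the residuals of the refitted model                 */
   reweight allobs / weight=1.0;
   reweight obs.=17;
   refit;
   reweight r. le -17 or r. ge 17 / weight=.5;
   print;
run;                                   /* MODEL1.5, Figure 97.45  */

   /* Restore the original weights of all observations            */
   reweight allobs / reset;
   print;
run;                                   /* MODEL1.6, Figure 97.46  */

   /* UNDO negates only the most recent REWEIGHT statement        */
   reweight r. le -12 or r. ge 12 / weight=.75;
   reweight r. le -17 or r. ge 17 / weight=.5;
   reweight undo;
   print;
run;                                   /* MODEL1.9, Figure 97.47  */

   /* RESET returns the most recently selected observations to 1  */
   reweight r. le -12 or r. ge 12 / weight=.75;
   reweight r. le -17 or r. ge 17 / weight=.5;
   reweight / reset;
   print;
run;                                   /* MODEL1.12, Figure 97.48 */
quit;
```

The number after the period in each model label counts every REWEIGHT statement issued so far, including those that undo or reset earlier ones.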