Robust Regression Examples

Example 9.3: LMS and LTS Univariate (Location) Problem: Barnett and Lewis Data

If you do not specify the matrix x as the last input argument, the regression problem reduces to estimating the location parameter a. The following example is described in Rousseeuw and Leroy (1987, p. 175):

  
    print "*** Barnett and Lewis (1978) ***";
    b = { 3, 4, 7, 8, 10, 949, 951 };

    optn = j(9,1,.);
    optn[2]= 3;    /* ipri */
    optn[3]= 3;    /* ilsq */
    optn[8]= 3;    /* icov */

    call lms(sc,coef,wgt,optn,b);
 

Output 9.3.1 shows the results of the unweighted LS regression.

Output 9.3.1: Table of Unweighted LS Regression
Robust Estimation of Location and Scale

LMS: The 4th ordered squared residual will be minimized.

Unweighted Least-Squares Estimation

Median = 8        MAD ( * 1.4826) = 5.930408874

Mean = 276        Standard Deviation = 460.43602523

LS Residuals
N     Observed       Residual      Res / S
1     3.000000    -273.000000    -0.592916
2     4.000000    -272.000000    -0.590744
3     7.000000    -269.000000    -0.584229
4     8.000000    -268.000000    -0.582057
5    10.000000    -266.000000    -0.577713
6   949.000000     673.000000     1.461658
7   951.000000     675.000000     1.466002

Distribution of Residuals

MinRes    1st Qu.    Median    Mean    3rd Qu.    MaxRes
  -273       -272      -268       0       -266       675
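The summary statistics in Output 9.3.1 can be cross-checked by hand. The following is a minimal sketch, not part of the original example; it assumes a SAS/IML release that provides the MEAN, MEDIAN, and STD functions, and the variable names (med, nmad, avg, sd) are illustrative only.

```sas
proc iml;
   /* Barnett and Lewis data, as in the example */
   b = { 3, 4, 7, 8, 10, 949, 951 };

   med  = median(b);                       /* sample median                    */
   nmad = 1.4826 * median(abs(b - med));   /* MAD rescaled by the factor 1.4826 */
   avg  = mean(b);                         /* LS estimate of location          */
   sd   = std(b);                          /* standard deviation, divisor n-1  */

   print med nmad avg sd;
quit;
```

The printed values should agree with the Median, MAD, Mean, and Standard Deviation lines of Output 9.3.1, which makes the effect of the two large observations on the mean and standard deviation easy to see.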



Output 9.3.2 shows the results for LMS regression.

Output 9.3.2: Table of LMS Results
Least Median of Squares (LMS) Method

Minimizing 4th Ordered Squared Residual.

Highest Possible Breakdown Value = 57.14 %

LMS Objective Function = 2.5

LMS Location = 5.5

Preliminary LMS Scale = 5.4137257125

Final LMS Scale = 3.0516389039

LMS Residuals
N     Observed      Residual       Res / S
1     3.000000     -2.500000     -0.819232
2     4.000000     -1.500000     -0.491539
3     7.000000      1.500000      0.491539
4     8.000000      2.500000      0.819232
5    10.000000      4.500000      1.474617
6   949.000000    943.500000    309.178127
7   951.000000    945.500000    309.833512

Distribution of Residuals

MinRes    1st Qu.    Median     Mean    3rd Qu.    MaxRes
  -2.5       -1.5       2.5    270.5        4.5     945.5



You obtain the LMS location estimate 5.5, compared with the mean 276 (which is the LS estimate of the location parameter) and the median 8. The scale estimate in the univariate problem is a resistant (high breakdown) estimator of the dispersion of the data (see Rousseeuw and Leroy 1987, p. 178).
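In the univariate case, the LMS location is the midpoint of the shortest interval that covers h of the ordered observations, and the objective value is half the length of that interval. The sketch below illustrates this for the Barnett and Lewis data. It is not the algorithm that the LMS call uses internally; h = 4 is taken from Output 9.3.2 rather than computed, and the names lo, hi, len, lmsLoc, and lmsObj are hypothetical.

```sas
proc iml;
   b = { 3, 4, 7, 8, 10, 949, 951 };
   call sort(b);                 /* order the observations                 */
   n = nrow(b);
   h = 4;                        /* assumption: value reported in the output */

   /* end points of every window of h consecutive order statistics */
   lo  = b[1:(n-h+1), 1];
   hi  = b[h:n, 1];
   len = hi - lo;                /* window lengths                         */

   k      = len[>:<];            /* index of the shortest window           */
   lmsLoc = (lo[k] + hi[k]) / 2; /* midpoint of the shortest half          */
   lmsObj = len[k] / 2;          /* sqrt of the h-th ordered squared residual */

   print lmsLoc lmsObj;
quit;
```

For these data the shortest window runs from 3 to 8, so the sketch should give the location 5.5 and the objective value 2.5 shown in Output 9.3.2.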

For weighted LS regression, the last two observations are ignored (that is, given zero weights), as shown in Output 9.3.3.

Output 9.3.3: Table of Weighted LS Regression
Weighted Least-Squares Estimation

Weighted Mean = 6.4

Weighted Standard Deviation = 2.8809720582

There are 5 points with nonzero weight.

Average Weight = 0.7142857143

Weighted LS Residuals
N     Observed      Residual       Res / S      Weight
1     3.000000     -3.400000     -1.180157    1.000000
2     4.000000     -2.400000     -0.833052    1.000000
3     7.000000      0.600000      0.208263    1.000000
4     8.000000      1.600000      0.555368    1.000000
5    10.000000      3.600000      1.249578    1.000000
6   949.000000    942.600000    327.181236           0
7   951.000000    944.600000    327.875447           0

Distribution of Residuals

MinRes    1st Qu.    Median     Mean    3rd Qu.    MaxRes
  -3.4       -2.4       1.6    269.6        3.6     944.6
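The weighted estimates in Output 9.3.3 follow directly from the 0/1 weights. The following sketch recomputes them by hand; the weight vector is copied from the Weight column of the output, and the divisor (sum of weights minus 1) is an assumption chosen because it reproduces the reported standard deviation. The variable names are illustrative.

```sas
proc iml;
   b = { 3, 4, 7, 8, 10, 949, 951 };
   w = { 1, 1, 1, 1, 1, 0, 0 };            /* weights from Output 9.3.3 */

   sumw  = sum(w);                          /* number of nonzero weights */
   wmean = sum(w # b) / sumw;               /* weighted mean             */
   /* assumed divisor: sum of weights minus one estimated parameter */
   wstd  = sqrt( sum(w # (b - wmean)##2) / (sumw - 1) );
   avgw  = sumw / nrow(b);                  /* average weight            */

   print wmean wstd avgw;
quit;
```

Because the two outlying observations receive zero weight, the weighted mean is simply the mean of the first five observations.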



Use the following code to obtain results from LTS:

  
    optn = j(9,1,.);
    optn[2]= 3;    /* ipri */
    optn[3]= 3;    /* ilsq */
    optn[8]= 3;    /* icov */

    call lts(sc,coef,wgt,optn,b);
 
The results for LTS are similar to those reported for LMS in Rousseeuw and Leroy (1987), as shown in Output 9.3.4.

Output 9.3.4: Table of LTS Results
Least Trimmed Squares (LTS) Method

Minimizing Sum of 4 Smallest Squared Residuals.

Highest Possible Breakdown Value = 57.14 %

LTS Objective Function = 2.0615528128

LTS Location = 5.5

Preliminary LTS Scale = 4.7050421234

Final LTS Scale = 3.0516389039

LTS Residuals
N     Observed      Residual       Res / S
1     3.000000     -2.500000     -0.819232
2     4.000000     -1.500000     -0.491539
3     7.000000      1.500000      0.491539
4     8.000000      2.500000      0.819232
5    10.000000      4.500000      1.474617
6   949.000000    943.500000    309.178127
7   951.000000    945.500000    309.833512

Distribution of Residuals

MinRes    1st Qu.    Median     Mean    3rd Qu.    MaxRes
  -2.5       -1.5       2.5    270.5        4.5     945.5
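In the univariate case, the LTS location is the mean of the h consecutive order statistics with the smallest sum of squared deviations. The sketch below searches all such subsets for the Barnett and Lewis data. It is an illustration only, not the method used by the LTS call; h = 4 is taken from Output 9.3.4, the scaling sqrt(SS/h) of the objective is an assumption chosen because it reproduces the reported value, and all variable names are hypothetical.

```sas
proc iml;
   b = { 3, 4, 7, 8, 10, 949, 951 };
   call sort(b);                       /* order the observations            */
   n = nrow(b);
   h = 4;                              /* assumption: value from the output */

   m  = j(n-h+1, 1, .);                /* subset means                      */
   ss = j(n-h+1, 1, .);                /* subset sums of squared deviations */
   do k = 1 to n-h+1;
      sub   = b[k:(k+h-1), 1];         /* h consecutive order statistics    */
      m[k]  = sum(sub) / h;            /* LS location of this subset        */
      ss[k] = ssq(sub - m[k]);         /* trimmed sum of squares            */
   end;

   kBest  = ss[>:<];                   /* subset with the smallest SS       */
   ltsLoc = m[kBest];
   ltsObj = sqrt(ss[kBest] / h);       /* assumed scaling of the objective  */

   print ltsLoc ltsObj;
quit;
```

The best subset is {3, 4, 7, 8}, whose mean 5.5 coincides with the LMS midpoint for these data; that is why the residuals in Output 9.3.4 match those in Output 9.3.2.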



Since nonzero weights are chosen for the same observations as with LMS, the WLS results based on LTS agree with those based on LMS (shown previously in Output 9.3.3).

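In summary, you obtain the following estimates for the location parameter:

LS estimate (unweighted mean) = 276

Median = 8

LMS estimate = 5.5

LTS estimate = 5.5

WLS estimate (weighted mean based on LMS or LTS) = 6.4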
