Example 12.2 Comparison of LMS, V7 LTS, and FAST-LTS
The following example compares LMS, V7 LTS, and FAST-LTS. The data analyzed are the stackloss data of Brownlee (1965), which are also used for documenting the L1 regression module. The three explanatory variables correspond to measurements for a plant oxidizing ammonia to nitric acid on 21 consecutive days: air flow to the plant, cooling water inlet temperature, and acid concentration.
The response variable gives the permillage of ammonia lost (stackloss). The following data are also given in Rousseeuw and Leroy (1987) and Osborne (1985):
print "Stackloss Data";
aa = { 1 80 27 89 42,
1 80 27 88 37,
1 75 25 90 37,
1 62 24 87 28,
1 62 22 87 18,
1 62 23 87 18,
1 62 24 93 19,
1 62 24 93 20,
1 58 23 87 15,
1 58 18 80 14,
1 58 18 89 14,
1 58 17 88 13,
1 58 18 82 11,
1 58 19 93 12,
1 50 18 89 8,
1 50 18 86 7,
1 50 19 72 8,
1 50 19 79 8,
1 50 20 80 9,
1 56 20 82 15,
1 70 20 91 15 };
Rousseeuw and Leroy (1987) cite a large number of papers in which the preceding data set was analyzed. They state that most researchers "concluded that observations 1, 3, 4, and 21 were outliers" and that some people also reported observation 2 as an outlier.
Consider 2,000 Random Subsets for LMS
For N = 21 and n = 4 (three explanatory variables including the intercept), you obtain a total of 5,985 different subsets of 4 observations out of 21. If you do not specify OPTN[5], the LMS algorithm draws random subsets. Because many of the subsets lead to singular linear systems, whose messages you do not want to print, you can specify OPTN[2]=2 for reduced printed output, as in the following:
title2 "***Use 2000 Random Subsets for LMS***";
a = aa[,2:4]; b = aa[,5];
optn = j(9,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[8]= 3; /* icov */
call lms(sc,coef,wgt,optn,b,a);
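The subset count quoted above is simply the binomial coefficient C(21, 4). As a quick sanity check outside SAS/IML (a Python sketch, not part of the example session):

```python
from math import comb

# Number of distinct elemental subsets of n = 4 observations
# (three explanatory variables plus intercept) out of N = 21.
N, n = 21, 4
print(comb(N, n))  # 5985
```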
Summary statistics are shown in Output 12.2.1. Output 12.2.2 displays the results of LS regression, and Output 12.2.3 displays the LMS results for the 2,000 random subsets.
Output 12.2.1
Some Simple Statistics

                 Median          Mean    Dispersion        StdDev
VAR1                 58  60.428571429   5.930408874  9.1682682584
VAR2                 20  21.095238095   2.965204437   3.160771455
VAR3                 87  86.285714286  4.4478066555  5.3585712381
Intercep              1             1             0             0
Response             15  17.523809524   5.930408874  10.171622524
Output 12.2.2
Table of Unweighted LS Regression

LS Parameter Estimates
                          Approx                           Lower        Upper
Variable    Estimate     Std Err  t Value  Pr > |t|          WCI          WCI
VAR1       0.7156402  0.13485819     5.31    <.0001   0.45132301   0.97995739
VAR2      1.29528612  0.36802427     3.52    0.0026   0.57397182   2.01660043
VAR3      -0.1521225  0.15629404    -0.97    0.3440   -0.4584532   0.15420818
Intercep  -39.919674  11.8959969    -3.36    0.0038     -63.2354   -16.603949

Cov Matrix of Parameter Estimates
                  VAR1          VAR2          VAR3      Intercep
VAR1      0.0181867302  -0.036510675  -0.007143521  0.2875871057
VAR2      -0.036510675  0.1354418598  0.0000104768  -0.651794369
VAR3      -0.007143521  0.0000104768   0.024427828  -1.676320797
Intercep  0.2875871057  -0.651794369  -1.676320797  141.51474107
Output 12.2.3
Iteration History and Optimization Results

Subset  Singular  Best Criterion  Percent
   500        23        0.163262       25
  1000        55        0.140519       50
  1500        79        0.140519       75
  2000       103        0.126467      100
For LMS, observations 1, 3, 4, and 21 have scaled residuals larger than 2.5 (output not shown), and they are considered outliers. Output 12.2.4 displays the corresponding WLS results.
Output 12.2.4
Table of Weighted LS Regression

RLS Parameter Estimates Based on LMS
                          Approx                           Lower        Upper
Variable    Estimate     Std Err  t Value  Pr > |t|          WCI          WCI
VAR1      0.79768556  0.06743906    11.83    <.0001   0.66550742    0.9298637
VAR2      0.57734046  0.16596894     3.48    0.0041   0.25204731    0.9026336
VAR3      -0.0670602  0.06160314    -1.09    0.2961   -0.1878001   0.05367975
Intercep  -37.652459  4.73205086    -7.96    <.0001   -46.927108    -28.37781

Cov Matrix of Parameter Estimates
                  VAR1          VAR2          VAR3      Intercep
VAR1      0.0045480273  -0.007921409  -0.001198689  0.0015681747
VAR2      -0.007921409  0.0275456893   -0.00046339  -0.065017508
VAR3      -0.001198689   -0.00046339  0.0037949466  -0.246102248
Intercep  0.0015681747  -0.065017508  -0.246102248  22.392305355
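The reweighting that produces the WLS tables can be sketched independently of SAS/IML: any observation whose absolute scaled residual exceeds the 2.5 cutoff receives weight 0, and all others receive weight 1. In the following Python sketch, the function name, residuals, and scale value are illustrative, not taken from the output above:

```python
CUTOFF = 2.5  # scaled-residual cutoff used to flag outliers

def wls_weights(residuals, scale, cutoff=CUTOFF):
    """Return 0/1 WLS weights: 0 for flagged outliers, 1 otherwise."""
    return [0 if abs(r / scale) > cutoff else 1 for r in residuals]

# Hypothetical residuals and scale estimate, for illustration only.
res = [6.2, 0.4, -5.9, 7.1, 0.3, -0.8]
print(wls_weights(res, scale=1.9))  # [0, 1, 0, 0, 1, 1]
```

Observations with weight 0 are excluded from the weighted LS fit, which is why the WLS estimates differ from the unweighted ones.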
Consider 2,000 Random Subsets for V7 LTS
The V7 LTS algorithm is similar to the LMS algorithm. Here is the code:
title2 "***Use 2000 Random Subsets for LTS***";
a = aa[,2:4]; b = aa[,5];
optn = j(9,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[8]= 3; /* icov */
optn[9]= 1; /* V7 LTS */
call lts(sc,coef,wgt,optn,b,a);
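The two calls differ mainly in the criterion being minimized: LMS minimizes (roughly) the median squared residual, while LTS minimizes the sum of the h smallest squared residuals. With the default h = floor((N + n + 1)/2), this gives h = 13 for N = 21 and n = 4. A minimal Python sketch of the LTS criterion (the helper name is ours):

```python
def lts_criterion(residuals, h):
    """LTS objective: sum of the h smallest squared residuals."""
    sq = sorted(r * r for r in residuals)
    return sum(sq[:h])

N, n = 21, 4
h = (N + n + 1) // 2
print(h)                             # 13
print(lts_criterion([3, -1, 2], 2))  # 1 + 4 = 5
```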
Output 12.2.5
Iteration History and Optimization Results

Subset  Singular  Best Criterion  Percent
   500        23        0.099507       25
  1000        55        0.087814       50
  1500        79        0.084061       75
  2000       103        0.084061      100

Estimated Coefficients
        VAR1          VAR2          VAR3      Intercep
        0.75  0.3333333333             0  -35.70512821
In addition to observations 1, 3, 4, and 21, which LMS flagged as outliers, LTS flags observations 2 and 13: their absolute scaled residuals also exceed 2.5, although by a smaller margin than those of observations 1, 3, 4, and 21 (output not shown). Therefore, the WLS results based on LTS differ from those based on LMS.
Output 12.2.6 displays the results for the weighted LS regression.
Output 12.2.6
Table of Weighted LS Regression

RLS Parameter Estimates Based on LTS
                          Approx                           Lower        Upper
Variable    Estimate     Std Err  t Value  Pr > |t|          WCI          WCI
VAR1      0.75694055  0.07860766     9.63    <.0001   0.60287236   0.91100874
VAR2      0.45353029  0.13605033     3.33    0.0067   0.18687654   0.72018405
VAR3        -0.05211  0.05463722    -0.95    0.3607    -0.159197     0.054977
Intercep   -34.05751  3.82881873    -8.90    <.0001   -41.561857   -26.553163

Cov Matrix of Parameter Estimates
                  VAR1          VAR2          VAR3      Intercep
VAR1      0.0061791648  -0.005776855  -0.002300587  -0.034290068
VAR2      -0.005776855  0.0185096933  0.0002582502  -0.069740883
VAR3      -0.002300587  0.0002582502  0.0029852254  -0.131487406
Intercep  -0.034290068  -0.069740883  -0.131487406  14.659852903
Consider 500 Random Subsets for FAST-LTS
The FAST-LTS algorithm uses only 500 random subsets yet obtains better optimization results than V7 LTS with 2,000 subsets. Here is the code:
title2 "***Use 500 Random Subsets for FAST-LTS***";
a = aa[,2:4]; b = aa[,5];
optn = j(9,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[8]= 3; /* icov */
optn[9]= 0; /* FAST-LTS */
call lts(sc,coef,wgt,optn,b,a);
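FAST-LTS gains its speed from concentration steps (C-steps): starting from a candidate fit, compute residuals, keep the h observations with the smallest absolute residuals, and refit by ordinary LS on just those; each step can only decrease the LTS objective. The following Python sketch illustrates one C-step for a single-predictor model (function names and data are ours, not part of the SAS example):

```python
def fit_line(xs, ys):
    """Ordinary LS fit of y = a + b*x in closed form (one predictor)."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def c_step(xs, ys, a, b, h):
    """One concentration step: keep the h points with the smallest
    absolute residuals under the fit (a, b), then refit on them."""
    keep = sorted(range(len(xs)),
                  key=lambda i: abs(ys[i] - a - b * xs[i]))[:h]
    return fit_line([xs[i] for i in keep], [ys[i] for i in keep])

# Toy data: four points on the line y = x plus one gross outlier.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 100]
print(c_step(xs, ys, a=0.0, b=1.0, h=4))  # (0.0, 1.0): outlier trimmed
```

In the full algorithm, C-steps are iterated to convergence from many random starts, and only the most promising starts are carried through full iteration, which is why 500 subsets suffice.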
For this example, the two LTS algorithms identify the same outliers; however, the FAST-LTS algorithm uses only 500 random subsets and gets a smaller objective function, as seen in Output 12.2.7. For large data sets, the advantages of the FAST-LTS algorithm are more obvious.
Output 12.2.7
Optimization Results for FAST-LTS

Observations of Best Subset
 5   6   7   8   9  10  11  12  15  16  17  18  19

Estimated Coefficients
        VAR1          VAR2          VAR3      Intercep
0.7409210642  0.3915267228  0.0111345398  -37.32332647
Consider All 5,985 Subsets
Next, report the results of LMS for all 5,985 subsets. Here is the code:
title2 "*** Use All 5985 Subsets***";
a = aa[,2:4]; b = aa[,5];
optn = j(9,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[5]= -1; /* nrep: all 5985 subsets */
optn[8]= 3; /* icov */
ods select IterHist0 BestSubset EstCoeff;
call lms(sc,coef,wgt,optn,b,a);
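With OPTN[5]=-1, every elemental subset is evaluated exactly once, so the reported optimum is exact rather than stochastic. The enumeration idea can be sketched in Python on a toy one-predictor problem, where each elemental subset is a pair of points that determines a line (names and data are illustrative):

```python
from itertools import combinations
from statistics import median

def exhaustive_lms(xs, ys):
    """Exhaustive LMS for y = a + b*x: fit the exact line through every
    pair of points and keep the fit minimizing the median squared
    residual (the analogue of OPTN[5]=-1 enumerating all subsets)."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # singular elemental subset: skipped, and counted
                      # in the "Singular" column of the iteration history
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        crit = median((y - a - b * x) ** 2 for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, a, b)
    return best

xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 100]         # one gross outlier
print(exhaustive_lms(xs, ys))  # (0.0, 0.0, 1.0): the clean line wins
```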
Output 12.2.8 displays the results for LMS.
Output 12.2.8
Iteration History and Optimization Results for LMS

Subset  Singular  Best Criterion  Percent
  1497        36        0.185899       25
  2993        87        0.158268       50
  4489       149        0.140519       75
  5985       266        0.126467      100

Estimated Coefficients
        VAR1          VAR2          VAR3      Intercep
        0.75           0.5   1.29526E-16        -39.25
Next, report the results of LTS for all different subsets, as follows:
title2 "*** Use All 5985 Subsets***";
a = aa[,2:4]; b = aa[,5];
optn = j(9,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[5]= -1; /* nrep: all 5985 subsets */
optn[8]= 3; /* icov */
optn[9]= 1; /* V7 LTS */
ods select IterHist0 BestSubset EstCoeff;
call lts(sc,coef,wgt,optn,b,a);
Output 12.2.9 displays the results for LTS.
Output 12.2.9
Iteration History and Optimization Results for LTS

Subset  Singular  Best Criterion  Percent
  1497        36        0.135449       25
  2993        87        0.107084       50
  4489       149        0.081536       75
  5985       266        0.081536      100

Estimated Coefficients
        VAR1          VAR2          VAR3      Intercep
0.7291666667  0.4166666667  -8.54713E-18  -36.22115385