| Language Reference |
The LMS subroutine performs robust regression.

| n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| | 500 | 50 | 22 | 17 | 15 | 14 | 0 | 0 | 0 | 0 |
| | 1414 | 182 | 71 | 43 | 32 | 27 | 24 | 23 | 22 | |
| | 500 | 1000 | 1500 | 2000 | 2500 | 3000 | 3000 | 3000 | 3000 | 3000 |

| n | 11 | 12 | 13 | 14 | 15 |
| | 0 | 0 | 0 | 0 | 0 |
| | 22 | 22 | 22 | 23 | 23 |
| | 3000 | 3000 | 3000 | 3000 | 3000 |
For 21 observations and n = 4 parameters (three explanatory
variables plus the intercept), there are 5985 different subsets of
4 observations out of 21. If you do not specify optn[5], the LMS
subroutine chooses 2000 random sample subsets. Because many of
these subsets lead to singular linear systems, which you do not
want to print, specify optn[2]=2 for reduced printed output.
Here is the code:
/* X1 X2 X3 Y Stackloss data */
aa = { 1 80 27 89 42,
1 80 27 88 37,
1 75 25 90 37,
1 62 24 87 28,
1 62 22 87 18,
1 62 23 87 18,
1 62 24 93 19,
1 62 24 93 20,
1 58 23 87 15,
1 58 18 80 14,
1 58 18 89 14,
1 58 17 88 13,
1 58 18 82 11,
1 58 19 93 12,
1 50 18 89 8,
1 50 18 86 7,
1 50 19 72 8,
1 50 19 79 8,
1 50 20 80 9,
1 56 20 82 15,
1 70 20 91 15 };
a = aa[,2:4]; b = aa[,5];
optn = j(8,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[8]= 3; /* icov */
CALL LMS(sc,coef,wgt,optn,b,a);
The resulting output is as follows:
LMS: The 13th ordered squared residual will be minimized.
Median and Mean
Median Mean
VAR1 58 60.428571429
VAR2 20 21.095238095
VAR3 87 86.285714286
Intercep 1 1
Response 15 17.523809524
Dispersion and Standard Deviation
Dispersion StdDev
VAR1 5.930408874 9.1682682584
VAR2 2.965204437 3.160771455
VAR3 4.4478066555 5.3585712381
Intercep 0 0
Response 5.930408874 10.171622524
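The Dispersion column is consistent with a scaled median absolute deviation (MAD). The following sketch is written in Python rather than IML so that it can be run outside SAS. It assumes the scaling constant is 1/Φ⁻¹(0.75), roughly 1.4826; that constant is inferred from the printed values and is not stated in the listing. The names `dispersion` and `MAD_FACTOR` are illustrative only.

```python
from statistics import NormalDist, median

# VAR1 (air flow) and the response (stack loss) from the data above
air_flow = [80, 80, 75, 62, 62, 62, 62, 62, 58, 58, 58,
            58, 58, 58, 50, 50, 50, 50, 50, 56, 70]
stack_loss = [42, 37, 37, 28, 18, 18, 19, 20, 15, 14, 14,
              13, 11, 12, 8, 7, 8, 8, 9, 15, 15]

# Assumed consistency factor: 1 / Phi^{-1}(0.75), about 1.4826
MAD_FACTOR = 1.0 / NormalDist().inv_cdf(0.75)


def dispersion(values):
    """Scaled median absolute deviation about the median."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    return MAD_FACTOR * mad


for name, column in (("VAR1", air_flow), ("Response", stack_loss)):
    print(f"{name:9s} median={median(column)}  dispersion={dispersion(column):.9f}")
```

Under that assumption, both columns have a raw MAD of 4, which agrees with the value 5.930408874 printed for VAR1 and for the response.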
The following are the results of LS regression:
Unweighted Least-Squares Estimation
LS Parameter Estimates
Variable Estimate Std Err t Value Approx Pr > |t|
VAR1 0.715640 0.134858 5.31 <.0001
VAR2 1.295286 0.368024 3.52 0.0026
VAR3 -0.152123 0.156294 -0.97 0.3440
Intercep -39.919674 11.895997 -3.36 0.0038
Variable Lower WCI Upper WCI
VAR1 0.451323 0.979957
VAR2 0.573972 2.016600
VAR3 -0.458453 0.154208
Intercep -63.2354 -16.603949
Sum of Squares = 178.8299616
Degrees of Freedom = 17
LS Scale Estimate = 3.2433639182
Cov Matrix of Parameter Estimates
VAR1 VAR2 VAR3 Intercep
VAR1 0.018187 -0.036511 -0.007144 0.287587
VAR2 -0.036511 0.135442 0.000010 -0.651794
VAR3 -0.007144 0.000011 0.024428 -1.676321
Intercep 0.287587 -0.651794 -1.676321 141.514741
R-squared = 0.9135769045
F(3,17) Statistic = 59.9022259
Probability = 3.0163272E-9
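Several columns of the LS listing can be recomputed from the others. The Python sketch below takes the estimates and standard errors as printed and assumes that "WCI" denotes a 95% Wald interval (estimate plus or minus the 0.975 normal quantile times the standard error) and that the scale estimate is the square root of the sum of squares divided by the degrees of freedom. Both readings are assumptions checked only against the printed numbers; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# (estimate, standard error) pairs copied from the LS listing
ls_estimates = {
    "VAR1": (0.715640, 0.134858),
    "VAR2": (1.295286, 0.368024),
    "VAR3": (-0.152123, 0.156294),
    "Intercep": (-39.919674, 11.895997),
}

# Two-sided 95% normal quantile (assumed confidence level)
z = NormalDist().inv_cdf(0.975)


def wald_interval(estimate, std_err):
    """Assumed form of the Lower WCI / Upper WCI columns."""
    return estimate - z * std_err, estimate + z * std_err


def t_value(estimate, std_err):
    """Ratio of an estimate to its standard error."""
    return estimate / std_err


# Scale estimate from the printed sum of squares and degrees of freedom
ls_scale = sqrt(178.8299616 / 17)

for name, (est, se) in ls_estimates.items():
    lower, upper = wald_interval(est, se)
    print(f"{name:9s} t={t_value(est, se):6.2f}  WCI=({lower:.6f}, {upper:.6f})")
print(f"LS scale = {ls_scale:.7f}")
```

The recomputed intervals agree with the printed ones to within the rounding of the inputs.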
These are the LMS results for the 2,000 random subsets:
Random Subsampling for LMS
Subset Singular Best Criterion Percent
500 23 0.163262 25
1000 55 0.140519 50
1500 79 0.140519 75
2000 103 0.126467 100
Minimum Criterion= 0.1264668282
Least Median of Squares (LMS) Method
Minimizing 13th Ordered Squared Residual.
Highest Possible Breakdown Value = 42.86 %
Random Selection of 2103 Subsets
Among 2103 subsets 103 are singular.
Observations of Best Subset
15 11 19 10
Estimated Coefficients
VAR1 VAR2 VAR3 Intercep
0.75 0.5 0 -39.25
LMS Objective Function = 0.75
Preliminary LMS Scale = 1.0478510755
Robust R Squared = 0.96484375
Final LMS Scale = 1.2076147288
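The headline LMS quantities can be checked directly from the data and the printed coefficients. The Python sketch below makes three assumptions that the listing does not state: the minimized quantile is h = floor((N + n + 1)/2), the breakdown value is (N - h + 1)/N, and the robust R squared is 1 - (h-th ordered absolute residual / unscaled MAD of the response)^2. Variable names are illustrative.

```python
from math import comb
from statistics import median

# Stackloss rows: (x1, x2, x3, y)
stackloss = [
    (80, 27, 89, 42), (80, 27, 88, 37), (75, 25, 90, 37),
    (62, 24, 87, 28), (62, 22, 87, 18), (62, 23, 87, 18),
    (62, 24, 93, 19), (62, 24, 93, 20), (58, 23, 87, 15),
    (58, 18, 80, 14), (58, 18, 89, 14), (58, 17, 88, 13),
    (58, 18, 82, 11), (58, 19, 93, 12), (50, 18, 89, 8),
    (50, 18, 86, 7), (50, 19, 72, 8), (50, 19, 79, 8),
    (50, 20, 80, 9), (56, 20, 82, 15), (70, 20, 91, 15),
]

# Estimated LMS coefficients from the listing
b1, b2, b3, b0 = 0.75, 0.5, 0.0, -39.25

n_obs = len(stackloss)            # N = 21 observations
n_par = 4                         # n = 4 parameters
n_subsets = comb(n_obs, n_par)    # all subsets of 4 out of 21

# Assumed default quantile and breakdown value
h = (n_obs + n_par + 1) // 2
breakdown = (n_obs - h + 1) / n_obs

residuals = [y - (b1 * x1 + b2 * x2 + b3 * x3 + b0)
             for x1, x2, x3, y in stackloss]
abs_sorted = sorted(abs(r) for r in residuals)

# Square root of the h-th ordered squared residual
objective = abs_sorted[h - 1]

# Assumed robust R squared, using the unscaled MAD of the response
ys = [row[3] for row in stackloss]
mad_y = median([abs(y - median(ys)) for y in ys])
robust_r2 = 1.0 - (objective / mad_y) ** 2

print(n_subsets, h, objective, round(100 * breakdown, 2), robust_r2)
```

With these assumptions the sketch gives 5985 subsets, h = 13, an objective of 0.75, a breakdown value of 42.86%, and a robust R squared of 0.96484375, which match the listing.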
For LMS, observations 1, 3, 4, and 21 have scaled residuals larger than 2.5 in absolute value (table not shown) and are considered outliers. The corresponding WLS results are as follows:
Weighted Least-Squares Estimation
RLS Parameter Estimates Based on LMS
Variable Estimate Std Err t Value Approx Pr > |t|
VAR1 0.797686 0.067439 11.83 <.0001
VAR2 0.577340 0.165969 3.48 0.0041
VAR3 -0.067060 0.061603 -1.09 0.2961
Intercep -37.652459 4.732051 -7.96 <.0001
Variable Lower WCI Upper WCI
VAR1 0.665507 0.929864
VAR2 0.252047 0.902634
VAR3 -0.187800 0.053680
Intercep -46.927108 -28.37781
Weighted Sum of Squares = 20.400800254
Degrees of Freedom = 13
RLS Scale Estimate = 1.2527139846
Cov Matrix of Parameter Estimates
VAR1 VAR2 VAR3 Intercep
VAR1 0.004548 -0.007921 -0.001199 0.001568
VAR2 -0.007921 0.027546 -0.000463 -0.065018
VAR3 -0.001199 -0.000463 0.003795 -0.246102
Intercep 0.001568 -0.065018 -0.246102 22.392305
Weighted R-squared = 0.9750062263
F(3,13) Statistic = 169.04317954
Probability = 1.158521E-10
There are 17 points with nonzero weight.
Average Weight = 0.8095238095
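The reweighting step can be sketched in Python as follows. The sketch assumes that an observation gets weight 1 when its LMS residual divided by the Final LMS Scale is at most 2.5 in absolute value and weight 0 otherwise, and that the RLS fit is ordinary least squares on the observations with nonzero weight. The helper `solve` is a hypothetical stand-in for a linear solver, not part of the LMS subroutine.

```python
from math import sqrt

# Stackloss rows: (x1, x2, x3, y)
stackloss = [
    (80, 27, 89, 42), (80, 27, 88, 37), (75, 25, 90, 37),
    (62, 24, 87, 28), (62, 22, 87, 18), (62, 23, 87, 18),
    (62, 24, 93, 19), (62, 24, 93, 20), (58, 23, 87, 15),
    (58, 18, 80, 14), (58, 18, 89, 14), (58, 17, 88, 13),
    (58, 18, 82, 11), (58, 19, 93, 12), (50, 18, 89, 8),
    (50, 18, 86, 7), (50, 19, 72, 8), (50, 19, 79, 8),
    (50, 20, 80, 9), (56, 20, 82, 15), (70, 20, 91, 15),
]
lms_coef = (0.75, 0.5, 0.0, -39.25)   # VAR1, VAR2, VAR3, Intercep
final_lms_scale = 1.2076147288        # copied from the listing


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def predict(coef, row):
    return sum(c * v for c, v in zip(coef, row))


rows = [(x1, x2, x3, 1.0) for x1, x2, x3, _ in stackloss]
ys = [y for _, _, _, y in stackloss]

# Assumed 0/1 weights from the scaled LMS residuals (cutoff 2.5)
weights = [1.0 if abs((y - predict(lms_coef, x)) / final_lms_scale) <= 2.5
           else 0.0 for x, y in zip(rows, ys)]
outliers = [i + 1 for i, w in enumerate(weights) if w == 0.0]

# Weighted normal equations: (X'WX) b = X'Wy
xtwx = [[sum(w * x[i] * x[j] for w, x in zip(weights, rows))
         for j in range(4)] for i in range(4)]
xtwy = [sum(w * x[i] * y for w, x, y in zip(weights, rows, ys))
        for i in range(4)]
rls_coef = solve(xtwx, xtwy)

wss = sum(w * (y - predict(rls_coef, x)) ** 2
          for w, x, y in zip(weights, rows, ys))
df = int(sum(weights)) - 4
rls_scale = sqrt(wss / df)
average_weight = sum(weights) / len(weights)

print(outliers, [round(c, 6) for c in rls_coef], df, average_weight)
```

Under these assumptions the zero-weight observations are 1, 3, 4, and 21, leaving 17 points and 13 degrees of freedom, as in the listing.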
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.