The ORTHOREG Procedure

Example 66.2 Wampler Data

This example applies the ORTHOREG procedure to a collection of data sets noted for being ill-conditioned. The OUTEST= data set is used to collect the results for comparison with values certified to be correct by the National Institute of Standards and Technology (1998).

Note: The results from this example vary from machine to machine, depending on floating-point configuration.

The data are from Wampler (1970). The independent variates for all five data sets are $x^ i$, $i=1,\ldots 5,$ for $x=0,1,\ldots , 20$. Two of the five dependent variables are exact linear functions of the independent terms:

$\displaystyle  y_1  $
$\displaystyle  =  $
$\displaystyle  1 + x + x^2 + x^3 + x^4 + x^5  $
$\displaystyle y_2  $
$\displaystyle  =  $
$\displaystyle  1 + 0.1x + 0.01x^2 + 0.001x^3 + 0.0001x^4 + 0.00001x^5  $

The other three dependent variables have the same mean value as $y_1$, but with nonzero errors:

$\displaystyle  y_3  $
$\displaystyle  =  $
$\displaystyle  y_1 + \mb {e}  $
$\displaystyle y_4  $
$\displaystyle  =  $
$\displaystyle  y_1 + 100\mb {e}  $
$\displaystyle y_5  $
$\displaystyle  =  $
$\displaystyle  y_1 + 10000\mb {e}  $

where $\mb {e}$ is a vector of values with standard deviation $\sim $2044, chosen to be orthogonal to the mean model for $y_1$.

The following statements create a SAS data set Wampler that contains the Wampler data, run a SAS macro program that uses PROC ORTHOREG to fit a fifth-order polynomial in x to each of the Wampler dependent variables, and collect the results in a data set named ParmEst:

data Wampler;
   do x=0 to 20;
      input e @@;
      y1 = 1 +       x    +        x**2 +      x**3
             +       x**4 +        x**5;
      y2 = 1 + .1   *x    + .01   *x**2 + .001*x**3
             + .0001*x**4 + .00001*x**5;
      y3 = y1 +       e;
      y4 = y1 +   100*e;
      y5 = y1 + 10000*e;
      output;
   end;
   datalines;
759 -2048 2048 -2048 2523 -2048 2048 -2048 1838 -2048 2048
-2048 1838 -2048 2048 -2048 2523 -2048 2048 -2048 759
;
%macro WTest;
   data ParmEst; if (0); run;
   %do i = 1 %to 5;
      proc orthoreg data=Wampler outest=ParmEst&i noprint;
         model y&i = x x*x x*x*x x*x*x*x x*x*x*x*x;
      data ParmEst&i; set ParmEst&i; Dep = "y&i";
      data ParmEst; set ParmEst ParmEst&i;
         label Col1='x'     Col2='x**2'  Col3='x**3'
               Col4='x**4'  Col5='x**5';
      run;
   %end;
%mend;
%WTest;

Instead of displaying the raw values of the RMSE and parameter estimates, use an additional DATA step as follows to compute the deviations from the values certified to be correct by the National Institute of Standards and Technology (1998):

data ParmEst; set ParmEst;
   if      (Dep = 'y1') then
      _RMSE_ = _RMSE_ - 0.00000000000000;
   else if (Dep = 'y2') then
      _RMSE_ = _RMSE_ - 0.00000000000000;
   else if (Dep = 'y3') then
      _RMSE_ = _RMSE_ - 2360.14502379268;
   else if (Dep = 'y4') then
      _RMSE_ = _RMSE_ - 236014.502379268;
   else if (Dep = 'y5') then
      _RMSE_ = _RMSE_ - 23601450.2379268;
   if (Dep ^= 'y2') then do;
      Intercept = Intercept - 1.00000000000000;
      Col1      = Col1      - 1.00000000000000;
      Col2      = Col2      - 1.00000000000000;
      Col3      = Col3      - 1.00000000000000;
      Col4      = Col4      - 1.00000000000000;
      Col5      = Col5      - 1.00000000000000;
   end;
   else do;
      Intercept = Intercept - 1.00000000000000;
      Col1      = Col1      - 0.100000000000000;
      Col2      = Col2      - 0.100000000000000e-1;
      Col3      = Col3      - 0.100000000000000e-2;
      Col4      = Col4      - 0.100000000000000e-3;
      Col5      = Col5      - 0.100000000000000e-4;
   end;
run;
proc print data=ParmEst label noobs;
   title 'Wampler data: Deviations from Certified Values';
   format _RMSE_ Intercept Col1-Col5 e9.;
   var Dep _RMSE_ Intercept Col1-Col5;
run;

The results, shown in Output 66.2.1, indicate that the values computed by PROC ORTHOREG are quite close to the NIST-certified values.

Output 66.2.1: Wampler Data: Deviations from Certified Values

Wampler data: Deviations from Certified Values

Dep _RMSE_ Intercept x x**2 x**3 x**4 x**5
y1 0.00E+00 5.46E-12 -9.82E-11 1.55E-11 -5.68E-13 3.55E-14 -6.66E-16
y2 0.00E+00 8.88E-16 -3.19E-15 1.24E-15 -1.88E-16 1.20E-17 -2.57E-19
y3 -2.09E-11 -7.73E-11 1.46E-11 -2.09E-11 2.50E-12 -1.28E-13 2.66E-15
y4 -4.07E-10 -5.38E-10 8.99E-10 -3.29E-10 4.23E-11 -2.27E-12 4.35E-14
y5 -3.35E-08 -4.10E-08 8.07E-08 -2.77E-08 3.54E-09 -1.90E-10 3.64E-12