The TRANSREG Procedure
This example shows how to make graphical displays of the Box-Cox transformation results. The plots include the log likelihood function with its confidence interval, the root mean squared error as a function of the power parameter, R-square as a function of the power parameter, the Box-Cox transformation of the variable y, the original scatter plot based on the untransformed data, and the new scatter plot based on the transformed data. A condensed version of the log likelihood table, including the confidence interval, is also printed.
title h=1 'Box-Cox Graphical Displays';
data x;
input y x @@;
datalines;
10.0 3.0 72.6 8.3 59.7 8.1 20.1 4.8 90.1 9.8 1.1 0.9
78.2 8.5 87.4 9.0 9.5 3.4 0.1 1.4 0.1 1.1 42.5 5.1
57.0 7.5 9.9 1.9 0.5 1.0 121.1 9.9 37.5 5.9 49.5 6.7
8.3 1.8 0.6 1.8 53.0 6.7 112.8 10.0 40.7 6.4 5.1 2.4
73.3 9.5 122.4 9.9 87.2 9.4 121.2 9.9 23.1 4.3 7.1 3.5
12.4 3.3 5.6 2.7 113.0 9.6 110.5 10.0 3.1 1.5 52.4 7.9
80.4 8.1 0.6 1.6 115.1 9.1 15.9 3.1 56.5 7.3 85.4 9.8
32.5 5.8 43.0 6.2 0.1 0.8 21.8 5.2 15.2 3.5 5.2 3.0
0.2 0.8 73.5 8.2 4.9 3.2 0.2 0.3 69.0 9.2 3.6 3.5
0.2 0.9 101.3 9.9 10.0 3.7 16.9 3.0 11.2 5.0 0.2 0.4
80.8 9.4 24.9 5.7 113.5 9.7 6.2 2.1 12.5 3.2 4.8 1.8
80.1 8.3 26.4 4.8 13.4 3.8 99.8 9.7 44.1 6.2 15.3 3.8
2.2 1.5 10.3 2.7 13.8 4.7 38.6 4.5 79.1 9.8 33.6 5.8
9.1 4.5 89.3 9.1 5.5 2.6 20.0 4.8 2.9 2.9 82.9 8.4
7.0 3.5 14.5 2.9 16.0 3.7 29.3 6.1 48.9 6.3 1.6 1.9
34.7 6.2 33.5 6.5 26.0 5.6 12.7 3.1 0.1 0.3 15.4 4.2
2.6 1.8 58.6 7.9 81.2 8.1 37.2 6.9
;
The TRANSREG procedure is run to find the Box-Cox transformation. The lambda list is -2 TO 2 BY 0.01, which produces 401 lambdas. This many power parameters make a nice graphical display with plenty of detail around the confidence interval. However, 401 values are too many to print, so the usual Box-Cox transformation information table is excluded from the printed output. Instead, it is output to a SAS data set using ODS so that a sample of it can be printed: only the rows in the confidence interval and the rows corresponding to power parameters that are multiples of 0.5. Null labels are provided for the columns that need to be printed without headers. The details table is also output to a SAS data set using ODS, since it contains information that is incorporated into some of the plots.
* Fit Box-Cox model, output results to output data sets;
ods output boxcox=b details=d;
ods exclude boxcox;
proc transreg details data=x;
model boxcox(y / convenient lambda=-2 to 2 by 0.01) = identity(x);
output out=trans;
run;
proc print noobs label data=b(drop=rmse);
title2 'Confidence Interval';
where ci ne ' ' or abs(lambda - round(lambda, 0.5)) < 1e-6;
label convenient = '00'x ci = '00'x;
run;
Output 15.2.1: Box-Cox Graphical Displays
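For reference, the quantities behind the log likelihood plot and its reference lines can be written out. This is a sketch of the standard Box-Cox formulation, not a transcription of PROC TRANSREG's internal computations: the symbols $y^{(\lambda)}$, $\ell(\lambda)$, $\hat{\lambda}$, and $\alpha$ are notation introduced here for illustration, and the confidence level $1-\alpha$ is whatever level the procedure is run with.

```latex
% Box-Cox power transformation and the lambda grid used in this example
\[
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \log y, & \lambda = 0,
\end{cases}
\qquad
\lambda \in \{-2,\, -1.99,\, \ldots,\, 2\},
\qquad
\frac{2 - (-2)}{0.01} + 1 = 401 \text{ values.}
\]

% Likelihood-based confidence interval for lambda
\[
\mathrm{CI}_{1-\alpha} =
\left\{ \lambda : \ell(\lambda) \ge \ell(\hat{\lambda})
  - \tfrac{1}{2}\,\chi^{2}_{1,\,1-\alpha} \right\}
\]
```

Under this formulation, the horizontal reference line in the log likelihood plot sits at the right-hand side of the inequality, and the confidence interval consists of the grid values of $\lambda$ whose log likelihood lies on or above that line.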
* Store values for reference lines;
data _null_;
set d;
if description = 'CI Limit'
then call symput('vref', formattedvalue);
if description = 'Lambda Used'
then call symput('lambda', formattedvalue);
run;
* Store the lower limit, best lambda, and upper limit of the interval;
data _null_;
set b end=eof;
where ci ne ' ';
if _n_ = 1
then call symput('href1', compress(put(lambda, best12.)));
if ci = '<'
then call symput('href2', compress(put(lambda, best12.)));
if eof
then call symput('href3', compress(put(lambda, best12.)));
run;
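As a quick check before plotting, the macro variables created above can be written to the SAS log. This is a minimal sketch that uses only the macro variable names defined in the two DATA steps; it is not part of the original example.

```sas
* Write the stored reference-line values to the SAS log;
%put vref=&vref lambda=&lambda;
%put href1=&href1 href2=&href2 href3=&href3;
```

If any of these names does not resolve, the log shows a warning about an unresolved symbolic reference, which usually means the corresponding row was not found in the ODS output data set.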
These steps plot the log likelihood, root mean squared error, and R-square. The input data set is the Box-Cox transformation table, which was output using ODS.
* Plot log likelihood, confidence interval;
axis1 label=(angle=90 rotate=0) minor=none;
axis2 minor=none;
proc gplot data=b;
title2 'Log Likelihood';
plot loglike * lambda / vref=&vref href=&href1 &href2 &href3
vaxis=axis1 haxis=axis2 frame cframe=ligr;
footnote "Confidence Interval: &href1 - &href2 - &href3, "
"Lambda = &lambda";
symbol v=none i=spline c=blue;
run;
footnote;
title2 'RMSE';
plot rmse * lambda / vaxis=axis1 haxis=axis2 frame cframe=ligr;
run;
title2 'R-Square';
* Fix the vertical axis at 0 to 1 for R-square;
axis1 order=(0 to 1 by 0.1) label=(angle=90 rotate=0) minor=none;
plot rsquare * lambda / vaxis=axis1 haxis=axis2 frame cframe=ligr;
run; quit;
Output 15.2.2: Box-Cox Graphical Displays
The next steps plot the transformation of y, the original scatter plot based on the untransformed data, and the new scatter plot based on the transformed data. The input data set is the ordinary output data set from PROC TRANSREG. By default, the transformed version of the variable y is named ty.
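Before plotting, it can help to confirm which variables the output data set actually contains. The following is a minimal sketch that assumes only the data set name trans from the OUTPUT statement above; the choice of five observations is arbitrary.

```sas
* List the variables in the PROC TRANSREG output data set;
proc contents data=trans;
run;

* Print the first few observations for a visual check;
proc print data=trans(obs=5);
run;
```

The listing should include both the original variable y and its transformed counterpart, which is the variable plotted in the steps that follow.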
axis1 label=(angle=90 rotate=0) minor=none;
axis2 minor=none;
proc gplot data=trans;
title2 'Transformation';
symbol i=splines v=star c=blue;
plot ty * y / vaxis=axis1 haxis=axis2 frame cframe=ligr;
run;
title2 'Original Scatter Plot';
symbol i=none v=star c=blue;
plot y * x / vaxis=axis1 haxis=axis2 frame cframe=ligr;
run;
title2 'Transformed Scatter Plot';
symbol i=none v=star c=blue;
plot ty * x / vaxis=axis1 haxis=axis2 frame cframe=ligr;
run; quit;
Output 15.2.3: Box-Cox Graphical Displays
Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.