The TRANSREG Procedure

Smoothing Splines

You can use PROC TRANSREG to plot the same smoothing spline function that the GPLOT procedure creates and to write that function to a SAS data set. You request a smoothing spline transformation by specifying SMOOTH in the MODEL statement. You can specify the smoothing parameter with either the SM= or the PARAMETER= o-option. The results are saved in the independent variable transformation (for example, Tx, when the independent variable is x) and the predicted values variable (for example, Py, when the dependent variable is y).

You can display the smoothing spline by using PROC TRANSREG and ODS Graphics. The following statements produce Figure 117.41:

title h=1.5 'Smoothing Splines';

ods graphics on;

data x;
   do x = 1 to 100 by 2;
      do rep = 1 to 3;
         y = log(x) + sin(x / 10) + normal(7);
         output;
      end;
   end;
run;

proc transreg;
   model identity(y) = smooth(x / sm=50);
   output p;
run;

Figure 117.41: Smoothing Spline Displayed with ODS Graphics
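
To inspect the saved results directly, you can name the output data set and print a few observations. The following statements are a minimal sketch: the data set name Smooth is chosen here only for illustration, and the variable names Tx and Py assume the default naming described previously.

proc transreg data=x;
   model identity(y) = smooth(x / sm=50);
   output out=Smooth p;        /* Smooth is a hypothetical data set name */
run;

proc print data=Smooth(obs=5); /* first few observations only */
   var x y Tx Py;              /* original, transformed, and predicted values */
run;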

You can also use PROC GPLOT to verify that the two procedures produce the same results. The PROC GPLOT plot request y * x = 1 displays the data as stars. The specification y * x = 2 with I=SM50 requests the smooth curve through the scatter plot. It is overlaid with Py * x = 3, which displays with large dots the smooth function created by PROC TRANSREG. The results of the following step are not displayed:

proc gplot;
   axis1 minor=none label=(angle=90 rotate=0);
   axis2 minor=none;
   symbol1 color=blue v=star   i=none;  /* data              */
   symbol2 color=blue v=none   i=sm50;  /* gplot's smooth    */
   symbol3 color=red  v=dot    i=none;  /* transreg's smooth */
   plot y*x=1 y*x=2 py*x=3 / overlay haxis=axis2 vaxis=axis1 frame;
run; quit;

You can plot multiple nonlinear functions, one for each of several groups as defined by the levels of a CLASS variable. When you cross a SMOOTH variable with a CLASS variable, specify ZERO=NONE with the CLASS expansion. The following statements create artificial data and produce Figure 117.42:

title2 'Two Groups';

data x;
   do x = 1 to 100;
      Group = 1;
      do rep = 1 to 3;
         y = log(x) + sin(x / 10) + normal(7);
         output;
      end;
      Group = 2;
      do rep = 1 to 3;
         y = -log(x) + cos(x / 10) + normal(7);
         output;
      end;
   end;
run;

proc transreg ss2 data=x;
   model identity(y) = class(group / zero=none) *
                       smooth(x / sm=50);
   output p;
run;

The ANOVA table in Figure 117.42 shows the overall model fit. The degrees of freedom are based on the trace of the transformation hat matrix, and are typically not integers. The "Smooth Transformation" table reports the degrees of freedom for each term, which includes an intercept for each group; the regression coefficients, which are always 1 with smoothing splines; the 0 to 100 smoothing parameter (like the one PROC GPLOT uses); the actual computed smoothing parameter; and the name and label for each term.

Figure 117.42: Smoothing Spline Example 2

Smoothing Splines

Two Groups

The TRANSREG Procedure

Dependent Variable Identity(y)

Class Level Information
Class	Levels	Values
Group	2	1 2

Number of Observations Read	600
Number of Observations Used	600
Implicit Intercept Model

The TRANSREG Procedure

Hypothesis Tests for Identity(y)

Univariate ANOVA Table, Smooth Transformation
Source	DF	Sum of Squares	Mean Square	F Value	Pr > F
Model	16.794	9195.493	547.5365	562.03	<.0001
Error	582.21	567.195	0.9742
Corrected Total	599	9762.688

Root MSE	0.98702	R-Square	0.9419
Dependent Mean	0.03651	Adj R-Sq	0.9402
Coeff Var	2703.13908

Smooth Transformation
Variable	DF	Coefficient	SM	Parameter	Label
Smooth(Group1x)	8.8971	1.000	50	2405.265	Group 1 * x
Smooth(Group2x)	8.8971	1.000	50	2405.265	Group 2 * x
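
The Parameter column reports the computed smoothing parameter that corresponds to SM=50 for these data. As a sketch, and assuming that the PARAMETER= o-option accepts that computed value in place of the 0 to 100 SM= value, the following statements would request the same amount of smoothing. The value 2405.265 is copied from the table and is rounded, so small numerical differences are possible.

proc transreg ss2 data=x;
   model identity(y) = class(group / zero=none) *
                       smooth(x / parameter=2405.265); /* assumed equivalent to sm=50 here */
   output p;
run;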

The SMOOTH transformation is valid only with independent variables. Typically, as in the two preceding examples, it is used only in models that have a single dependent variable, a single independent variable, and optionally a single classification variable that is crossed with the independent variable. By default, the standardization options such as TSTANDARD=, CENTER, Z, and REFLECT are not permitted when the SMOOTH transformation is part of the model.

The SMOOTH transformation can also be used in other ways, but only when you specify the NSR a-option. (See the section Smoothing Splines Changes and Enhancements.) When you specify the NSR a-option, and there are multiple independent variables designated as SMOOTH, PROC TRANSREG tries to smooth the ith independent variable by using the ith dependent variable as a target. When there are more independent variables than dependent variables, the last dependent variable is reused as often as is necessary. For example, consider the following statements:

proc transreg nsr;
   model identity(y1-y3) = smooth(x1-x5);
run;

Smoothing is based on the pairs (y1, x1), (y2, x2), (y3, x3), (y3, x4), and (y3, x5).
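
The pairing rule can be sketched with a short DATA step. This step does not call PROC TRANSREG; it only mimics the rule described previously, and the variable names ny, nx, i, and j are introduced here for illustration. The pairs that it writes to the log should match the pairs in the preceding sentence.

data _null_;
   ny = 3;                    /* number of dependent variables          */
   nx = 5;                    /* number of SMOOTH independent variables */
   do i = 1 to nx;
      j = min(i, ny);         /* reuse the last dependent variable      */
      put '(y' j +(-1) ', x' i +(-1) ')';
   end;
run;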

The SMOOTH transformation is a noniterative transformation. The smoothing of each variable occurs before the iterations begin. In contrast, SSPLINE provides an iterative smoothing spline transformation. The SSPLINE transformation does not generally minimize squared error; hence, divergence is possible with SSPLINE.
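
For comparison, the following statements sketch an SSPLINE request. This is an illustration under stated assumptions rather than a recommended analysis: it assumes that SSPLINE accepts the SM= option in the same way that SMOOTH does, and that the input data set contains a single group (as in the first example). Because divergence is possible, the sketch uses the MAXITER= option to limit the number of iterations.

proc transreg data=x maxiter=50;           /* limit iterations in case of divergence */
   model identity(y) = sspline(x / sm=50); /* SM= with SSPLINE is an assumption      */
   output p;
run;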