In this example, the data are from an experiment in which nitrogen oxide emissions from a single-cylinder engine are measured for various combinations of fuel, compression ratio, and equivalence ratio. The data are provided by Brinkman (1981). The Gas data set is available in the Sashelp library.
The equivalence ratio and nitrogen oxide variables are continuous and numeric, so spline transformations of these variables are requested. The spline transformation of the dependent variable is restricted to be monotonic. Each spline is degree three with nine knots (one at each decile) in order to give PROC TRANSREG a great deal of freedom in finding transformations. The compression ratio variable has only five discrete values, so an optimal scoring is requested with monotonicity constraints. The character variable Fuel is nominal, so it is optimally scored without any monotonicity constraints. Observations with missing values are excluded with the NOMISS a-option.
    ods graphics on;

    title 'Gasoline Example';
    title2 'Iteratively Estimate NOx, CpRatio, EqRatio, and Fuel';

    * Fit the Nonparametric Model;
    proc transreg data=sashelp.Gas solve test nomiss plots=all;
       ods exclude where=(_path_ ? 'MV');
       model mspline(NOx / nknots=9) = spline(EqRatio / nknots=9)
                                       monotone(CpRatio) opscore(Fuel);
    run;
Gasoline Example |
Iteratively Estimate NOx, CpRatio, EqRatio, and Fuel |
Number of Observations Read | 171 |
---|---|
Number of Observations Used | 169 |
TRANSREG MORALS Algorithm Iteration History for Mspline(NOx)
Iteration Number | Average Change | Maximum Change | R-Square | Criterion Change | Note |
---|---|---|---|---|---|
0 | 0.41900 | 3.80550 | 0.05241 | ||
1 | 0.11984 | 0.83327 | 0.91028 | 0.85787 | |
2 | 0.03727 | 0.17688 | 0.93981 | 0.02953 | |
3 | 0.02795 | 0.10880 | 0.94969 | 0.00987 | |
4 | 0.02088 | 0.07279 | 0.95382 | 0.00413 | |
5 | 0.01530 | 0.05031 | 0.95582 | 0.00201 | |
6 | 0.01130 | 0.03922 | 0.95688 | 0.00106 | |
7 | 0.00852 | 0.03197 | 0.95748 | 0.00060 | |
8 | 0.00657 | 0.02531 | 0.95783 | 0.00035 | |
9 | 0.00510 | 0.01975 | 0.95805 | 0.00022 | |
10 | 0.00398 | 0.01534 | 0.95818 | 0.00013 | |
11 | 0.00314 | 0.01200 | 0.95827 | 0.00009 | |
12 | 0.00250 | 0.00953 | 0.95832 | 0.00005 | |
13 | 0.00199 | 0.00752 | 0.95836 | 0.00003 | |
14 | 0.00159 | 0.00594 | 0.95838 | 0.00002 | |
15 | 0.00127 | 0.00470 | 0.95839 | 0.00001 | |
16 | 0.00102 | 0.00373 | 0.95840 | 0.00001 | |
17 | 0.00081 | 0.00297 | 0.95841 | 0.00001 | |
18 | 0.00065 | 0.00237 | 0.95841 | 0.00000 | |
19 | 0.00052 | 0.00189 | 0.95841 | 0.00000 | |
20 | 0.00042 | 0.00151 | 0.95842 | 0.00000 | |
21 | 0.00033 | 0.00120 | 0.95842 | 0.00000 | |
22 | 0.00027 | 0.00096 | 0.95842 | 0.00000 | |
23 | 0.00021 | 0.00077 | 0.95842 | 0.00000 | |
24 | 0.00017 | 0.00061 | 0.95842 | 0.00000 | |
25 | 0.00014 | 0.00049 | 0.95842 | 0.00000 | |
26 | 0.00011 | 0.00039 | 0.95842 | 0.00000 | |
27 | 0.00009 | 0.00031 | 0.95842 | 0.00000 | |
28 | 0.00007 | 0.00025 | 0.95842 | 0.00000 | |
29 | 0.00006 | 0.00020 | 0.95842 | 0.00000 | |
30 | 0.00005 | 0.00016 | 0.95842 | 0.00000 | Not Converged |
WARNING: Failed to converge, however criterion change is less than 0.0001. |
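The history above ends at iteration 30 with the note "Not Converged", which suggests that the iteration limit was reached before the average change fell below the convergence threshold. A minimal sketch of one way to give the algorithm more room, assuming the MAXITER= a-option and an arbitrary limit of 100 (this value is an illustration, and whether the run then converges has not been checked here):

```sas
* Sketch: same model as above, with a larger (assumed) iteration limit;
proc transreg data=sashelp.Gas solve test nomiss maxiter=100;
   model mspline(NOx / nknots=9) = spline(EqRatio / nknots=9)
                                   monotone(CpRatio) opscore(Fuel);
run;
```

Because the R square is already stable to five decimal places by iteration 20, extra iterations would be expected to change the fit statistics very little.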
Univariate ANOVA Table Based on the Usual Degrees of Freedom
Source | DF | Sum of Squares | Mean Square | F Value | Liberal p |
---|---|---|---|---|---|
Model | 21 | 326.0176 | 15.52465 | 161.35 | >= <.0001 |
Error | 147 | 14.1443 | 0.09622 | ||
Corrected Total | 168 | 340.1619 | |||
The above statistics are not adjusted for the fact that the dependent variable was transformed and so are generally liberal. |
Root MSE | 0.31019 | R-Square | 0.9584 |
---|---|---|---|
Dependent Mean | 2.34593 | Adj R-Sq | 0.9525 |
Coeff Var | 13.22262 |
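As a quick consistency check, the summary statistics can be recomputed from the ANOVA table entries. The worked arithmetic below uses only the values shown above; the symbols $n$ (observations used) and $p$ (model degrees of freedom) are notation introduced here for illustration.

```latex
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
     = \frac{326.0176}{340.1619} \approx 0.9584 \\
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}}
   = \frac{15.52465}{0.09622} \approx 161.35 \\
\text{Root MSE} &= \sqrt{0.09622} \approx 0.3102 \\
\text{Adj } R^2 &= 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
   = 1 - (1 - 0.9584)\,\frac{168}{147} \approx 0.9525
\end{align*}
```

Here $n = 169$ and $p = 21$, so $n - 1 = 168$ and $n - p - 1 = 147$, matching the Corrected Total and Error degrees of freedom.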
The squared multiple correlation for the initial model is approximately 0.05. PROC TRANSREG increases the R square to over 0.95 by transforming the variables. The transformation plots show how each variable is transformed. The transformation of compression ratio (TCpRatio) is nearly linear. The transformation of equivalence ratio (TEqRatio) is nearly parabolic. It can be seen from this plot that the optimal transformation of equivalence ratio is nearly uncorrelated with the original scoring. This suggests that the large increase in R square is due to this transformation. The transformation of nitrogen oxide (TNOx) is similar to a log transformation. The final plot shows the transformed dependent variable plotted as a function of the predicted values. This plot is reasonably linear, showing that the nonlinearities in the data are being accounted for fairly well by the TRANSREG model.
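The claim that the transformed equivalence ratio is nearly uncorrelated with its original scoring can also be checked numerically rather than read off the plot. The following is a minimal sketch, assuming that an OUTPUT statement writes the transformed variables with the default T prefix (TEqRatio, TCpRatio) alongside the originals; the data set name GasTrans is hypothetical.

```sas
* Sketch: save the transformed variables, then correlate them with the originals;
proc transreg data=sashelp.Gas solve nomiss noprint;
   model mspline(NOx / nknots=9) = spline(EqRatio / nknots=9)
                                   monotone(CpRatio) opscore(Fuel);
   output out=GasTrans;   * GasTrans is a hypothetical data set name;
run;

proc corr data=GasTrans;
   var EqRatio TEqRatio CpRatio TCpRatio;
run;
```

If the interpretation above holds, the EqRatio and TEqRatio pair should show a correlation near zero, while the CpRatio and TCpRatio pair should be close to one.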
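These results suggest the parametric model

\[ \log(\mathrm{NOx}) = b_0 + b_1\,\mathrm{EqRatio} + b_2\,\mathrm{EqRatio}^2 + b_3\,\mathrm{CpRatio} + \sum_j b_j\,\mathrm{Fuel}_j + \mathrm{error} \]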
You can perform this analysis with PROC TRANSREG. The following statements produce Output 93.1.2:
    title2 'Now fit log(NOx) = b0 + b1*EqRatio + b2*EqRatio**2 +';
    title3 'b3*CpRatio + Sum b(j)*Fuel(j) + Error';

    *-Fit the Parametric Model Suggested by the Nonparametric Analysis-;
    proc transreg data=sashelp.Gas solve ss2 short nomiss plots=all;
       model log(NOx) = pspline(EqRatio / deg=2) identity(CpRatio) opscore(Fuel);
    run;
Gasoline Example |
Now fit log(NOx) = b0 + b1*EqRatio + b2*EqRatio**2 + |
b3*CpRatio + Sum b(j)*Fuel(j) + Error |
Number of Observations Read | 171 |
---|---|
Number of Observations Used | 169 |
Log(NOx) |
---|
Algorithm converged. |
Univariate ANOVA Table Based on the Usual Degrees of Freedom
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 8 | 79.33838 | 9.917298 | 213.09 | <.0001 |
Error | 160 | 7.44659 | 0.046541 | ||
Corrected Total | 168 | 86.78498 |
Root MSE | 0.21573 | R-Square | 0.9142 |
---|---|---|---|
Dependent Mean | 0.63130 | Adj R-Sq | 0.9099 |
Coeff Var | 34.17294 |
Univariate Regression Table Based on the Usual Degrees of Freedom
Variable | DF | Coefficient | Type II Sum of Squares | Mean Square | F Value | Pr > F | Label |
---|---|---|---|---|---|---|---|
Intercept | 1 | -15.274649 | 57.1338 | 57.1338 | 1227.60 | <.0001 | Intercept |
Pspline.EqRatio_1 | 1 | 35.102914 | 62.7478 | 62.7478 | 1348.22 | <.0001 | Equivalence Ratio 1 |
Pspline.EqRatio_2 | 1 | -19.386468 | 64.6430 | 64.6430 | 1388.94 | <.0001 | Equivalence Ratio 2 |
Identity(CpRatio) | 1 | 0.032058 | 1.4445 | 1.4445 | 31.04 | <.0001 | Compression Ratio |
Opscore(Fuel) | 5 | 0.158388 | 5.5619 | 1.1124 | 23.90 | <.0001 | Fuel |
The LOG transformation computes the natural log. The PSPLINE expansion expands EqRatio into a linear term, EqRatio, and a squared term, EqRatio**2. An identity transformation of CpRatio and an optimal scoring of Fuel are requested. These should provide a good parametric operationalization of the optimal transformations. The final model has an R square of 0.91 (smaller than before because the model has fewer parameters, but still quite good).
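Because this model is an ordinary linear model in log(NOx), a conventional regression procedure offers a cross-check. The sketch below is an illustration, not part of the original analysis: it assumes that treating Fuel as a CLASS variable in PROC GLM spans the same 5-degree-of-freedom space as the optimal scoring, so the overall fit should in principle agree. The names GasLog and LogNOx are hypothetical.

```sas
* Sketch: refit the parametric model with PROC GLM as a cross-check;
data GasLog;                      * GasLog and LogNOx are hypothetical names;
   set sashelp.Gas;
   LogNOx = log(NOx);             * natural log, as in the LOG transformation;
run;

proc glm data=GasLog;
   class Fuel;
   model LogNOx = EqRatio EqRatio*EqRatio CpRatio Fuel;
run;
quit;
```

PROC GLM drops observations that have missing values in any model variable, which plays the role of the NOMISS a-option here. The individual Fuel coefficients are parameterized differently from the single Opscore(Fuel) coefficient, so only the overall fit statistics are expected to be directly comparable.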