Suppose you are interested in fitting a model that consists of two segments that connect in a smooth fashion. For example, the following model states that the mean of Y is a quadratic function in x for values of x less than and that the mean of Y is constant for values of x greater than :
In this model equation, , , and are the coefficients of the quadratic segment, and c is the plateau of the mean function. The HPNLMOD procedure can fit such a segmented model even when the join point, , is unknown.
Suppose you also want to impose conditions on the two segments of the model. First, the curve should be continuous—that is, the quadratic and the plateau section need to meet at . Second, the curve should be smooth—that is, the first derivative of the two segments with respect to x needs to coincide at .
The continuity condition requires that
The smoothness condition requires that
If you solve for and substitute into the expression for c, the two conditions jointly imply that
Although there are five unknowns, the model contains only three independent parameters. The continuity and smoothness restrictions together completely determine two parameters, given the other three.
The following DATA step creates the SAS data set for this example:
data a; input y x @@; datalines; .46 1 .47 2 .57 3 .61 4 .62 5 .68 6 .69 7 .78 8 .70 9 .74 10 .77 11 .78 12 .74 13 .80 13 .80 15 .78 16 ;
The following PROC HPNLMOD statements fit this segmented model:
proc hpnlmod data=a out=b; parms alpha=.45 beta=.05 gamma=-.0025; x0 = -.5*beta / gamma; if (x < x0) then yp = alpha + beta*x + gamma*x*x; else yp = alpha + beta*x0 + gamma*x0*x0; model y ~ residual(yp); estimate 'join point' -beta/2/gamma; estimate 'plateau value c' alpha - beta**2/(4*gamma); predict 'predicted' yp pred=yp; predict 'response' y pred=y; predict 'x' x pred=x; run;
The parameters of the model are , , and . They are represented in the PROC HPNLMOD statements by the variables alpha
, beta
, and gamma
, respectively. In order to model the two segments, a conditional statement assigns the appropriate expression to the mean
function, depending on the value of . The ESTIMATE statements compute the values of and c. The PREDICT
statement computes predicted values for plotting and saves them to data set b
.
The results from fitting this model are shown in Output 10.1.1 through Output 10.1.3. The iterative optimization converges after six iterations (Output 10.1.1). Output 10.1.2 shows the estimated parameters. Output 10.1.3 indicates that the join point is 12.7477 and the plateau value is 0.7775.
Output 10.1.1: Nonlinear Least Squares Iterative Phase
Quadratic Model with Plateau |
Iteration History | ||||
---|---|---|---|---|
Iteration | Evaluations | Objective Function |
Change | Max Gradient |
0 | 5 | 0.0035144531 | 7.184063 | |
1 | 2 | 0.0007352716 | 0.00277918 | 2.145337 |
2 | 2 | 0.0006292751 | 0.00010600 | 0.032551 |
3 | 2 | 0.0006291261 | 0.00000015 | 0.002952 |
4 | 2 | 0.0006291244 | 0.00000000 | 0.000238 |
5 | 2 | 0.0006291244 | 0.00000000 | 0.000023 |
6 | 2 | 0.0006291244 | 0.00000000 | 2.313E-6 |
The following statements produce a graph of the observed and predicted values along with reference lines for the join point and plateau estimates (Output 10.1.4):
proc sgplot data=b noautolegend; yaxis label='Observed or Predicted'; refline 0.7775 / axis=y label="Plateau" labelpos=min; refline 12.7477 / axis=x label="Join point" labelpos=min; scatter y=y x=x; series y=yp x=x; run;