Nonlinear Optimization Examples


Example 14.3 Compartmental Analysis

Numerical Considerations

An important class of nonlinear models involves a dynamic description of the response rather than an explicit description. These models arise often in chemical kinetics, pharmacokinetics, and ecological compartmental modeling. Two examples are presented in this section. Refer to Bates and Watts (1988) for a more general introduction to the topic.

In this class of problems, function evaluations, as well as gradient evaluations, are not done in full precision. Evaluating a function involves the numerical solution of a differential equation with some prescribed precision. Therefore, two choices exist for evaluating first- and second-order derivatives:

  • differential equation approach

  • finite-difference approach

In the differential equation approach, the components of the Hessian and the gradient are written as a solution of a system of differential equations that can be solved simultaneously with the original system. However, the size of a system of differential equations, n, would suddenly increase to $n^2+2n$. This huge increase makes the finite difference approach an easier one.

With the finite-difference approach, a very delicate balance of all the precision requirements of every routine must exist. In the examples that follow, notice the relative levels of precision that are imposed on different modules. Since finite differences are used to compute the first- and second-order derivatives, it is incorrect to set the precision of the ODE solver at a coarse level because that would render the numerical estimation of the finite differences worthless.

A coarse computation of the solution of the differential equation cannot be accompanied by very fine computation of the finite-difference estimates of the gradient and the Hessian. That is, you cannot set the precision of the differential equation solver to be 1E–4 and perform the finite difference estimation with a precision of 1E–10.In addition, this precision must be well-balanced with the termination criteria imposed on the optimization process.

In general, if the precision of the function evaluation is $O(\epsilon )$, the gradient should be computed by finite differences $O(\sqrt {\epsilon })$, and the Hessian should be computed with finite differences $O(\epsilon ^{\frac{1}{3}})$. [1]

Diffusion of Tetracycline

Consider the concentration of tetracycline hydrochloride in blood serum. The tetracycline is administered to a subject orally, and the concentration of the tetracycline in the serum is measured. The biological system to be modeled consists of two compartments: a gut compartment in which tetracycline is injected and a blood compartment that absorbs the tetracycline from the gut compartment for delivery to the body. Let $\gamma _1(t)$ and $\gamma _2(t)$ be the concentrations at time t in the gut and the serum, respectively. Let $\theta _1$ and $\theta _2$ be the transfer parameters. The model is depicted as follows:

Output 14.3.1: Model of Diffusion

Model of Diffusion


The rates of flow of the drug are described by the following pair of ordinary differential equations:

\begin{eqnarray*}  \frac{d\gamma _1(t)}{dt} &  = &  -\theta _1 \gamma _1(t) \\ \frac{d\gamma _2(t)}{dt} &  = &  \theta _1 \gamma _1(t) - \theta _2 \gamma _2(t) \end{eqnarray*}

The initial concentration of the tetracycline in the gut is unknown, and while the concentration in the blood can be measured at all times, initially it is assumed to be zero. Therefore, for the differential equation, the initial conditions are given by

\begin{eqnarray*}  \gamma _1(0) &  = &  \theta _3 \\ \gamma _2(0) &  = &  0 \end{eqnarray*}

Also, a nonnegativity constraint is imposed on the parameters $\theta _1$, $\theta _2$, and $\theta _3$, although for numerical purposes, you might need to use a small value instead of zero for these bounds (such as 1E–7).

Suppose $y_ i$ is the observed serum concentration at time $t_ i$. The parameters are estimated by minimizing the sum of squares of the differences between the observed and predicted serum concentrations:

\[  \sum _ i \left(y_ i - \gamma _2(t_ i)\right)^2  \]

The following IML program illustrates how to combine the NLPDD subroutine and the ODE subroutine to estimate the parameters $(\theta _1,\theta _2,\theta _3)$ of this model. The input data are the measurement time and the concentration of the tetracycline in the blood. For more information about the ODE call, see the section ODE Call.

   data tetra;
      input t c @@;
      datalines;
    1 0.7   2 1.2   3 1.4   4 1.4   6 1.1
    8 0.8  10 0.6  12 0.5  16 0.3
   ;

   proc iml;
      use tetra;
      read all into tetra;
      start f(theta) global(thmtrx,t,h,tetra,eps);
         thmtrx = ( -theta[1] || 0 )     //
                  (  theta[1] || -theta[2] );
         c = theta[3]//0 ;
         t = 0 // tetra[,1];
         call ode( r1, "der",c , t, h) j="jac" eps=eps;
         f = ssq((r1[2,])`-tetra[,2]);
         return(f);
      finish;

      start der(t,x) global(thmtrx);
         return( thmtrx*x );
      finish;

      start jac(t,x) global(thmtrx);
         return( thmtrx );
      finish;

      h      = {1.e-14 1. 1.e-5};
      opt    = {0 2 0 1 };
      tc     = repeat(.,1,12);
      tc[1]  = 100;
      tc[7]  = 1.e-8;
      par    = { 1.e-13 . 1.e-10 . . . . };
      con    = j(1,3,0.);
      itheta = { .1  .3  10};
      eps    = 1.e-11;

      call nlpdd(rc,rx,"f",itheta) blc=con opt=opt tc=tc par=par;

The output from the optimization process is shown in Output 14.3.2.

Output 14.3.2: Printed Output for Tetracycline Diffusion Problem

Optimization Start
Parameter Estimates
N Parameter Estimate Gradient
Objective
Function
Lower
Bound
Constraint
Upper
Bound
Constraint
1 X1 0.100000 76.484452 0 .
2 X2 0.300000 -48.149258 0 .
3 X3 10.000000 1.675423 0 .


Value of Objective Function = 4.1469872322


Double Dogleg Optimization


Dual Broyden - Fletcher - Goldfarb - Shanno Update (DBFGS)


Without Parameter Scaling


Gradient Computed by Finite Differences

Parameter Estimates 3
Lower Bounds 3
Upper Bounds 0

Optimization Start
Active Constraints 0 Objective Function 4.1469872322
Max Abs Gradient Element 76.484451645 Radius 1

Iteration   Restarts Function
Calls
Active
Constraints
  Objective
Function
Objective
Function
Change
Max Abs
Gradient
Element
Lambda Slope of
Search
Direction
1   0 5 0   3.11868 1.0283 124.2 67.146 -8.014
2   0 6 0   0.89559 2.2231 14.1614 1.885 -5.015
3   0 7 0   0.32352 0.5721 3.7162 1.187 -0.787
4   0 8 0   0.14619 0.1773 2.6684 0 -0.123
5   0 9 0   0.07462 0.0716 2.3061 0 -0.0571
6   0 10 0   0.06549 0.00913 1.5874 0 -0.0075
7   0 11 0   0.06416 0.00132 1.0928 0 -0.0010
8   0 12 0   0.06334 0.000823 0.5649 0 -0.0006
9   0 13 0   0.06288 0.000464 0.1213 1.024 -0.0004
10   0 14 0   0.06279 0.000092 0.0195 0.321 -0.0001
11   0 15 0   0.06276 0.000024 0.0240 0 -199E-7
12   0 16 0   0.06275 0.000015 0.0195 0 -875E-8
13   0 17 0   0.06269 0.000055 0.0281 0.323 -366E-7
14   0 18 0   0.06248 0.000209 0.0474 0.283 -0.0001
15   0 19 0   0.06190 0.000579 0.1213 0.704 -0.0006
16   0 20 0   0.06135 0.000552 0.1844 0.314 -0.0004
17   0 21 0   0.05956 0.00179 0.3314 0.217 -0.0012
18   0 22 0   0.05460 0.00496 0.9879 0 -0.0036
19   0 23 0   0.04999 0.00461 1.4575 0 -0.0029
20   0 24 0   0.04402 0.00597 1.8484 0 -0.0067
21   0 25 0   0.04007 0.00395 0.1421 0 -0.0053
22   0 26 0   0.03865 0.00142 0.3127 0 -0.0008
23   0 27 0   0.03755 0.00110 0.8395 0 -0.0019
24   0 28 0   0.03649 0.00106 0.2732 0 -0.0010
25   0 29 0   0.03603 0.000464 0.1380 0 -0.0003
26   0 30 0   0.03580 0.000226 0.1191 0.669 -0.0003
27   0 31 0   0.03571 0.000090 0.0103 0 -581E-7
28   0 32 0   0.03565 0.000056 0.00786 0 -334E-7
29   0 34 0   0.03565 4.888E-6 0.00967 1.752 -486E-7
30   0 35 0   0.03565 6.841E-7 0.000493 0 -244E-7
31   0 36 0   0.03565 2.417E-7 0.00270 0 -57E-9
32   0 42 0   0.03565 1.842E-9 0.00179 2.431 -13E-9
33   0 49 0   0.03565 1.08E-11 0.00212 786.7 -35E-12

Optimization Results
Iterations 33 Function Calls 50
Gradient Calls 35 Active Constraints 0
Objective Function 0.0356478102 Max Abs Gradient Element 0.0021167366
Slope of Search Direction -3.52366E-11 Radius 1

GCONV convergence criterion satisfied.


Note: At least one element of the (projected) gradient is greater than 1e-3.

Optimization Results
Parameter Estimates
N Parameter Estimate Gradient
Objective
Function
1 X1 0.182491 0.002117
2 X2 0.435865 -0.000501
3 X3 6.018219 0.000046711


Value of Objective Function = 0.0356478102



The differential equation model is linear, and in fact, it can be solved by using an eigenvalue decomposition (this is not always feasible without complex arithmetic). Alternately, the availability and the simplicity of the closed form representation of the solution enable you to replace the solution produced by the ODE routine with the simpler and faster analytical solution. Closed forms are not expected to be easily available for nonlinear systems of differential equations, which is why the preceding solution was introduced.

The closed form of the solution requires a change to the function $f(\cdot )$. The functions needed as arguments of the ODE routine, namely the der and jac modules, can be removed. Here is the revised code:

   start f(th) global(theta,tetra);
      theta = th;
      vv    = v(tetra[,1]);
      error = ssq(vv-tetra[,2]);
      return(error);
   finish;

   start v(t) global(theta);
      v = theta[3]*theta[1]/(theta[2]-theta[1])*
              (exp(-theta[1]*t)-exp(-theta[2]*t));
      return(v);
   finish;

   call nlpdd(rc,rx,"f",itheta) blc=con opt=opt tc=tc par=par;

The parameter estimates, which are shown in Output 14.3.3, are close to those obtained by the first solution.

Output 14.3.3: Second Set of Parameter Estimates for Tetracycline Diffusion

Optimization Results
Parameter Estimates
N Parameter Estimate Gradient
Objective
Function
1 X1 0.183024 0.000000111
2 X2 0.434484 -0.000000115
3 X3 5.995273 9.3860868E-9



Because of the nature of the closed form of the solution, you might want to add an additional constraint to guarantee that $\theta _2 \ne \theta _1$ at any time during the optimization. This prevents a possible division by 0 or a value near 0 in the execution of the $v(\cdot )$ function. For example, you might add the constraint

\begin{eqnarray*}  \theta _2-\theta _1 \ge 10^{-7} \end{eqnarray*}

Chemical Kinetics of Pyrolysis of Oil Shale

Pyrolysis is a chemical change effected by the action of heat, and this example considers the pyrolysis of oil shale described in Ziegel and Gorman (1980). Oil shale contains organic material that is bonded to the rock. To extract oil from the rock, heat is applied, and the organic material is decomposed into oil, bitumen, and other byproducts. The model is given by

\begin{eqnarray*}  \frac{d\gamma _1(t)}{dt} &  = &  -(\theta _1+\theta _4) \gamma _1(t) \iota (t,\theta _5)\\ \frac{d\gamma _2(t)}{dt} &  = &  [\theta _1 \gamma _1(t) - (\theta _2 + \theta _3) \gamma _2(t)] \iota (t,\theta _5) \\ \frac{d\gamma _3(t)}{dt} &  = &  [\theta _4 \gamma _1(t) + \theta _2 \gamma _2(t)] \iota (t,\theta _5) \end{eqnarray*}

with the initial conditions

\[  \gamma _1(t) = 100 , \quad \gamma _2(t) = 0 , \quad \gamma _3(t) = 0  \]

A dead time is assumed to exist in the process. That is, no change occurs up to time $\theta _5$. This is controlled by the indicator function $\iota (t,\theta _5)$, which is given by

\[  \iota (t,\theta _5) = \left\{  \begin{array}{ll} 0 &  \mbox{if } t < \theta _5 \\ 1 &  \mbox{if } t \ge \theta _5 \end{array} \right.  \]

where $\theta _5 \ge 0$. Only one of the cases in Ziegel and Gorman (1980) is analyzed in this report, but the others can be handled in a similar manner. The following IML program illustrates how to combine the NLPQN subroutine and the ODE subroutine to estimate the parameters $\theta _ i$ in this model:

   data oil ( drop=temp);
      input temp time bitumen oil;
      datalines;
   673     5      0.      0.
   673     7      2.2     0.
   673    10     11.5     0.7
   673    15     13.7     7.2
   673    20     15.1    11.5
   673    25     17.3    15.8
   673    30     17.3    20.9
   673    40     20.1    26.6
   673    50     20.1    32.4
   673    60     22.3    38.1
   673    80     20.9    43.2
   673   100     11.5    49.6
   673   120      6.5    51.8
   673   150      3.6    54.7
   ;

   proc iml;
      use oil;
      read all into a;

   /****************************************************************/
   /* The INS function inserts a value given by FROM into a vector */
   /* given by INTO, sorts the result, and posts the global matrix */
   /* that can be used to delete the effects of the point FROM.    */
   /****************************************************************/
      start ins(from,into) global(permm);
         in    =  into // from;
         x     =  i(nrow(in));
         permm = inv(x[rank(in),]);
         return(permm*in);
      finish;

      start der(t,x) global(thmtrx,thet);
         y     = thmtrx*x;
         if ( t <= thet[5] )  then y = 0*y;
         return(y);
      finish;

      start jac(t,x) global(thmtrx,thet);
         y     = thmtrx;
         if ( t <= thet[5] )  then y = 0*y;
         return(y);
      finish;

      start f(theta) global(thmtrx,thet,time,h,a,eps,permm);
         thet = theta;
         thmtrx = (-(theta[1]+theta[4]) ||         0            || 0 )//
                  (theta[1]             || -(theta[2]+theta[3]) || 0 )//
                  (theta[4]             || theta[2]             || 0 );
         t = ins( theta[5],time);
         c = { 100, 0, 0};
         call ode( r1, "der",c , t , h) j="jac" eps=eps;

      /* send the intermediate value to the last column */
         r = (c ||r1) * permm;
         m = r[2:3,(2:nrow(time))];
         mm = m`- a[,2:3];
         call qr(q,r,piv,lindep,mm);
         v = det(r);
         return(abs(v));
      finish;

      opt = {0 2 0 1 };
      tc = repeat(.,1,12);
      tc[1] = 100;
      tc[7] = 1.e-7;
      par = { 1.e-13 . 1.e-10 . . . .};
      con = j(1,5,0.);
      h = {1.e-14 1. 1.e-5};
      time = (0 // a[,1]);
      eps = 1.e-5;
      itheta = { 1.e-3 1.e-3 1.e-3 1.e-3 1.};

      call nlpqn(rc,rx,"f",itheta)  blc=con opt=opt tc=tc par=par;

The parameter estimates are shown in Output 14.3.4.

Output 14.3.4: Parameter Estimates for Oil Shale Pyrolysis

Optimization Results
Parameter Estimates
N Parameter Estimate Gradient
Objective
Function
1 X1 0.023086 1282509
2 X2 0.008127 696102
3 X3 0.013701 722960
4 X4 0.011014 592104
5 X5 1.000012 -8123.799232



Again, compare the solution using the approximation produced by the ODE subroutine to the solution obtained through the closed form of the given differential equation. Impose the following additional constraint to avoid a possible division by 0 when evaluating the function:

\begin{eqnarray*}  \theta _2 + \theta _3 - \theta _1 - \theta _4 \ge 10^{-7} \end{eqnarray*}

The closed form of the solution requires a change in the function $f(\cdot )$. The functions needed as arguments of the ODE routine, namely the der and jac modules, can be removed. Here is the revised code:

   start f(thet) global(time,a);
      do i = 1 to nrow(time);
         t   = time[i];
         v1  = 100;
         if ( t >= thet[5] ) then
            v1 = 100*ev(t,thet[1],thet[4],thet[5]);
         v2 = 0;
         if ( t >= thet[5] ) then
            v2 = 100*thet[1]/(thet[2]+thet[3]-thet[1]-thet[4])*
                  (ev(t,thet[1],thet[4],thet[5])-
                   ev(t,thet[2],thet[3],thet[5]));
         v3 = 0;
         if ( t >= thet[5] ) then
            v3  = 100*thet[4]/(thet[1]+thet[4])*
              (1. - ev(t,thet[1],thet[4],thet[5])) +
              100*thet[1]*thet[2]/(thet[2]+thet[3]-thet[1]-thet[4])*(
              (1.-ev(t,thet[1],thet[4],thet[5]))/(thet[1]+thet[4]) -
              (1.-ev(t,thet[2],thet[3],thet[5]))/(thet[2]+thet[3])    );
         y = y // (v1 || v2 || v3);
      end;
      mm = y[,2:3]-a[,2:3];
      call qr(q,r,piv,lindep,mm);
      v = det(r);
      return(abs(v));
   finish;

   start ev(t,a,b,c);
      return(exp(-(a+b)*(t-c)));
   finish;

   con     = { 0.  0.  0.  0.   .   .  . ,
                .   .   .   .   .    . . ,
               -1   1   1  -1   .   1  1.e-7 };
   time    =  a[,1];
   par     = { 1.e-13 . 1.e-10 . . . .};
   itheta  = { 1.e-3 1.e-3 1.e-2 1.e-3 1.};

   call nlpqn(rc,rx,"f",itheta)  blc=con opt=opt tc=tc par=par;

The parameter estimates are shown in Output 14.3.5.

Output 14.3.5: Second Set of Parameter Estimates for Oil Shale Pyrolysis

Optimization Results
Parameter Estimates
N Parameter Estimate Gradient
Objective
Function
1 X1 0.017178 -0.030690
2 X2 0.008912 0.070424
3 X3 0.020007 -0.010621
4 X4 0.010494 0.206102
5 X5 7.771458 -0.000062272





[1] In Release 6.09 and in later releases, you can specify the step size h in the finite-difference formulas.