Nonlinear Optimization Examples


Example 14.7 A Two-Equation Maximum Likelihood Problem

The following example and notation are taken from Bard (1974). A two-equation model is used to fit U.S. production data for the years 1909–1949, where $z_1$ is capital input, $z_2$ is labor input, $z_3$ is real output, $z_4$ is time in years (with 1929 as the origin), and $z_5$ is the ratio of price of capital services to wage scale. The data can be entered by using the following statements:

   proc iml;
      z={ 1.33135 0.64629 0.4026 -20 0.24447,
          1.39235 0.66302 0.4084 -19 0.23454,
          1.41640 0.65272 0.4223 -18 0.23206,
          1.48773 0.67318 0.4389 -17 0.22291,
          1.51015 0.67720 0.4605 -16 0.22487,
          1.43385 0.65175 0.4445 -15 0.21879,
          1.48188 0.65570 0.4387 -14 0.23203,
          1.67115 0.71417 0.4999 -13 0.23828,
          1.71327 0.77524 0.5264 -12 0.26571,
          1.76412 0.79465 0.5793 -11 0.23410,
          1.76869 0.71607 0.5492 -10 0.22181,
          1.80776 0.70068 0.5052  -9 0.18157,
          1.54947 0.60764 0.4679  -8 0.22931,
          1.66933 0.67041 0.5283  -7 0.20595,
          1.93377 0.74091 0.5994  -6 0.19472,
          1.95460 0.71336 0.5964  -5 0.17981,
          2.11198 0.75159 0.6554  -4 0.18010,
          2.26266 0.78838 0.6851  -3 0.16933,
          2.33228 0.79600 0.6933  -2 0.16279,
          2.43980 0.80788 0.7061  -1 0.16906,
          2.58714 0.84547 0.7567   0 0.16239,
          2.54865 0.77232 0.6796   1 0.16103,
          2.26042 0.67880 0.6136   2 0.14456,
          1.91974 0.58529 0.5145   3 0.20079,
          1.80000 0.58065 0.5046   4 0.18307,
          1.86020 0.62007 0.5711   5 0.18352,
          1.88201 0.65575 0.6184   6 0.18847,
          1.97018 0.72433 0.7113   7 0.20415,
          2.08232 0.76838 0.7461   8 0.18847,
          1.94062 0.69806 0.6981   9 0.17800,
          1.98646 0.74679 0.7722  10 0.19979,
          2.07987 0.79083 0.8557  11 0.21115,
          2.28232 0.88462 0.9925  12 0.23453,
          2.52779 0.95750 1.0877  13 0.20937,
          2.62747 1.00285 1.1834  14 0.19843,
          2.61235 0.99329 1.2565  15 0.18898,
          2.52320 0.94857 1.2293  16 0.17203,
          2.44632 0.97853 1.1889  17 0.18140,
          2.56478 1.02591 1.2249  18 0.19431,
          2.64588 1.03760 1.2669  19 0.19492,
          2.69105 0.99669 1.2708  20 0.17912 };

The two-equation model in five parameters $c_1,\ldots ,c_5$ is

\begin{eqnarray*}  g_1 &  = &  c_1 10^{c_2 z_4} \left[ c_5 z_1^{-c_4} + (1-c_5) z_2^{-c_4} \right]^{-c_3/c_4} - z_3 = 0 \\ g_2 &  = &  \frac{c_5}{1-c_5} \left( \frac{z_1}{z_2} \right)^{-1-c_4} - z_5 = 0 \end{eqnarray*}

where the variables $z_1$ and $z_2$ are considered dependent (endogenous) and the variables $z_3$, $z_4$, and $z_5$ are considered independent (exogenous).

Differentiation of the two equations $g_1$ and $g_2$ with respect to the endogenous variables $z_1$ and $z_2$ yields the Jacobian matrix $\partial g_i / \partial z_j$ for $i=1,2$ and $j=1,2$, where $i$ corresponds to rows (equations) and $j$ corresponds to endogenous variables (refer to Bard (1974)). There are parameter sets for which elements of the Jacobian or the logarithm of its determinant cannot be computed; in such cases, the function module must return a missing value.
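
Carrying out this differentiation gives the four Jacobian entries that the module computes as j11, j12, j21, and j22:

\begin{eqnarray*} \frac{\partial g_1}{\partial z_1} &  = &  -\frac{c_3}{c_4} \, c_1 10^{c_2 z_4} (-c_4) \, c_5 \, z_1^{-c_4-1} \left[ c_5 z_1^{-c_4} + (1-c_5) z_2^{-c_4} \right]^{-c_3/c_4 - 1} \\ \frac{\partial g_1}{\partial z_2} &  = &  -\frac{c_3}{c_4} \, c_1 10^{c_2 z_4} (-c_4) \, (1-c_5) \, z_2^{-c_4-1} \left[ c_5 z_1^{-c_4} + (1-c_5) z_2^{-c_4} \right]^{-c_3/c_4 - 1} \\ \frac{\partial g_2}{\partial z_1} &  = &  (-1-c_4) \, \frac{c_5}{1-c_5} \, z_1^{-2-c_4} \, z_2^{1+c_4} \\ \frac{\partial g_2}{\partial z_2} &  = &  (1+c_4) \, \frac{c_5}{1-c_5} \, z_1^{-1-c_4} \, z_2^{c_4} \end{eqnarray*}

The following statements define the first part of the function module, which accumulates the logarithm of the absolute value of the Jacobian determinant over the 41 observations: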

   start fiml(pr) global(z);
      c1 = pr[1]; c2 = pr[2]; c3 = pr[3]; c4 = pr[4]; c5 = pr[5];
     /* 1. Compute Jacobian */
      lndet = 0 ;
      do t= 1 to 41;
         j11 = (-c3/c4) * c1 * 10 ##(c2 * z[t,4]) * (-c4) * c5 *
               z[t,1]##(-c4-1) * (c5 * z[t,1]##(-c4) + (1-c5) *
               z[t,2]##(-c4))##(-c3/c4 -1);
         j12 = (-c3/c4) * (-c4) * c1 * 10 ##(c2 * z[t,4]) * (1-c5) *
               z[t,2]##(-c4-1) * (c5 * z[t,1]##(-c4) + (1-c5) *
               z[t,2]##(-c4))##(-c3/c4 -1);
         j21 = (-1-c4)*(c5/(1-c5))*z[t,1]##( -2-c4)/ (z[t,2]##(-1-c4));
         j22 = (1+c4)*(c5/(1-c5))*z[t,1]##( -1-c4)/ (z[t,2]##(-c4));

         j = (j11 || j12 ) // (j21 || j22) ;
         if any(j = .) then detj = 0.;
            else detj = det(j);
         if abs(detj) < 1.e-30 then do;
            print t detj j;
            return(.);
         end;
         lndet = lndet + log(abs(detj));
      end;

Assuming that the residuals of the two equations are normally distributed, the likelihood is then computed as in Bard (1974). The following code computes the logarithm of the likelihood function:

     /* 2. Compute Sigma */
      sb = j(2,2,0.);
      do t= 1 to 41;
         eq_g1 = c1 * 10##(c2 * z[t,4]) * (c5*z[t,1]##(-c4)
                 + (1-c5)*z[t,2]##(-c4))##(-c3/c4) - z[t,3];
         eq_g2 = (c5/(1-c5)) * (z[t,1] / z[t,2])##(-1-c4) - z[t,5];
         resid = eq_g1 // eq_g2;
         sb = sb + resid * resid`;
      end;
      sb = sb / 41;
     /* 3. Compute log L */
      const = 41. * (log(2 * 3.1415) + 1.);
      lnds = 0.5 * 41 * log(det(sb));
      logl = const - lndet + lnds;
      return(logl);
   finish fiml;
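
In this formulation, the value returned by the module (with $\pi$ approximated by 3.1415 in the code) is

\begin{eqnarray*} F(c) &  = &  41 \left( \log (2\pi ) + 1 \right) - \sum_{t=1}^{41} \log \left| \det J_t \right| + \frac{41}{2} \log \det \hat{\Sigma} \end{eqnarray*}

where $J_t$ is the Jacobian at observation $t$ and $\hat{\Sigma} = \frac{1}{41} \sum_{t=1}^{41} r_t r_t^{\prime}$ is the sample covariance matrix of the residual vectors $r_t = (g_1, g_2)^{\prime}$. Under the normality assumption, $F$ is the negative of the concentrated log-likelihood function, so minimizing $F$ (the default behavior of the NLP subroutines when the first element of the options vector is 0) maximizes the likelihood.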

For an unrestricted parameter set, the power and log functions in the module cannot always be evaluated. As a result, optimization algorithms that use a line search fail more often than algorithms that restrict the search area. For that reason, the following statements use the NLPDD subroutine, which implements a double-dogleg trust-region method, to maximize the log-likelihood function:

   pr = j(1,5,0.001);
   optn = {0 2};
   tc = {. . . 0};
   call nlpdd(rc, xr,"fiml", pr, optn,,tc);
   print "Start" pr, "RC=" rc, "Opt Par" xr;

Part of the iteration history is shown in Output 14.7.1.

Output 14.7.1: Iteration History for Two-Equation ML Problem

Optimization Start

Active Constraints                      0    Objective Function      909.72691311
Max Abs Gradient Element     41115.729089    Radius                             1

                                                  Objective     Max Abs                Slope of
                Function       Active  Objective   Function    Gradient                  Search
 Iter  Restarts    Calls  Constraints   Function     Change     Element    Lambda     Direction
    1         0        2            0   85.24836      824.5      3676.4     711.8        -71032
    2         0        7            0   45.14682    40.1015      3382.0    2881.2       -29.683
    3         0       10            0   43.46797     1.6788       208.4    95.020        -3.348
    4         0       17            0  -32.75897    76.2269      1067.1     0.487       -39.170
    5         0       20            0  -56.10954    23.3506      2152.7     1.998       -44.198
    6         0       24            0  -56.95094     0.8414      1210.7    32.319        -1.661
    7         0       25            0  -57.62221     0.6713      1232.8     2.411        -0.676
    8         0       26            0  -58.44202     0.8198      1495.4     1.988        -0.813
    9         0       27            0  -59.49594     1.0539      1218.7     1.975        -1.048
   10         0       28            0  -61.01458     1.5186      1471.5     1.949        -1.518
   11         0       31            0  -67.84243     6.8279      1084.9     1.068       -14.681
   12         0       33            0  -69.86837     2.0259      2082.9     1.676        -7.128
   13         0       34            0  -72.09161     2.2232      1066.7     0.693        -2.150
   14         0       35            0  -73.68548     1.5939       731.5         0        -1.253
   15         0       36            0  -75.52558     1.8401       712.3         0        -1.007
   16         0       38            0  -89.36440    13.8388      5158.5         0        -9.584
   17         0       39            0 -100.86235    11.4979      1196.1     0.615       -11.029
   18         0       40            0 -104.82074     3.9584      3196.1     0.439        -7.469
   19         0       41            0 -108.52445     3.7037       615.0         0        -2.828
   20         0       42            0 -110.16227     1.6378       937.6     0.279        -1.436
   21         0       43            0 -110.69116     0.5289      1210.6     0.242        -0.692
   22         0       44            0 -110.74189     0.0507       406.1         0       -0.0598
   23         0       45            0 -110.75511     0.0132       188.3         0       -0.0087
   24         0       46            0 -110.76825     0.0131     19.1583     0.425       -0.0110
   25         0       47            0 -110.77808    0.00983     62.9598         0       -0.0090
   26         0       48            0 -110.77855   0.000461     27.8623         0       -0.0004
   27         0       49            0 -110.77858   0.000036      1.1824         0       -341E-7
   28         0       50            0 -110.77858   4.357E-7      0.0298         0       -326E-9
   29         0       51            0 -110.77858   2.019E-8      0.0408         0       -91E-10
   30         0       52            0 -110.77858   8.25E-10    0.000315         0       -61E-12
   31         0       55            0 -110.77858   1.11E-12    0.000138     1.950       -55E-15
   32         0       56            0 -110.77858   1.11E-12    0.000656     1.997       -15E-14
   32         0       57            0 -110.77858          0    0.000656     1.170        -2E-13

Optimization Results

Iterations                            32    Function Calls                      58
Gradient Calls                        34    Active Constraints                   0
Objective Function          -110.7785811    Max Abs Gradient Element  0.0006560833
Slope of Search Direction   -2.03416E-13    Radius                    8.4959355E-9



The results are very close to those reported by Bard (1974). Bard also reports different approaches to the same problem that can lead to very different MLEs.

Output 14.7.2: Parameter Estimates

Optimization Results
Parameter Estimates

                                    Gradient
                                   Objective
  N   Parameter      Estimate       Function
  1   X1             0.583884    0.000028901
  2   X2             0.005882       0.000656
  3   X3             1.362817    0.000008072
  4   X4             0.475091   -0.000023275
  5   X5             0.447072       0.000153
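
As a quick check (not part of the original example), you can call the fiml module directly at the estimates reported in Output 14.7.2; the returned objective value should be close to the final value $-110.7785811$ shown in Output 14.7.1.

   /* Evaluate the objective at the reported estimates */
   chat = {0.583884 0.005882 1.362817 0.475091 0.447072};
   fhat = fiml(chat);
   print fhat;    /* expected to be near -110.7785811 */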