The ENTROPY Procedure

Generalized Maximum Entropy

Reparameterization of the errors in a regression equation is the process of specifying a support for the errors, observation by observation. If a two-point support is used, the error for the *t*th observation is reparameterized by setting $e_t = w_{t1} v_{t1} + w_{t2} v_{t2}$, where $v_{t1}$ and $v_{t2}$ are the upper and lower bounds for the *t*th error $e_t$, and $w_{t1}$ and $w_{t2}$ represent the weights associated with the points $v_{t1}$ and $v_{t2}$. The error distribution is usually chosen to be symmetric, centered around zero, and the same across observations so that $v_{t1} = -v_{t2} = R$, where *R* is the support value chosen for the problem (Golan, Judge, and Miller 1996).
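A minimal sketch of the symmetric two-point case, assuming the support is $\{R, -R\}$ and the two weights sum to one. The function name `two_point_weights` and the numbers are illustrative only and are not part of PROC ENTROPY.

```python
def two_point_weights(e, R):
    """Return (w1, w2) with e = w1*R + w2*(-R) and w1 + w2 = 1.

    Assumes the symmetric support {R, -R} described in the text.
    """
    if R <= 0:
        raise ValueError("R must be positive")
    if abs(e) > R:
        raise ValueError("error lies outside the support [-R, R]")
    w1 = (e + R) / (2.0 * R)  # weight on the upper bound R
    return w1, 1.0 - w1       # remaining weight goes to the lower bound -R


# Hypothetical values: an error of 1.5 on a support of +/- 3.
w1, w2 = two_point_weights(e=1.5, R=3.0)
recovered = w1 * 3.0 + w2 * (-3.0)  # rebuilds the error from its weights
```

Because the weights must lie in [0, 1], only errors inside $[-R, R]$ can be represented, which is why the choice of *R* matters.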

The generalized maximum entropy (GME) formulation was proposed for the ill-posed or underdetermined case where the data are insufficient to estimate the model with traditional methods. The parameter vector $\beta$ is reparameterized by defining a support for $\beta$ (and a set of weights in the cross entropy case), which defines a prior distribution for $\beta$.

In the simplest case, each $\beta_k$ is reparameterized as $\beta_k = p_{k1} z_{k1} + p_{k2} z_{k2}$, where $p_{k1}$ and $p_{k2}$ represent the probabilities, each in [0,1], for each $\beta_k$, and $z_{k1}$ and $z_{k2}$ represent the lower and upper bounds placed on $\beta_k$. The support points, $z_{k1}$ and $z_{k2}$, are usually distributed symmetrically around the most likely value for $\beta_k$ based on some prior knowledge.
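As a worked illustration with made-up numbers (the support $\{-10, 10\}$ and the weights below are assumptions chosen for the arithmetic, not values from the source), a two-point support maps a pair of probabilities to a parameter value as follows:

```latex
\begin{aligned}
z_{k1} &= -10, \quad z_{k2} = 10, \quad p_{k1} = 0.4, \quad p_{k2} = 0.6 \\
\beta_k &= p_{k1} z_{k1} + p_{k2} z_{k2} = 0.4(-10) + 0.6(10) = 2 \\
p_{k1} &= p_{k2} = 0.5 \;\Longrightarrow\; \beta_k = 0
\end{aligned}
```

Equal weights return the center of the support, so a support centered on the most likely value pulls the estimate toward that value when the data carry little information.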

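With these reparameterizations, the GME estimation problem is

$$
\begin{aligned}
\text{maximize} \quad & H(p,w) = -p' \ln(p) - w' \ln(w) \\
\text{subject to} \quad & y = X Z p + V w \\
& 1_K = (I_K \otimes 1_L')\, p \\
& 1_T = (I_T \otimes 1_L')\, w
\end{aligned}
$$

where *y* denotes the column vector of length *T* of the dependent variable; *X* denotes the $(T \times K)$ matrix of observations of the independent variables; *p* denotes the *LK* column vector of weights associated with the points in *Z*; *w* denotes the *LT* column vector of weights associated with the points in *V*; $1_K$, $1_L$, and $1_T$ are *K*-, *L*-, and *T*-dimensional column vectors, respectively, of ones; and $I_K$ and $I_T$ are $(K \times K)$ and $(T \times T)$ dimensional identity matrices.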
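To make the Kronecker-product form of the adding-up constraints concrete, the following sketch writes out the matrices for a small case with $K = 2$ parameters and $L = 2$ support points. The block-diagonal layout of *Z* and the ordering of the elements of *p* are assumptions made here so that $Zp$ reproduces the two-point reparameterization of each parameter.

```latex
Z = \begin{pmatrix} z_{11} & z_{12} & 0 & 0 \\ 0 & 0 & z_{21} & z_{22} \end{pmatrix},
\qquad
p = \begin{pmatrix} p_{11} \\ p_{12} \\ p_{21} \\ p_{22} \end{pmatrix},
\qquad
I_2 \otimes 1_2' = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}

Zp = \begin{pmatrix} p_{11} z_{11} + p_{12} z_{12} \\ p_{21} z_{21} + p_{22} z_{22} \end{pmatrix}
   = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix},
\qquad
(I_2 \otimes 1_2')\, p = \begin{pmatrix} p_{11} + p_{12} \\ p_{21} + p_{22} \end{pmatrix}
   = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
```

Under the same assumed layout, *V* and *w* play the corresponding roles for the *T* errors.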

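These equations can be rewritten using set notation as follows:

$$
\begin{aligned}
\text{maximize} \quad & H(p,w) = -\sum_{l=1}^{L} \sum_{k=1}^{K} p_{kl} \ln(p_{kl}) - \sum_{l=1}^{L} \sum_{t=1}^{T} w_{tl} \ln(w_{tl}) \\
\text{subject to} \quad & y_t = \sum_{l=1}^{L} \left[ \sum_{k=1}^{K} X_{tk} Z_{kl}\, p_{kl} + V_{tl}\, w_{tl} \right] \\
& \sum_{l=1}^{L} p_{kl} = 1 \quad \text{and} \quad \sum_{l=1}^{L} w_{tl} = 1
\end{aligned}
$$

The subscript *l* denotes the support point ($l = 1, 2, \ldots, L$), *k* denotes the parameter ($k = 1, 2, \ldots, K$), and *t* denotes the observation ($t = 1, 2, \ldots, T$).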
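The summation form translates almost directly into code. The sketch below evaluates the entropy objective and checks the three constraints for a candidate pair of weight arrays. It assumes the supports and weights are stored as nested lists indexed `[k][l]` and `[t][l]`; the function names and the toy data are hypothetical, and the code only checks a candidate solution rather than solving the optimization.

```python
import math


def gme_entropy(p, w):
    """H(p, w) = -sum p*ln(p) - sum w*ln(w), treating 0*ln(0) as 0."""
    def neg_plogp(rows):
        return -sum(q * math.log(q) for row in rows for q in row if q > 0.0)
    return neg_plogp(p) + neg_plogp(w)


def constraints_hold(y, X, Z, V, p, w, tol=1e-9):
    """Check the data constraint and both adding-up constraints."""
    T, K = len(y), len(Z)
    for k in range(K):
        if abs(sum(p[k]) - 1.0) > tol:      # sum over l of p[k][l] = 1
            return False
    for t in range(T):
        if abs(sum(w[t]) - 1.0) > tol:      # sum over l of w[t][l] = 1
            return False
        signal = sum(
            X[t][k] * sum(z * q for z, q in zip(Z[k], p[k])) for k in range(K)
        )
        noise = sum(v * q for v, q in zip(V[t], w[t]))
        if abs(y[t] - (signal + noise)) > tol:  # y_t = signal + noise
            return False
    return True


# Toy data (assumed): T = 2 observations, K = 1 parameter, L = 2 points.
X = [[1.0], [2.0]]
Z = [[-10.0, 10.0]]
V = [[-3.0, 3.0], [-3.0, 3.0]]
p = [[0.4, 0.6]]                   # implies beta = 2
w = [[0.5, 0.5], [0.25, 0.75]]     # implies errors 0 and 1.5
y = [2.0, 5.5]

feasible = constraints_hold(y, X, Z, V, p, w)
H = gme_entropy(p, w)
```

With uniform weights everywhere, the objective reaches its upper bound of $(K + T) \ln L$; the data constraint is what moves the solution away from that point.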

The GME objective is strictly concave; therefore, a unique solution exists. The optimal estimated probabilities, *p* and *w*, and the prior supports, *Z* and *V*, can be used to form the point estimates of the unknown parameters, $\beta$, and the unknown errors, *e*.
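A minimal sketch of that last step, assuming each point estimate is the probability-weighted average of its support points, as the reparameterization implies. The function name `point_estimates`, the three-point supports, and the weights are hypothetical values for illustration, not output from PROC ENTROPY.

```python
def point_estimates(Z, p, V, w):
    """Form estimates as probability-weighted sums of the support points.

    Z[k] and p[k] hold the L support points and weights for parameter k;
    V[t] and w[t] hold the L support points and weights for error t.
    """
    beta = [sum(z * q for z, q in zip(zk, pk)) for zk, pk in zip(Z, p)]
    e = [sum(v * q for v, q in zip(vt, wt)) for vt, wt in zip(V, w)]
    return beta, e


# Hypothetical weights on three-point supports (L = 3, K = 1, T = 2).
beta_hat, e_hat = point_estimates(
    Z=[[-10.0, 0.0, 10.0]],
    p=[[0.2, 0.3, 0.5]],
    V=[[-3.0, 0.0, 3.0], [-3.0, 0.0, 3.0]],
    w=[[0.25, 0.5, 0.25], [0.125, 0.25, 0.625]],
)
```

The same function covers any number of support points, since the two-point case is just `L = 2`.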

Note: This procedure is experimental.

Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.