The ENTROPY Procedure (Experimental)

Generalized Maximum Entropy

Reparameterization of the errors in a regression equation is the process of specifying a support for the errors, observation by observation. If a two-point support is used, the error for the tth observation is reparameterized by setting $e_{t} \:  = \:  w_{t1} \,  v_{t1} \:  + \:  w_{t2} \,  v_{t2}$, where $v_{t1}$ and $v_{t2}$ are the upper and lower bounds for the tth error $e_{t}$, and $w_{t1}$ and $w_{t2}$ are the weights associated with the points $v_{t1}$ and $v_{t2}$. The error distribution is usually chosen to be symmetric, centered around zero, and the same across observations, so that $v_{t1} \:  = \:  -v_{t2} \:  = \:  R$, where R is the support value chosen for the problem (Golan, Judge, and Miller 1996).
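The two-point case can be sketched numerically in a few lines. In the snippet below, the support value R = 3 and the weights are hypothetical choices for illustration:

```python
# Two-point error support: e_t = w_t1 * v_t1 + w_t2 * v_t2,
# with the symmetric choice v_t1 = -v_t2 = R (R = 3 is hypothetical).
R = 3.0
v = (R, -R)        # upper and lower support points
w = (0.6, 0.4)     # weights; must be nonnegative and sum to one
e_t = w[0] * v[0] + w[1] * v[1]   # recovered error for observation t
```

Any error value in [-R, R] can be represented this way by a suitable choice of the two weights; equal weights of 0.5 give an error of zero.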

The generalized maximum entropy (GME) formulation was proposed for the ill-posed or underdetermined case, where the data are insufficient to estimate the model with traditional methods. $\beta $ is reparameterized by defining a support for $\beta $ (and a set of weights in the cross entropy case), which defines a prior distribution for $\beta $.

In the simplest case, each $\beta _ k$ is reparameterized as $\beta _ k \:  = \:  p_{k1} \,  z_{k1} \:  + \:  p_{k2} \,  z_{k2}$, where $p_{k1}$ and $p_{k2}$ are probabilities in [0,1] for each $\beta _{k}$, and $z_{k1}$ and $z_{k2}$ are the lower and upper bounds placed on $\beta _{k}$. The support points $z_{k1}$ and $z_{k2}$ are usually distributed symmetrically around the most likely value for $\beta _{k}$, based on some prior knowledge.
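A minimal numeric sketch of this parameter reparameterization follows; the support center, half-width, and probabilities are all hypothetical:

```python
# Two-point parameter support: beta_k = p_k1 * z_k1 + p_k2 * z_k2.
# The support is centered on a prior guess b0 (b0 = 2 and half-width 10
# are hypothetical choices).
b0, half_width = 2.0, 10.0
z = (b0 - half_width, b0 + half_width)   # lower and upper support points
p = (0.35, 0.65)                         # probabilities; sum to one
beta_k = p[0] * z[0] + p[1] * z[1]       # implied point estimate
```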

With these reparameterizations, the GME estimation problem is

\begin{eqnarray*} \mr{maximize} & H(p,w) \: = \: -p' \, \ln (p) \: - \: w' \, \ln (w) \\ \mr{subject\, to} & y \: = \: X \, Z \, p \: + \: V \, w \\ & 1_{K} \: = \: (I_{K} \, \otimes \, 1_{L}') \, p \\ & 1_{T} \: = \: (I_{T} \, \otimes \, 1_{L}') \, w \end{eqnarray*}

where y denotes the column vector of length T of the dependent variable; $\mb{X}$ denotes the $(\mi{T} \times \mi{K} )$ matrix of observations of the independent variables; p denotes the LK-dimensional column vector of weights associated with the support points in Z; w denotes the LT-dimensional column vector of weights associated with the support points in V; $1_{K}$, $1_{L}$, and $1_{T}$ are K-, L-, and T-dimensional column vectors of ones, respectively; and $I_{K}$ and $I_{T}$ are $(\mi{K} \times \mi{K} )$ and $(\mi{T} \times \mi{T} )$ identity matrices.
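The Kronecker structure of these constraints can be verified numerically. The sketch below, with hypothetical small dimensions, supports, and weights, builds the block-diagonal support matrices Z and V and checks the adding-up constraints:

```python
import numpy as np

K, L, T = 2, 2, 3             # small dimensions for illustration
z = np.array([-5.0, 5.0])     # common two-point parameter support
v = np.array([-3.0, 3.0])     # common two-point error support

# Z (K x LK) and V (T x LT) are block-diagonal, with one support row
# per parameter / observation, matching the stacking of p and w.
Z = np.kron(np.eye(K), z.reshape(1, L))
V = np.kron(np.eye(T), v.reshape(1, L))

p = np.array([0.3, 0.7, 0.5, 0.5])   # stacked weights for K = 2 parameters
w = np.full(T * L, 1.0 / L)          # uniform error weights

beta = Z @ p   # beta_k = sum_l z_kl p_kl
e = V @ w      # e_t = sum_l v_tl w_tl

# Adding-up constraints: (I_K kron 1_L') p = 1_K and (I_T kron 1_L') w = 1_T
ones_L = np.ones(L)
assert np.allclose(np.kron(np.eye(K), ones_L) @ p, 1.0)
assert np.allclose(np.kron(np.eye(T), ones_L) @ w, 1.0)
```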

These equations can be rewritten using set notation as follows:

\[ \mr{maximize} \; \; \; H(p,w) \: = \: - \, \sum _{l=1}^{L} \, \sum _{k=1}^{K} \: p_{kl} \, \ln (p_{kl}) \: - \: \sum _{l=1}^{L} \, \sum _{t=1}^{T} \: w_{tl} \, \ln (w_{tl}) \]
\[ \mr{subject\, to} \; \; \; y_ t \: = \: \sum _{l=1}^{L} \left[ \, \sum _{k=1}^{K} \: \left( \, X_{kt} \, Z_{kl} \, p_{kl} \right) \: + \: V_{tl} \, w_{tl} \right] \]
\[ \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \; \sum _{l=1}^{L} \: p_{kl}\: = \: 1 \; \; \mr{and} \; \; \sum _{l=1}^{L} \: w_{tl} \: = \: 1 \]

The subscript l denotes the support point (l=1, 2, ..., L), k denotes the parameter (k=1, 2, ..., K), and t denotes the observation (t=1, 2, ..., T).

Because the GME objective is strictly concave, a unique solution exists. The optimal estimated probabilities, p and w, combined with the prior supports, Z and V, form the point estimates of the unknown parameters, $\beta $, and the unknown errors, e.
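As an illustration only, and not how PROC ENTROPY solves the problem internally, this optimization can be set up with an off-the-shelf nonlinear programming routine. The sketch below uses NumPy and SciPy with simulated data; the supports, the true coefficients, and the choice of SLSQP are all hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T, K, L = 20, 2, 2

# Simulated data (hypothetical): y = X beta + noise, true beta = (1.0, -0.5)
X = rng.normal(size=(T, K))
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + 0.1 * rng.normal(size=T)

# Symmetric two-point supports, identical across k and t (an assumption):
z = np.array([-5.0, 5.0])   # parameter support, z_k1 = -z_k2
v = np.array([-3.0, 3.0])   # error support, v_t1 = -v_t2 = R with R = 3

def unpack(x):
    # x stacks the K*L parameter weights p above the T*L error weights w
    return x[:K * L].reshape(K, L), x[K * L:].reshape(T, L)

def neg_entropy(x):
    # minimize -H(p, w) = p' ln(p) + w' ln(w)
    x = np.clip(x, 1e-10, 1.0)
    return float(np.sum(x * np.log(x)))

def data_residual(x):
    # data constraint: y_t = sum_k X_tk beta_k + e_t
    p, w = unpack(x)
    return y - (X @ (p @ z) + w @ v)

def adding_up(x):
    # each row of p and each row of w must sum to one
    p, w = unpack(x)
    return np.concatenate([p.sum(axis=1) - 1.0, w.sum(axis=1) - 1.0])

x0 = np.full(K * L + T * L, 1.0 / L)   # uniform weights satisfy adding-up
res = minimize(neg_entropy, x0, method="SLSQP",
               bounds=[(1e-10, 1.0)] * x0.size,
               constraints=[{"type": "eq", "fun": data_residual},
                            {"type": "eq", "fun": adding_up}],
               options={"maxiter": 500})

p_hat, w_hat = unpack(res.x)
beta_hat = p_hat @ z                   # point estimates of beta
e_hat = w_hat @ v                      # point estimates of the errors
```

Because the objective is strictly concave in (p, w) and the constraints are linear, the solution is unique; the recovered estimates should land near the true coefficients, shrunk somewhat toward the center of the parameter support.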