The ENTROPY Procedure (Experimental)

Generalized Maximum Entropy

Reparameterization of the errors in a regression equation is the process of specifying a support for the errors, observation by observation. If a two-point support is used, the error for the tth observation is reparameterized by setting $e_{t} \:  = \:  w_{t1} \,  v_{t1} \:  + \:  w_{t2} \,  v_{t2}$, where $v_{t1}$ and $v_{t2}$ are the upper and lower bounds for the tth error $e_{t}$, and $w_{t1}$ and $w_{t2}$ represent the weights associated with the points $v_{t1}$ and $v_{t2}$. The error distribution is usually chosen to be symmetric, centered around zero, and the same across observations, so that $v_{t1} \:  = \:  -v_{t2} \:  = \:  R$, where R is the support value chosen for the problem (Golan, Judge, and Miller, 1996).

The generalized maximum entropy (GME) formulation was proposed for the ill-posed or underdetermined case, where the data are insufficient to estimate the model with traditional methods. $\beta $ is reparameterized by defining a support for $\beta $ (and a set of weights in the cross entropy case), which defines a prior distribution for $\beta $.

In the simplest case, each $\beta _ k$ is reparameterized as $\beta _ k \:  = \:  p_{k1} \,  z_{k1} \:  + \:  p_{k2} \,  z_{k2}$, where $p_{k1}$ and $p_{k2}$ are probabilities in [0,1] for each $\beta _ k$, and $z_{k1}$ and $z_{k2}$ represent the lower and upper bounds placed on $\beta _{k}$. The support points, $z_{k1}$ and $z_{k2}$, are usually distributed symmetrically around the most likely value for $\beta _{k}$ based on some prior knowledge.
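For example, a coefficient with assumed support points $z_{k1} = -10$ and $z_{k2} = 10$ and weights $p_{k1} = 0.4$ and $p_{k2} = 0.6$ (values chosen purely for illustration) is recovered as follows:

```python
# Two-point reparameterization of a single coefficient beta_k.
# Support points and weights here are assumed for illustration only.
z = (-10.0, 10.0)   # lower and upper support bounds z_k1, z_k2
p = (0.4, 0.6)      # weights p_k1, p_k2, which must sum to one

beta_k = p[0] * z[0] + p[1] * z[1]
print(beta_k)  # prints 2.0
```

Shifting weight between the two support points moves $\beta _ k$ anywhere in $[z_{k1}, z_{k2}]$; equal weights place it at the midpoint of the support.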

With these reparameterizations, the GME estimation problem is

\[  \mr {maximize} \;  \;  \;  H(p,w) \:  = \:  -p' \,  \ln (p) \:  - \:  w' \,  \ln (w)  \]
\[  \mr {subject\,  to} \;  \;  \;  y \:  = \:  X \,  Z \,  p \:  + \:  V \,  w  \]
\[  1_{K} \:  = \:  (I_{K} \,  \otimes \,  1_{L}') \,  p  \]
\[  1_{T} \:  = \:  (I_{T} \,  \otimes \,  1_{L}') \,  w  \]

where y denotes the column vector of length T of the dependent variable; $\mb {X}$ denotes the $(\mi {T} \times \mi {K} )$ matrix of observations of the independent variables; Z and V denote the matrices of support points for $\beta $ and e, respectively; p denotes the LK-dimensional column vector of weights associated with the points in Z; w denotes the LT-dimensional column vector of weights associated with the points in V; $1_{K}$, $1_{L}$, and $1_{T}$ are K-, L-, and T-dimensional column vectors, respectively, of ones; and $I_{K}$ and $I_{T}$ are $(\mi {K} \times \mi {K} )$ and $(\mi {T} \times \mi {T} )$ identity matrices.

These equations can be rewritten in summation notation as follows:

\[  \mr {maximize} \;  \;  \;  H(p,w) \:  = \:  - \,  \sum _{l=1}^{L} \,  \sum _{k=1}^{K} \:  p_{kl} \,  \ln (p_{kl}) \:  - \:  \sum _{l=1}^{L} \,  \sum _{t=1}^{T} \:  w_{tl} \,  \ln (w_{tl})  \]
\[  \mr {subject\,  to} \;  \;  \;  y_ t \:  = \:  \sum _{l=1}^{L} \left[ \,  \sum _{k=1}^{K} \:  \left( \,  X_{tk} \,  Z_{kl} \,  p_{kl} \right) \:  + \:  V_{tl} \,  w_{tl} \right]  \]
\[  \sum _{l=1}^{L} \:  p_{kl}\:  = \:  1 \;  \;  \mr {and} \;  \;  \sum _{l=1}^{L} \:  w_{tl} \:  = \:  1  \]

The subscript l denotes the support point (l=1, 2, ..., L), k denotes the parameter (k=1, 2, ..., K), and t denotes the observation (t=1, 2, ..., T).
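The summation form can be solved directly with a general-purpose nonlinear programming routine. The following sketch uses SciPy's SLSQP solver on a small simulated problem; the simulated data, the parameter support, and the three-standard-deviation error support rule are all assumptions made for this illustration, and this is not necessarily the algorithm the procedure itself uses.

```python
import numpy as np
from scipy.optimize import minimize

# Small assumed problem: T observations, K parameters, L = 2 support points.
rng = np.random.default_rng(1)
T, K, L = 10, 1, 2
X = rng.normal(size=(T, K))
y = X @ np.array([2.0]) + 0.5 * rng.normal(size=T)

Z = np.array([[-10.0, 10.0]])        # K x L parameter support points z_kl
R = 3.0 * y.std()                    # error support half-width (assumed rule)
V = np.tile([-R, R], (T, 1))         # T x L error support points v_tl

def split(x):
    # Unpack the stacked decision vector into p (K x L) and w (T x L).
    return x[:K * L].reshape(K, L), x[K * L:].reshape(T, L)

def neg_entropy(x):
    # Minimize -H(p, w) = sum p ln p + sum w ln w.
    x = np.clip(x, 1e-12, 1.0)
    return float(np.sum(x * np.log(x)))

def data_constraint(x):
    # y_t - sum_l [ sum_k X_tk z_kl p_kl + v_tl w_tl ] = 0 for every t.
    p, w = split(x)
    beta = (Z * p).sum(axis=1)       # beta_k = sum_l z_kl p_kl
    e = (V * w).sum(axis=1)          # e_t    = sum_l v_tl w_tl
    return y - X @ beta - e

cons = [
    {"type": "eq", "fun": data_constraint},
    {"type": "eq", "fun": lambda x: split(x)[0].sum(axis=1) - 1.0},  # p rows sum to 1
    {"type": "eq", "fun": lambda x: split(x)[1].sum(axis=1) - 1.0},  # w rows sum to 1
]
x0 = np.full(K * L + T * L, 1.0 / L)  # uniform (maximum-entropy) starting point
res = minimize(neg_entropy, x0, method="SLSQP",
               bounds=[(1e-9, 1.0)] * x0.size, constraints=cons)
p_hat, w_hat = split(res.x)
beta_hat = (Z * p_hat).sum(axis=1)    # point estimate recovered from the weights
```

Because the entropy objective pulls the estimated weights toward uniformity, the resulting estimate is shrunk toward the center of its support relative to a least squares fit.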

The GME objective is strictly concave; therefore, a unique solution exists. The optimal estimated probabilities, p and w, and the prior supports, Z and V, can be used to form the point estimates of the unknown parameters, $\beta $, and the unknown errors, e.
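Explicitly, the point estimates are formed from the reparameterizations given earlier:

\[  \hat{\beta }_{k} \:  = \:  \sum _{l=1}^{L} \:  z_{kl} \,  \hat{p}_{kl} \;  \;  \mr {and} \;  \;  \hat{e}_{t} \:  = \:  \sum _{l=1}^{L} \:  v_{tl} \,  \hat{w}_{tl}  \]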