The ENTROPY Procedure (Experimental)

Generalized Maximum Entropy

Reparameterization of the errors in a regression equation is the process of specifying a support for the errors, observation by observation. If a two-point support is used, the error for the tth observation is reparameterized by setting $e_{t} \:  = \:  w_{t1} \,  v_{t1} \:  + \:  w_{t2} \,  v_{t2}$, where $v_{t1}$ and $v_{t2}$ are the upper and lower bounds for the tth error $e_{t}$, and $w_{t1}$ and $w_{t2}$ represent the weights associated with the points $v_{t1}$ and $v_{t2}$. The error distribution is usually chosen to be symmetric, centered around zero, and the same across observations, so that $v_{t1} \:  = \:  -v_{t2} \:  = \:  R$, where R is the support value chosen for the problem (Golan, Judge, and Miller, 1996).

The generalized maximum entropy (GME) formulation was proposed for the ill-posed or underdetermined case, where the data are insufficient to estimate the model with traditional methods. $\beta $ is reparameterized by defining a support for $\beta $ (and a set of weights in the cross entropy case), which defines a prior distribution for $\beta $.

In the simplest case, each $\beta _ k$ is reparameterized as $\beta _ k \:  = \:  p_{k1} \,  z_{k1} \:  + \:  p_{k2} \,  z_{k2}$, where $p_{k1}$ and $p_{k2}$ are probabilities in [0,1] for each $\beta _ k$, and $z_{k1}$ and $z_{k2}$ represent the lower and upper bounds placed on $\beta _{k}$. The support points, $z_{k1}$ and $z_{k2}$, are usually distributed symmetrically around the most likely value for $\beta _{k}$ based on some prior knowledge.
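For example, a coefficient with assumed support points $z_{k1} = -10$ and $z_{k2} = 10$ and weights $p_{k1} = 0.4$ and $p_{k2} = 0.6$ (values chosen purely for illustration) is recovered as follows:

```python
# Two-point reparameterization of a single coefficient beta_k.
# Support points and weights here are assumed for illustration only.
z = (-10.0, 10.0)   # lower and upper support bounds z_k1, z_k2
p = (0.4, 0.6)      # weights p_k1, p_k2, which must sum to one

beta_k = p[0] * z[0] + p[1] * z[1]
print(beta_k)  # prints 2.0
```

Shifting weight between the two support points moves $\beta _ k$ anywhere in $[z_{k1}, z_{k2}]$; equal weights place it at the midpoint of the support.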

With these reparameterizations, the GME estimation problem is

\[  \mr {maximize} \;  \;  \;  H(p,w) \:  = \:  -p' \,  \ln (p) \:  - \:  w' \,  \ln (w)  \]
\[  \mr {subject\,  to} \;  \;  \;  y \:  = \:  X \,  Z \,  p \:  + \:  V \,  w  \]
\[  1_{K} \:  = \:  (I_{K} \,  \otimes \,  1_{L}') \,  p  \]
\[  1_{T} \:  = \:  (I_{T} \,  \otimes \,  1_{L}') \,  w  \]

where y denotes the column vector of length T of the dependent variable; $\mb {X}$ denotes the $(\mi {T} \times \mi {K} )$ matrix of observations of the independent variables; Z and V denote the matrices of support points for $\beta $ and e, respectively; p denotes the LK-dimensional column vector of weights associated with the points in Z; w denotes the LT-dimensional column vector of weights associated with the points in V; $1_{K}$, $1_{L}$, and $1_{T}$ are K-, L-, and T-dimensional column vectors, respectively, of ones; and $I_{K}$ and $I_{T}$ are $(\mi {K} \times \mi {K} )$ and $(\mi {T} \times \mi {T} )$ identity matrices.

These equations can be rewritten in summation notation as follows:

\[  \mr {maximize} \;  \;  \;  H(p,w) \:  = \:  - \,  \sum _{l=1}^{L} \,  \sum _{k=1}^{K} \:  p_{kl} \,  \ln (p_{kl}) \:  - \:  \sum _{l=1}^{L} \,  \sum _{t=1}^{T} \:  w_{tl} \,  \ln (w_{tl})  \]
\[  \mr {subject\,  to} \;  \;  \;  y_ t \:  = \:  \sum _{l=1}^{L} \left[ \,  \sum _{k=1}^{K} \:  \left( \,  X_{tk} \,  Z_{kl} \,  p_{kl} \right) \:  + \:  V_{tl} \,  w_{tl} \right]  \]
\[  \sum _{l=1}^{L} \:  p_{kl}\:  = \:  1 \;  \;  \mr {and} \;  \;  \sum _{l=1}^{L} \:  w_{tl} \:  = \:  1  \]

The subscript l denotes the support point (l=1, 2, ..., L), k denotes the parameter (k=1, 2, ..., K), and t denotes the observation (t=1, 2, ..., T).
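The summation form can be solved directly with a general-purpose nonlinear programming routine. The following sketch uses SciPy's SLSQP solver on a small simulated problem; the simulated data, the parameter support, and the three-standard-deviation error support rule are all assumptions made for this illustration, and this is not necessarily the algorithm the procedure itself uses.

```python
import numpy as np
from scipy.optimize import minimize

# Small assumed problem: T observations, K parameters, L = 2 support points.
rng = np.random.default_rng(1)
T, K, L = 10, 1, 2
X = rng.normal(size=(T, K))
y = X @ np.array([2.0]) + 0.5 * rng.normal(size=T)

Z = np.array([[-10.0, 10.0]])        # K x L parameter support points z_kl
R = 3.0 * y.std()                    # error support half-width (assumed rule)
V = np.tile([-R, R], (T, 1))         # T x L error support points v_tl

def split(x):
    # Unpack the stacked decision vector into p (K x L) and w (T x L).
    return x[:K * L].reshape(K, L), x[K * L:].reshape(T, L)

def neg_entropy(x):
    # Minimize -H(p, w) = sum p ln p + sum w ln w.
    x = np.clip(x, 1e-12, 1.0)
    return float(np.sum(x * np.log(x)))

def data_constraint(x):
    # y_t - sum_l [ sum_k X_tk z_kl p_kl + v_tl w_tl ] = 0 for every t.
    p, w = split(x)
    beta = (Z * p).sum(axis=1)       # beta_k = sum_l z_kl p_kl
    e = (V * w).sum(axis=1)          # e_t    = sum_l v_tl w_tl
    return y - X @ beta - e

cons = [
    {"type": "eq", "fun": data_constraint},
    {"type": "eq", "fun": lambda x: split(x)[0].sum(axis=1) - 1.0},  # p rows sum to 1
    {"type": "eq", "fun": lambda x: split(x)[1].sum(axis=1) - 1.0},  # w rows sum to 1
]
x0 = np.full(K * L + T * L, 1.0 / L)  # uniform (maximum-entropy) starting point
res = minimize(neg_entropy, x0, method="SLSQP",
               bounds=[(1e-9, 1.0)] * x0.size, constraints=cons)
p_hat, w_hat = split(res.x)
beta_hat = (Z * p_hat).sum(axis=1)    # point estimate recovered from the weights
```

Because the entropy objective pulls the estimated weights toward uniformity, the resulting estimate is shrunk toward the center of its support relative to a least squares fit.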

The GME objective is strictly concave; therefore, a unique solution exists. The optimal estimated probabilities, p and w, and the prior supports, Z and V, can be used to form the point estimates of the unknown parameters, $\beta $, and the unknown errors, e.
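Explicitly, the point estimates are formed from the reparameterizations given earlier:

\[  \hat{\beta }_{k} \:  = \:  \sum _{l=1}^{L} \:  z_{kl} \,  \hat{p}_{kl} \;  \;  \mr {and} \;  \;  \hat{e}_{t} \:  = \:  \sum _{l=1}^{L} \:  v_{tl} \,  \hat{w}_{tl}  \]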