The HPGENSELECT Procedure

Response Probability Distribution Functions

Binary Distribution

\begin{eqnarray*}  f(y) &  = &  \left\{  \begin{array}{ll} p & \mbox{for } y=1 \\ 1-p & \mbox{for } y=0 \\ \end{array} \right. \\ \mr {E}(Y) &  = &  p \\ \mr {Var}(Y) &  = &  p(1-p) \\ \end{eqnarray*}

Binomial Distribution

\begin{eqnarray*}  f(y) &  = &  {\left( \begin{array}{c}n \cr r\end{array}\right) } \mu ^ r (1-\mu )^{n-r}~ ~ ~  \mbox{for } y=\frac{r}{n}, ~  r=0,1, 2,\ldots ,n \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \frac{\mu (1-\mu )}{n} \\ \end{eqnarray*}

Gamma Distribution

\begin{eqnarray*}  f(y) &  = &  \frac{1}{\Gamma (\nu )y} \left( \frac{y\nu }{\mu } \right)^{\nu } \exp \left(-\frac{y \nu }{\mu } \right)~ ~ ~  \mbox{for } 0 < y < \infty \\ \phi &  = &  \frac{1}{\nu } \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \frac{\mu ^2}{\nu } \\ \end{eqnarray*}

For the gamma distribution, $\nu =\frac{1}{\phi }$ is the estimated dispersion parameter that is displayed in the output. The parameter $\nu $ is also sometimes called the gamma index parameter.
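
As a rough numerical check of this parameterization, the following Python sketch evaluates the gamma density on the log scale and estimates its total mass, mean, and variance with a midpoint rule. The names `gamma_pdf` and `moments`, the example values $\mu = 3$ and $\nu = 4$, and the integration limit are illustrative assumptions; none of them are part of the HPGENSELECT procedure.

```python
import math

def gamma_pdf(y, mu, nu):
    """Gamma density in the (mu, nu) parameterization shown above."""
    if y <= 0:
        return 0.0
    z = y * nu / mu
    # Work on the log scale to avoid overflow in z**nu and Gamma(nu).
    log_f = nu * math.log(z) - z - math.lgamma(nu) - math.log(y)
    return math.exp(log_f)

def moments(pdf, upper, steps=200_000):
    """Midpoint-rule estimates of mass, mean, and variance on (0, upper)."""
    h = upper / steps
    m0 = m1 = m2 = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        f = pdf(y)
        m0 += f * h
        m1 += y * f * h
        m2 += y * y * f * h
    return m0, m1, m2 - m1 * m1

# Example values chosen only for illustration.
mu, nu = 3.0, 4.0
mass, mean, var = moments(lambda y: gamma_pdf(y, mu, nu), upper=60.0)
```

With these values, `mass` should be close to 1, `mean` close to $\mu = 3$, and `var` close to $\mu^2/\nu = 2.25$. In this parameterization $\nu$ plays the role of a shape parameter and $\mu/\nu$ the role of a scale parameter.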

Inverse Gaussian Distribution

\begin{eqnarray*}  f(y) &  = &  \frac{1}{\sqrt {2\pi y^3} \sigma } \exp \left[ -\frac{1}{2y} \left( \frac{y-\mu }{\mu \sigma } \right)^2 \right]~ ~ ~  \mbox{for } 0 < y < \infty \\ \phi &  = &  \sigma ^2 \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \phi \mu ^3 \\ \end{eqnarray*}

Multinomial Distribution

\begin{eqnarray*}  f(y_1, y_2,\cdots ,y_ k) &  = &  \frac{m!}{y_1! y_2! \cdots y_ k!}p_1^{y_1} p_2^{y_2} \cdots p_ k^{y_ k} \\ \end{eqnarray*}

Negative Binomial Distribution

\begin{eqnarray*}  f(y) &  = &  \frac{\Gamma (y+1/k)}{\Gamma (y+1)\Gamma (1/k)} \frac{(k\mu )^ y}{(1+k\mu )^{y+1/k}}~ ~ ~  \mbox{for } y = 0,1,2,\ldots \\ \phi &  = &  k \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \mu + \phi \mu ^2 \\ \end{eqnarray*}

For the negative binomial distribution, $k$ is the estimated dispersion parameter that is displayed in the output.
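
The following Python sketch illustrates how the dispersion parameter enters the variance. It evaluates the negative binomial probability function on the log scale and sums over a truncated support to recover the mean and variance numerically. The function name `negbin_pmf`, the example values $\mu = 2.5$ and $k = 0.5$, and the truncation point are assumptions made for illustration only.

```python
import math

def negbin_pmf(y, mu, k):
    """Negative binomial probability at a nonnegative integer y."""
    log_p = (
        math.lgamma(y + 1.0 / k)
        - math.lgamma(y + 1.0)
        - math.lgamma(1.0 / k)
        + y * math.log(k * mu)
        - (y + 1.0 / k) * math.log1p(k * mu)
    )
    return math.exp(log_p)

# Example values chosen only for illustration.
mu, k = 2.5, 0.5
support = range(500)  # truncation point; the tail beyond it is negligible here

total = sum(negbin_pmf(y, mu, k) for y in support)
mean = sum(y * negbin_pmf(y, mu, k) for y in support)
var = sum((y - mean) ** 2 * negbin_pmf(y, mu, k) for y in support)

# Variance implied by the formula Var(Y) = mu + k * mu^2.
var_formula = mu + k * mu ** 2
```

Here `var` and `var_formula` should agree up to floating-point error. As $k$ approaches 0 the variance approaches $\mu$, the Poisson variance.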

Normal Distribution

\begin{eqnarray*}  f(y) &  = &  \frac{1}{\sqrt {2\pi } \sigma } \exp \left[ -\frac{1}{2} \left( \frac{y-\mu }{\sigma } \right)^2 \right]~ ~ ~  \mbox{for } -\infty < y < \infty \\ \phi &  = &  \sigma ^{2} \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \phi \\ \end{eqnarray*}

Poisson Distribution

\begin{eqnarray*}  f(y) &  = &  \frac{\mu ^ y \mr {e}^{-\mu }}{y!}~ ~ ~  \mbox{for } y = 0,1,2,\ldots \\ \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \mu \\ \end{eqnarray*}

Tweedie Distribution

The Tweedie model is a generalized linear model from the exponential family. The Tweedie distribution is characterized by three parameters: the mean parameter $\mu $, the dispersion $\phi $, and the power $p$. The variance of the distribution is $\phi \mu ^ p$. For values of $p$ in the range $1<p<2$, a Tweedie random variable can be represented as a Poisson sum of gamma distributed random variables. That is,

\[  Y = \sum _{i=1}^{N}Y_ i  \]

where $N$ has a Poisson distribution with mean $\lambda =\frac{\mu ^{2-p}}{\phi (2-p)}$ and the $Y_ i$ are independent and identically gamma distributed, each with expected value $\mr {E}(Y_ i)=\phi (2-p)\mu ^{p-1}$ and index parameter $\nu _ i=\frac{2-p}{p-1}$.

In this case, $Y$ has a discrete mass at 0, $\mr {Pr}(Y=0)=\mr {Pr}(N=0)=\exp (-\lambda )$, and the probability density $f(y)$ of $Y$ is represented by an infinite series for $y>0$. The HPGENSELECT procedure restricts the power parameter to satisfy $p \geq 1.1$ for numerical stability in model fitting. The Tweedie distribution does not have a general closed-form representation for all values of $p$, but it can be characterized in terms of the distribution mean parameter $\mu $, dispersion parameter $\phi $, and power parameter $p$. For more information about the Tweedie distribution, see Frees (2010).

The distribution mean and variance are given by:

\begin{eqnarray*}  \mr {E}(Y) &  = &  \mu \\ \mr {Var}(Y) &  = &  \phi \mu ^ p \\ \end{eqnarray*}
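
The Poisson sum of gamma variables described above can be sketched as a small simulation. The Python code below draws $N$ from a Poisson distribution, adds $N$ gamma variates, and compares the sample mean, the sample variance, and the fraction of exact zeros with $\mu$, $\phi \mu^p$, and $\exp(-\lambda)$. It applies only to $1<p<2$. The helper names `poisson_draw` and `tweedie_draw`, the seed, the sample size, and the example values $\mu = 2$, $\phi = 1.5$, $p = 1.5$ are assumptions for illustration; this is not how the HPGENSELECT procedure evaluates the distribution.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def tweedie_draw(mu, phi, p, rng):
    """Draw Y as a Poisson sum of gamma variables; valid only for 1 < p < 2."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))        # Poisson mean
    gamma_mean = phi * (2.0 - p) * mu ** (p - 1.0)   # E(Y_i)
    nu = (2.0 - p) / (p - 1.0)                       # gamma index (shape)
    n = poisson_draw(lam, rng)
    # random.gammavariate(shape, scale) has mean shape * scale.
    return sum(rng.gammavariate(nu, gamma_mean / nu) for _ in range(n))

# Example values, seed, and sample size chosen only for illustration.
rng = random.Random(12345)
mu, phi, p = 2.0, 1.5, 1.5
draws = [tweedie_draw(mu, phi, p, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((y - sample_mean) ** 2 for y in draws) / len(draws)
zero_fraction = sum(1 for y in draws if y == 0) / len(draws)

lam = mu ** (2.0 - p) / (phi * (2.0 - p))
expected_var = phi * mu ** p
expected_zero = math.exp(-lam)
```

Up to simulation error, `sample_mean` should be near `mu`, `sample_var` near `expected_var`, and `zero_fraction` near `expected_zero`. The agreement of the mean follows directly from $\lambda \, \mr {E}(Y_ i) = \mu$.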

Zero-Inflated Negative Binomial Distribution

\begin{eqnarray*}  f(y) &  = &  \left\{  \begin{array}{ll} \omega + (1-\omega )(1+k\lambda )^{-\frac{1}{k}} & \mbox{for } y=0 \\ (1-\omega ) \frac{\Gamma (y+1/k)}{\Gamma (y+1)\Gamma (1/k)} \frac{(k\lambda )^ y}{(1+k\lambda )^{y+1/k}} & \mbox{for } y = 1,2,\ldots \\ \end{array} \right. \\ \phi &  = &  k \\ \mu = \mr {E}(Y) &  = &  (1-\omega )\lambda \\ \mr {Var}(Y) &  = &  (1-\omega )\lambda (1+\omega \lambda + k\lambda ) \\ &  = &  \mu + \left(\frac{\omega }{1-\omega }+\frac{k}{1-\omega }\right)\mu ^2 \\ \end{eqnarray*}

For the zero-inflated negative binomial distribution, $k$ is the estimated dispersion parameter that is displayed in the output.

Zero-Inflated Poisson Distribution

\begin{eqnarray*}  f(y) &  = &  \left\{  \begin{array}{ll} \omega + (1-\omega )\mr {e}^{-\lambda } & \mbox{for } y=0 \\ (1-\omega )\frac{\lambda ^ y \mr {e}^{-\lambda }}{y!} & \mbox{for } y = 1,2,\ldots \\ \end{array} \right. \\ \mu = \mr {E}(Y) &  = &  (1-\omega )\lambda \\ \mr {Var}(Y) &  = &  (1-\omega )\lambda (1+\omega \lambda ) \\ &  = &  \mu + \frac{\omega }{1-\omega }\mu ^2 \\ \end{eqnarray*}
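
The two variance expressions for the zero-inflated Poisson distribution can be compared numerically. The Python sketch below sums the probability function over a truncated support and evaluates both forms of the variance. The function name `zip_pmf`, the example values $\omega = 0.3$ and $\lambda = 4$, and the truncation point are illustrative assumptions.

```python
import math

def zip_pmf(y, omega, lam):
    """Zero-inflated Poisson probability at a nonnegative integer y."""
    poisson = math.exp(y * math.log(lam) - lam - math.lgamma(y + 1.0))
    if y == 0:
        # Extra mass omega at zero on top of the Poisson zero probability.
        return omega + (1.0 - omega) * poisson
    return (1.0 - omega) * poisson

# Example values chosen only for illustration.
omega, lam = 0.3, 4.0
support = range(200)  # truncation point; the tail beyond it is negligible here

total = sum(zip_pmf(y, omega, lam) for y in support)
mean = sum(y * zip_pmf(y, omega, lam) for y in support)
var = sum((y - mean) ** 2 * zip_pmf(y, omega, lam) for y in support)

# The two equivalent variance expressions from the display above.
var_form_1 = (1.0 - omega) * lam * (1.0 + omega * lam)
var_form_2 = mean + omega / (1.0 - omega) * mean ** 2
```

Both forms should match the numerically summed variance, and `mean` should equal $(1-\omega )\lambda$ up to floating-point error. Setting $\omega = 0$ recovers the ordinary Poisson case, in which the variance equals the mean.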