The Four Types of Estimable Functions


Estimability

Given a response or dependent variable $\mb{Y}$, predictors or independent variables $\mb{X}$, and a linear expectation model $\mr{E}[\mb{Y}]=\mb{X} \bbeta $ relating the two, a primary analytical goal is to estimate or test for the significance of certain linear combinations of the elements of $\bbeta $. For least squares regression and analysis of variance, this is accomplished by computing linear combinations of the observed $\mb{Y}$s. An unbiased linear estimate of a specific linear function of the individual $\beta $s, say $\mb{L} \bbeta $, is a linear combination of the $\mb{Y}$s that has an expected value of $\mb{L} \bbeta $. Hence, the following definition:

A linear combination of the parameters $\mb{L} \bbeta $ is estimable if and only if a linear combination of the $\mb{Y}$s exists that has expected value $\mb{L} \bbeta $.

Any linear combination of the $\mb{Y}$s, for instance $\mb{KY}$, will have expectation $\mr{E}[\mb{KY}]=\mb{KX} \bbeta $. Thus, the expected value of any linear combination of the $\mb{Y}$s is equal to that same linear combination of the rows of $\mb{X}$ multiplied by $\bbeta $. Therefore,

$\mb{L} \bbeta $ is estimable if and only if there is a linear combination of the rows of $\mb{X}$ that is equal to $\mb{L}$—that is, if and only if there is a $\mb{K}$ such that $\mb{L}=\mb{KX}$.
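
For example, in a one-way classification with two treatments, the model is $\mr{E}[Y_{ij}]=\mu +\alpha _ i$ with $\bbeta =(\mu ~ \alpha _1~ \alpha _2)'$, and every row of $\mb{X}$ is either $(1~ 1~ 0)$ or $(1~ 0~ 1)$. The functions $\mu +\alpha _1$ (a row of $\mb{X}$ itself) and $\alpha _1-\alpha _2$ (the difference of the two distinct rows) are therefore estimable. In contrast, $\alpha _1$ alone is not estimable, because any combination $a(1~ 1~ 0)+b(1~ 0~ 1)=(a+b~ ~ a~ ~ b)$ can never equal $(0~ 1~ 0)$.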

Thus, the rows of $\mb{X}$ form a generating set from which any estimable $\mb{L}$ can be constructed. Since the row space of $\mb{X}$ is the same as the row space of $\mb{X'X}$, the rows of $\mb{X'X}$ also form a generating set from which all estimable $\mb{L}$s can be constructed. Similarly, since the row space of $(\mb{X'X})^{-}\mb{X'X}$ is the same as the row space of $\mb{X'X}$, the rows of $(\mb{X'X})^{-}\mb{X'X}$ also form a generating set for the estimable $\mb{L}$s.

Therefore, if $\mb{L}$ can be written as a linear combination of the rows of $\mb{X}$, $\mb{X'X}$, or $(\mb{X'X})^{-}\mb{X'X}$, then $\mb{L} \bbeta $ is estimable.
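
This criterion also gives a simple computational check: $\mb{L}\bbeta $ is estimable exactly when $\mb{L}(\mb{X'X})^{-}\mb{X'X}=\mb{L}$. The following minimal sketch illustrates the check in Python with NumPy, using the Moore-Penrose pseudoinverse as the generalized inverse $(\mb{X'X})^{-}$; the design matrix, function name, and tolerance are illustrative choices and not part of any particular software interface.

```python
import numpy as np

def is_estimable(X, L, tol=1e-8):
    """Check whether L @ beta is estimable in the model E[Y] = X @ beta.

    Uses the criterion L (X'X)^- (X'X) = L, with the Moore-Penrose
    pseudoinverse standing in for the generalized inverse (X'X)^-.
    """
    XtX = X.T @ X
    H = np.linalg.pinv(XtX) @ XtX   # (X'X)^- X'X projects onto the row space of X
    return np.allclose(L @ H, L, atol=tol)

# One-way classification with two treatments; columns are (intercept, alpha1, alpha2)
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)

print(is_estimable(X, np.array([1.0, 1.0, 0.0])))   # mu + alpha1      -> True
print(is_estimable(X, np.array([0.0, 1.0, -1.0])))  # alpha1 - alpha2  -> True
print(is_estimable(X, np.array([0.0, 1.0, 0.0])))   # alpha1 alone     -> False
```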

In the context of least squares regression and analysis of variance, an estimable linear function $\mb{L}\bbeta $ can be estimated by $\mb{L}\widehat{\bbeta }$, where $\widehat{\bbeta }=(\mb{X'X})^-\mb{X'Y}$. From the general theory of linear models, $\mb{L}\widehat{\bbeta }$ is, in fact, the best linear unbiased estimator of $\mb{L}\bbeta $: it has minimum variance among all linear unbiased estimators, and it is also the maximum likelihood estimator when the residuals are normally distributed. To test the hypothesis that $\mb{L}\bbeta =\mb{0}$, compute the sum of squares

\[ \mr{SS}(H_0\colon ~ \mb{L}\bbeta =\mb{0})=(\mb{L}\widehat{\bbeta })' (\mb{L}(\mb{X'X})^{-}\mb{L}')^{-1}\mb{L}\widehat{\bbeta } \]

and form an F test with the appropriate error term. Note that in contexts more general than least squares regression (for example, generalized and/or mixed linear models), linear hypotheses are often tested with the analogous quadratic form in the estimated linear function, $(\mb{L}\widehat{\bbeta })'(\mr{Var}[\mb{L}\widehat{\bbeta }])^{-1}\mb{L}\widehat{\bbeta }$.
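
As a numerical companion to these formulas, the following sketch (Python with NumPy; the data, function name, and the use of the pseudoinverse and the residual mean square as the error term are illustrative assumptions, not a reference implementation) computes $\widehat{\bbeta }=(\mb{X'X})^-\mb{X'Y}$, the sum of squares $\mr{SS}(H_0\colon \mb{L}\bbeta =\mb{0})$, and the resulting F statistic.

```python
import numpy as np
from numpy.linalg import pinv, matrix_rank

def hypothesis_F_test(X, Y, L):
    """F test of H0: L beta = 0 in the least squares model E[Y] = X beta."""
    L = np.atleast_2d(L)
    XtX_ginv = pinv(X.T @ X)                      # (X'X)^-, a generalized inverse
    beta_hat = XtX_ginv @ X.T @ Y                 # beta_hat = (X'X)^- X'Y
    Lb = L @ beta_hat
    ss_h0 = Lb @ pinv(L @ XtX_ginv @ L.T) @ Lb    # (L b)' [L (X'X)^- L']^{-1} (L b)
    df_h0 = matrix_rank(L @ XtX_ginv @ L.T)       # numerator degrees of freedom
    resid = Y - X @ beta_hat
    df_err = len(Y) - matrix_rank(X)              # residual degrees of freedom
    mse = resid @ resid / df_err                  # residual mean square = error term
    return (ss_h0 / df_h0) / mse, df_h0, df_err

# Illustrative data: two treatments, two observations each (X'X is singular)
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]], dtype=float)
Y = np.array([3.1, 2.9, 4.2, 4.0])
L = np.array([0.0, 1.0, -1.0])                    # estimable contrast alpha1 - alpha2
F, df1, df2 = hypothesis_F_test(X, Y, L)
print(F, df1, df2)
```

Because $\mb{L}$ here is estimable, the quantities $\mb{L}\widehat{\bbeta }$ and $\mb{L}(\mb{X'X})^{-}\mb{L}'$ in this sketch do not depend on which generalized inverse is used, so the pseudoinverse is merely a convenient choice.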