|
Expectations of Random Variables and Vectors
|
If
is a discrete random variable with mass function
and support (possible values)
, then the expectation (expected value) of
is defined as
provided that
, otherwise the sum in the definition is not well-defined. The expected value of a function
is similarly defined: provided that
,
For continuous random variables, similar definitions apply, but summation is replaced by integration over the support of the random variable. If
is a continuous random variable with density function
, and
, then the expectation of
is defined as
The expected value of a random variable is also called its mean or its first moment. A particularly important function of a random variable is
. The expectation of
is called the variance of
or the second central moment of
. When you study the properties of multiple random variables, then you might be interested in aspects of their joint distribution. The covariance between random variables
and
is defined as the expected value of the function
, where the expectation is taken under the bivariate joint distribution of
and
:
The covariance between a random variable and itself is the variance,
.
In statistical applications and formulas, random variables are often collected into vectors. For example, a random sample of size
from the distribution of
generates a random vector of order
,
The expected value of the
random vector
is the vector of the means of the elements of
:
It is often useful to directly apply rules about working with means, variances, and covariances of random vectors. To develop these rules, suppose that
and
denote two random vectors with typical elements
and
. Further suppose that
and
are constant (nonstochastic) matrices, that
is a constant vector, and that the
are scalar constants.
The following rules enable you to derive the mean of a linear function of a random vector:
The covariance matrix of
and
is the
matrix whose typical element in row
, column
is the covariance between
and
. The covariance matrix between two random vectors is frequently denoted with the
"operator."
The variance matrix of a random vector
is the covariance matrix between
and itself. The variance matrix is frequently denoted with the
"operator."
Because the variance matrix contains variances on the diagonal and covariances in the off-diagonal positions, it is also referred to as the variance-covariance matrix of the random vector
.
If the elements of the covariance matrix
are zero, the random vectors are uncorrelated. If
and
are normally distributed, then a zero covariance matrix implies that the vectors are stochastically independent. If the off-diagonal elements of the variance matrix
are zero, the elements of the random vector
are uncorrelated. If
is normally distributed, then a diagonal variance matrix implies that its elements are stochastically independent.
Suppose that
and
are constant (nonstochastic) matrices and that
denotes a scalar constant. The following results are useful in manipulating covariance matrices:
Since
, these results can be applied to produce the following results, useful in manipulating variances of random vectors:
Another area where expectation rules are helpful is quadratic forms in random variables. These forms arise particularly in the study of linear statistical models and in linear statistical inference. Linear inference is statistical inference about linear function of random variables, even if those random variables are defined through nonlinear models. For example, the parameter estimator
might be derived in a nonlinear model, but this does not prevent statistical questions from being raised that can be expressed through linear functions of
; for example,
if
is a matrix of constants and
is a random vector, then