Two types
of ellipses can be computed for the input data (where observations
correspond to points in a scatter plot). One is a confidence ellipse
for the population mean (TYPE=MEAN), and the other is a prediction
ellipse for a new observation (TYPE=PREDICT). Both assume a bivariate
normal distribution.
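Let $\bar{Z}$ and $S$ be the sample mean and sample covariance matrix
of a random sample of size $n$ from a bivariate normal distribution with mean
$\mu$ and covariance matrix $\Sigma$. The variable $\bar{Z} - \mu$
is distributed as a bivariate normal variate with
mean zero and covariance $(1/n)\Sigma$, and it is independent of
$S$. Using Hotelling’s $T^2$ statistic, which is defined as

$$T^2 = n (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu)$$

a $100(1-\alpha)\%$ confidence ellipse for $\mu$ is computed from the equation

$$\frac{n}{n-1} (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu) = \frac{2}{n-2} F_{2,n-2}(1-\alpha)$$

where $F_{2,n-2}(1-\alpha)$ is the $(1-\alpha)$ critical value of an
$F$ distribution with degrees of freedom 2 and $n-2$.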
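The step from Hotelling's statistic to the ellipse equation rests on the standard distributional result for two variables. The lines below are a short sketch of that step; the symbols $\bar{Z}$, $S$, $\mu$, and $\alpha$ (sample mean, sample covariance matrix, population mean, and significance level) are the notation assumed here.

```latex
\begin{align*}
T^2 = n(\bar{Z}-\mu)' S^{-1} (\bar{Z}-\mu)
  &\sim \frac{2(n-1)}{n-2}\, F_{2,\,n-2} \\
\Pr\!\left[\, n(\bar{Z}-\mu)' S^{-1} (\bar{Z}-\mu)
  \le \frac{2(n-1)}{n-2}\, F_{2,\,n-2}(1-\alpha) \right] &= 1-\alpha \\
\frac{n}{n-1}\,(\bar{Z}-\mu)' S^{-1} (\bar{Z}-\mu)
  &= \frac{2}{n-2}\, F_{2,\,n-2}(1-\alpha)
  \quad \text{(boundary, after dividing by } n-1 \text{)}
\end{align*}
```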
A prediction
ellipse is a region for predicting a new observation in the population.
It also approximates a region containing a specified percentage of
the population.
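Denote a new observation as the bivariate random variable
$Z_{\text{new}}$. The variable

$$Z_{\text{new}} - \bar{Z} = (Z_{\text{new}} - \mu) - (\bar{Z} - \mu)$$

is distributed as a bivariate normal variate with mean zero (the zero vector) and
covariance $(1 + 1/n)\Sigma$, and it is independent of
$S$. A $100(1-\alpha)\%$ prediction ellipse is then given by the equation

$$\frac{n}{n-1} (Z_{\text{new}} - \bar{Z})' S^{-1} (Z_{\text{new}} - \bar{Z}) = \frac{2(n+1)}{n-2} F_{2,n-2}(1-\alpha)$$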
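As a rough illustration of both ellipse types, the following Python sketch tests whether a point lies inside a confidence ellipse for the mean or a prediction ellipse. All function names are hypothetical and are not part of any SAS interface; the `kind` argument only mirrors the TYPE=MEAN and TYPE=PREDICT options. The sketch assumes the sample covariance matrix uses the divisor $n-1$ and that $n > 2$. It also relies on the fact that an F distribution with 2 numerator degrees of freedom has a closed-form quantile, so no statistics library is needed.

```python
import math

def f_crit_2(m, alpha):
    """(1 - alpha) critical value of an F distribution with 2 and m df.

    With numerator df = 2 the CDF is 1 - (1 + 2x/m)**(-m/2), which can be
    inverted directly.
    """
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

def sample_mean_cov(points):
    """Sample mean and covariance entries (sxx, sxy, syy), divisor n - 1."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def ellipse_threshold(n, alpha, kind):
    """Largest value of d' S^{-1} d for a point inside the ellipse."""
    f = f_crit_2(n - 2, alpha)
    base = 2.0 * (n - 1) / (n * (n - 2)) * f
    if kind == "mean":      # confidence ellipse for the population mean
        return base
    if kind == "predict":   # prediction ellipse for a new observation
        return (n + 1) * base
    raise ValueError("kind must be 'mean' or 'predict'")

def quad_form(point, center, cov):
    """d' S^{-1} d for d = point - center, using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    det = sxx * syy - sxy * sxy
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def inside_ellipse(point, points, alpha=0.05, kind="predict"):
    """True if `point` lies inside the 100(1 - alpha)% ellipse of `kind`."""
    center, cov = sample_mean_cov(points)
    return quad_form(point, center, cov) <= ellipse_threshold(len(points), alpha, kind)
```

Because the two thresholds differ only by the factor $n+1$, the prediction ellipse has the same center and orientation as the confidence ellipse, with semi-axes longer by a factor of $\sqrt{n+1}$.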
The family
of ellipses generated by different critical values of the
$F$ distribution has a common center (the sample mean)
and common major and minor axis directions.
The shape
of an ellipse depends on the aspect ratio of the plot. The ellipse
indicates the correlation between the two variables if the variables
are standardized (by dividing the variables by their respective standard
deviations). In this situation, the ratio between the major and minor
axis lengths is

$$\sqrt{\frac{1 + |r|}{1 - |r|}}$$

where $r$ is the sample correlation coefficient. In particular,
if $r = 0$, the ratio is 1, which corresponds to a circular
confidence contour and indicates that the variables are uncorrelated.
A larger value of the ratio indicates a larger positive or negative
correlation between the variables.
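The axis ratio can be checked numerically. The sketch below assumes standardized variables, so the covariance matrix is the correlation matrix with off-diagonal entry $r$, and it uses the fact that the semi-axis lengths of the ellipse are proportional to the square roots of that matrix's eigenvalues. The function names are hypothetical, and $|r| < 1$ is required.

```python
import math

def axis_ratio_from_r(r):
    """Major/minor axis length ratio for standardized variables, |r| < 1."""
    return math.sqrt((1.0 + abs(r)) / (1.0 - abs(r)))

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    half_trace = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return half_trace + radius, half_trace - radius

def axis_ratio_from_matrix(r):
    """Same ratio, derived from the eigenvalues of [[1, r], [r, 1]]."""
    lam_max, lam_min = eigenvalues_sym2(1.0, r, 1.0)
    # Semi-axis lengths scale with the square roots of the eigenvalues.
    return math.sqrt(lam_max / lam_min)
```

For example, a correlation of 0.6 or -0.6 gives a major axis twice as long as the minor axis, while a correlation of 0 gives a circle.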