Two types of ellipses
can be computed for the input data (where observations correspond
to points in a scatter plot). One is a confidence ellipse for the
population mean (TYPE=MEAN), and the other is a prediction ellipse
for a new observation (TYPE=PREDICT). Both assume a bivariate normal
distribution.
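Let $\bar{Z}$ and $S$ be the sample mean and sample covariance matrix
of a random sample of size $n$ from a bivariate normal distribution
with mean $\mu$ and covariance matrix $\Sigma$. The variable
$\bar{Z} - \mu$ is distributed as a bivariate normal variate with
mean zero and covariance $(1/n)\Sigma$, and it is independent of $S$.
Using Hotelling's $T^2$ statistic, which is defined as
$$T^2 = n (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu)$$
a $100(1-\alpha)\%$ confidence ellipse for $\mu$ is computed from the
equation
$$\frac{n}{n-1} (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu) = \frac{2}{n-2} F_{2,n-2}(1-\alpha)$$
where $F_{2,n-2}(1-\alpha)$ is the $(1-\alpha)$ critical value of an
$F$ distribution with degrees of freedom 2 and $n-2$.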
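As a rough illustration of the confidence ellipse for the mean, the following Python sketch tests whether a candidate mean lies inside the ellipse by comparing n/(n-1) times the quadratic form in the inverse sample covariance matrix against 2/(n-2) times the F critical value. All function names are hypothetical. The critical value uses the closed form available when the numerator degrees of freedom equal 2, P(F > x) = (1 + 2x/m)^(-m/2); a statistics library could be substituted for it.

```python
import math

def f_crit_2(m, alpha):
    """Upper-alpha critical value of F(2, m).

    Uses the closed form P(F > x) = (1 + 2x/m) ** (-m/2), which holds
    only when the numerator degrees of freedom equal 2.
    """
    return (m / 2.0) * math.expm1(-2.0 * math.log(alpha) / m)

def mean_and_cov(points):
    """Sample mean and sample covariance (divisor n - 1) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def quad_form(d, cov):
    """d' S^{-1} d for a symmetric 2x2 matrix S = [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = d
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def in_confidence_ellipse(points, mu, alpha=0.05):
    """True if the candidate mean `mu` lies inside the 100(1 - alpha)% ellipse."""
    n = len(points)
    (mx, my), cov = mean_and_cov(points)
    lhs = n / (n - 1) * quad_form((mx - mu[0], my - mu[1]), cov)
    rhs = 2.0 / (n - 2) * f_crit_2(n - 2, alpha)
    return lhs <= rhs

# Made-up sample, for illustration only.
sample = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0), (5.0, 5.5), (6.0, 5.0)]
center, _ = mean_and_cov(sample)
inside_center = in_confidence_ellipse(sample, center)
inside_far = in_confidence_ellipse(sample, (100.0, 100.0))
```

The sample mean itself always satisfies the condition, because the quadratic form is zero there.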
A prediction ellipse
is a region for predicting a new observation in the population. It
also approximates a region containing a specified percentage of the
population.
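Denote a new observation
as the bivariate random variable $Z_{\text{new}}$. The variable
$$Z_{\text{new}} - \bar{Z} = (Z_{\text{new}} - \mu) - (\bar{Z} - \mu)$$
where $\bar{Z}$ is the sample mean, is distributed as a
bivariate normal variate with mean zero (the zero vector) and covariance
$(1 + 1/n)\Sigma$, and it is independent of the sample covariance
matrix $S$. A $100(1-\alpha)\%$
prediction ellipse is then given by the equation
$$\frac{n}{n-1} (Z_{\text{new}} - \bar{Z})' S^{-1} (Z_{\text{new}} - \bar{Z}) = \frac{2(n+1)}{n-2} F_{2,n-2}(1-\alpha)$$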
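To see how the two ellipse types differ in size, the sketch below computes, for each type, the level c in the condition (z - mean)' S^{-1} (z - mean) = c. It assumes the standard Hotelling scaling, under which the prediction level is (n + 1) times the confidence level for the mean. The names `f_crit_2` and `ellipse_levels` are hypothetical, and the F critical value again relies on the closed form that applies when the numerator degrees of freedom equal 2.

```python
import math

def f_crit_2(m, alpha):
    """Upper-alpha critical value of F(2, m), via P(F > x) = (1 + 2x/m) ** (-m/2)."""
    return (m / 2.0) * math.expm1(-2.0 * math.log(alpha) / m)

def ellipse_levels(n, alpha=0.05):
    """Levels c for the mean (confidence) and prediction ellipses.

    Each ellipse is the set of z with (z - mean)' S^{-1} (z - mean) = c,
    where S is the sample covariance matrix of n observations.
    """
    f = f_crit_2(n - 2, alpha)
    mean_level = 2.0 * (n - 1) / (n * (n - 2)) * f
    return {"mean": mean_level, "predict": (n + 1) * mean_level}

levels_small = ellipse_levels(30)
levels_large = ellipse_levels(10 ** 6)
```

As n grows, the confidence level shrinks toward zero while the prediction level approaches -2 ln(alpha), the corresponding chi-square critical value with 2 degrees of freedom. This is why the prediction ellipse approximates a region containing a fixed percentage of the population.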
The family of ellipses
generated by different critical values of the $F$
distribution has a common center (the sample mean)
and common major and minor axis directions.
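The common center and axis directions can be checked numerically. The sketch below traces boundary points of an ellipse from the eigen-decomposition of a 2x2 covariance matrix, given as the tuple (sxx, sxy, syy); only the axis lengths depend on the chosen level. The function names and the example matrix are assumptions made for illustration.

```python
import math

def ellipse_points(center, cov, level, num=8):
    """Points on {z : (z - center)' S^{-1} (z - center) = level}.

    Returns the points, the major-axis angle, and the semi-axis lengths.
    """
    sxx, sxy, syy = cov
    # Eigenvalues of the symmetric 2x2 matrix S.
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major, lam_minor = half_trace + radius, half_trace - radius
    # Major-axis direction: depends on S only, not on `level`.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    a = math.sqrt(level * lam_major)
    b = math.sqrt(level * lam_minor)
    pts = []
    for k in range(num):
        t = 2.0 * math.pi * k / num
        u, v = a * math.cos(t), b * math.sin(t)
        # Rotate from the eigenvector frame into plot coordinates.
        x = center[0] + u * math.cos(theta) - v * math.sin(theta)
        y = center[1] + u * math.sin(theta) + v * math.cos(theta)
        pts.append((x, y))
    return pts, theta, (a, b)

def quad_form(point, center, cov):
    """(point - center)' S^{-1} (point - center) for symmetric 2x2 S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

cov = (4.0, 1.5, 1.0)          # made-up covariance entries
center = (0.0, 0.0)
pts_a, theta_a, axes_a = ellipse_points(center, cov, 1.0)
pts_b, theta_b, axes_b = ellipse_points(center, cov, 6.0)
```

Both calls return the same angle and the same ratio of semi-axis lengths, so the two ellipses are concentric, scaled copies of each other.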
The shape of an ellipse
depends on the aspect ratio of the plot. The ellipse indicates the
correlation between the two variables if the variables are standardized
(by dividing the variables by their respective standard deviations).
In this situation, the ratio between the major and minor axis lengths
is
$$\sqrt{\frac{1 + |r|}{1 - |r|}}$$
where $r$ is the sample correlation coefficient.
In particular, if
$r = 0$, the ratio is 1, which corresponds to a circular
confidence contour and indicates that the variables are uncorrelated.
A larger value of the ratio indicates a larger positive or negative
correlation between the variables.
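For standardized variables the covariance matrix is the correlation matrix, whose eigenvalues are 1 + |r| and 1 - |r|, so the axis-length ratio is sqrt((1 + |r|) / (1 - |r|)). A small sketch of that computation follows; the function names are hypothetical, and the guard for |r| = 1 is an added assumption, since the ellipse degenerates to a line segment in that case.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def axis_ratio(r):
    """Major-to-minor axis length ratio for standardized variables."""
    if abs(r) >= 1.0:
        raise ValueError("ellipse is degenerate when |r| = 1")
    return math.sqrt((1.0 + abs(r)) / (1.0 - abs(r)))

ratio_uncorrelated = axis_ratio(0.0)
ratio_positive = axis_ratio(0.6)
ratio_negative = axis_ratio(-0.6)
```

The ratio depends only on the magnitude of r, so it does not distinguish positive from negative correlation; the sign shows up in the orientation of the major axis instead.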