The CORR Procedure

Polyserial Correlation

Polyserial correlation measures the correlation between two continuous variables with a bivariate normal distribution, where one variable is observed directly, and the other is unobserved. Information about the unobserved variable is obtained through an observed ordinal variable that is derived from the unobserved variable by classifying its values into a finite set of discrete, ordered values (Olsson, Drasgow, and Dorans, 1982).

Let X be the observed continuous variable from a normal distribution with mean $\mu$ and variance $\sigma ^{2}$ , let Y be the unobserved continuous variable, and let $\rho$ be the Pearson correlation between X and Y. Furthermore, assume that an observed ordinal variable D is derived from Y as follows:

$D = \; \left\{ \begin{array}{ll} d_{(1)} & \mathrm {if} \, \, Y < \tau _{1} \\ d_{(k)} & \mathrm {if} \, \, \tau _{k-1} \leq Y < \tau _{k}, \; \, k=2, 3, \ldots , K-1 \\ d_{(K)} & \mathrm {if} \, \, Y \geq \tau _{K-1} \end{array} \right.$

where $d_{(1)} < d_{(2)} < \ldots < d_{(K)}$ are ordered observed values, and $\tau _1 < \tau _2 < \ldots < \tau _{K-1}$ are ordered unknown threshold values.
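The classification rule can be illustrated with a short Python sketch. The function name `classify`, the threshold values, and the level labels are illustrative assumptions, not part of the CORR procedure; in practice Y and the thresholds are unobserved.

```python
from bisect import bisect_right

def classify(y, thresholds, levels):
    """Map a latent value y to an ordinal level, given ordered thresholds."""
    # bisect_right counts the thresholds <= y, so y == tau_k falls in
    # category k+1, which matches tau_{k-1} <= Y < tau_k.
    return levels[bisect_right(thresholds, y)]

thresholds = [-0.5, 0.8]   # tau_1 < tau_2 (hypothetical values)
levels = [1, 2, 3]         # d_(1) < d_(2) < d_(3)
result = [classify(y, thresholds, levels) for y in (-1.2, -0.5, 0.3, 0.8, 2.0)]
print(result)  # [1, 2, 2, 3, 3]
```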

The likelihood function for the joint distribution of (X, D) from a sample of $N$ observations $(x_ j, d_ j)$ is

$L = \prod _{j=1}^{N} f( x_ j, d_ j) = \prod _{j=1}^{N} f(x_ j) \; P(D=d_ j \; | \; x_ j)$

where $f(x_ j)$ is the normal density function with mean $\mu$ and standard deviation $\sigma$ (Drasgow, 1986).

Without loss of generality, assume that the variable Y has a standard normal distribution. The conditional distribution of Y given $X=x_ j$ is then normal with mean $\rho z_ j$ and variance $1-\rho ^{2}$ , where $z_ j= (x_ j - \mu ) / \sigma$ is the standardized value of $x_ j$ . If $d_ j = d_{(k)}$ , the $k^{th}$ ordered value in D, the resulting conditional probability is

$P(D=d_{(k)} \; | \; x_ j) = \; \left\{ \begin{array}{ll} \Phi \left( \frac{\tau _1 - \rho z_ j}{\sqrt {1-\rho ^2}} \right) & \mathrm {if} \; \, k=1 \\ \Phi \left( \frac{\tau _ k - \rho z_ j}{\sqrt {1-\rho ^2}} \right) - \Phi \left( \frac{\tau _{k-1} - \rho z_ j}{\sqrt {1-\rho ^2}} \right) & \mathrm {if} \; \, k=2, 3, \ldots , K-1 \\ 1 - \Phi \left( \frac{\tau _{K-1} - \rho z_ j}{\sqrt {1-\rho ^2}} \right) & \mathrm {if} \; \, k=K \end{array} \right.$

where $\Phi$ is the standard normal cumulative distribution function.
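As a minimal sketch, the three cases of this conditional probability can be written as one Python function. The name `cond_prob` and the example values are assumptions made for illustration; the function expects $|\rho| < 1$ and a list of the $K-1$ ordered thresholds.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def cond_prob(k, z, rho, thresholds):
    """P(D = d_(k) | z) for k = 1..K, given K-1 ordered thresholds."""
    s = sqrt(1.0 - rho * rho)
    K = len(thresholds) + 1
    # The top category has no upper threshold; the bottom one has no lower.
    upper = 1.0 if k == K else PHI((thresholds[k - 1] - rho * z) / s)
    lower = 0.0 if k == 1 else PHI((thresholds[k - 2] - rho * z) / s)
    return upper - lower

taus = [-0.5, 0.8]  # hypothetical thresholds, so K = 3
probs = [cond_prob(k, 0.3, 0.4, taus) for k in (1, 2, 3)]
print(probs, sum(probs))
```

The probabilities over the K categories sum to 1 for any standardized value z, which is a quick sanity check on an implementation.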

Cox (1974) derives the maximum likelihood estimates for all parameters: $\mu$ , $\sigma$ , $\rho$ , and $\tau _1$ , …, $\tau _{K-1}$ . The maximum likelihood estimates for $\mu$ and $\sigma ^2$ can be derived explicitly. The maximum likelihood estimate for $\mu$ is the sample mean, and the maximum likelihood estimate for $\sigma ^2$ is the sample variance with divisor $N$ :

$\frac{\sum _{j=1}^{N} (x_ j - \bar{x})^{2}}{N}$
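The explicit estimates are easy to check numerically. The following Python sketch uses made-up data; note that the variance uses the divisor $N$ (the population form), not $N-1$.

```python
from statistics import fmean, pvariance

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative sample

mu_hat = fmean(x)                   # sample mean
sigma2_hat = pvariance(x, mu_hat)   # sum of squared deviations divided by N
print(mu_hat, sigma2_hat)  # 5.0 4.0
```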

The maximum likelihood estimates for the remaining parameters, including the polyserial correlation $\rho$ and thresholds $\tau _1$ , …, $\tau _{K-1}$ , can be computed by an iterative process, as described by Cox (1974). The asymptotic standard error of the maximum likelihood estimate of $\rho$ can also be computed after this process.
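To give a feel for the iterative step, here is a simplified Python sketch. It is not the joint maximization described by Cox (1974) and is not claimed to be the CORR procedure's algorithm: it holds $\mu$, $\sigma$, and the thresholds at closed-form values (thresholds from cumulative proportions) and maximizes the likelihood over $\rho$ alone with a golden-section search, which assumes a single maximum on the search interval. All names and the simulated data are assumptions for illustration.

```python
import random
from math import log, sqrt
from statistics import NormalDist, fmean, pstdev

STD = NormalDist()  # standard normal distribution

def two_step_rho(x, d, tol=1e-6):
    """Maximize the likelihood over rho with other parameters held fixed."""
    n = len(x)
    mu, sigma = fmean(x), pstdev(x)        # closed-form estimates (divisor N)
    z = [(xi - mu) / sigma for xi in x]
    levels = sorted(set(d))
    idx = [levels.index(di) for di in d]   # 0-based category of each d_j
    # Thresholds from cumulative proportions, used as fixed approximations.
    cum, taus = 0, []
    for g in range(len(levels) - 1):
        cum += idx.count(g)
        taus.append(STD.inv_cdf(cum / n))
    bounds = [float("-inf")] + taus + [float("inf")]

    def neg_loglik(rho):
        s = sqrt(1.0 - rho * rho)
        total = 0.0
        for zj, kj in zip(z, idx):
            upper = STD.cdf((bounds[kj + 1] - rho * zj) / s)
            lower = STD.cdf((bounds[kj] - rho * zj) / s)
            total -= log(max(upper - lower, 1e-300))  # guard against log(0)
        return total

    # Golden-section search for the minimum of neg_loglik on (-0.999, 0.999).
    a, b = -0.999, 0.999
    g = (sqrt(5.0) - 1.0) / 2.0
    c, e = b - g * (b - a), a + g * (b - a)
    fc, fe = neg_loglik(c), neg_loglik(e)
    while b - a > tol:
        if fc < fe:
            b, e, fe = e, c, fc
            c = b - g * (b - a)
            fc = neg_loglik(c)
        else:
            a, c, fc = c, e, fe
            e = a + g * (b - a)
            fe = neg_loglik(e)
    return (a + b) / 2.0

# Simulated data with a true latent correlation of 0.6 (illustrative only).
random.seed(1)
x, d = [], []
for _ in range(2000):
    zx = random.gauss(0.0, 1.0)
    y = 0.6 * zx + sqrt(1.0 - 0.36) * random.gauss(0.0, 1.0)
    x.append(10.0 + 2.0 * zx)
    d.append(1 if y < -0.5 else 2 if y < 0.7 else 3)

rho_hat = two_step_rho(x, d)
print(round(rho_hat, 3))
```

A full maximum likelihood fit would also update the thresholds at each iteration; fixing them keeps the sketch one-dimensional at the cost of some efficiency.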

For a vector of parameters, the information matrix is the negative of the Hessian matrix (the matrix of second partial derivatives of the log likelihood function), and is used in the computation of the maximum likelihood estimates of these parameters. The CORR procedure uses the observed information matrix (the information matrix evaluated at the current parameter estimates) in the computation. After the maximum likelihood estimates are derived, the asymptotic covariance matrix for these parameter estimates is computed as the inverse of the observed information matrix (the information matrix evaluated at the maximum likelihood estimates).
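The relationship between the observed information matrix and the asymptotic covariance matrix can be sketched in Python. To keep the example short, it uses a stand-in model (a normal sample with parameters $\mu$ and $\sigma^2$) rather than the polyserial likelihood, and it approximates the Hessian by central finite differences; the function names and step size are assumptions for illustration.

```python
from math import log, pi, sqrt

def loglik(theta, x):
    """Normal log likelihood in (mu, variance); a stand-in model."""
    mu, var = theta
    n = len(x)
    return -0.5 * n * log(2.0 * pi * var) - sum((xi - mu) ** 2 for xi in x) / (2.0 * var)

def observed_information(f, theta, h=1e-4):
    """Negative Hessian of f at theta, by central finite differences."""
    p = len(theta)
    info = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return f(t)
            second = (shifted(1, 1) - shifted(1, -1)
                      - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
            info[i][j] = -second
    return info

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mle = [5.0, 4.0]  # sample mean and divisor-N variance of x
info = observed_information(lambda t: loglik(t, x), mle)
cov = invert_2x2(info)
std_err_mu = sqrt(cov[0][0])
print(cov, std_err_mu)
```

For this stand-in model the result can be checked against the known values $\sigma^2/N$ and $2\sigma^4/N$ on the diagonal of the covariance matrix.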

Probability Values

The CORR procedure computes two tests of the null hypothesis that the polyserial correlation is zero: the Wald test and the likelihood ratio (LR) test.

Given the maximum likelihood estimate of the polyserial correlation $\hat{\rho }$ and its asymptotic standard error $\mathrm {StdErr}(\hat{\rho })$ , the Wald chi-square test statistic is computed as

$\left( \frac{\hat{\rho }}{\mathrm {StdErr}(\hat{\rho })} \right)^{2}$

The Wald statistic has an asymptotic chi-square distribution with one degree of freedom.
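A minimal Python sketch of the Wald computation follows. The function name and input values are illustrative assumptions. For one degree of freedom, the upper-tail chi-square probability equals $\mathrm{erfc}(\sqrt{w/2})$, so no statistics library is needed.

```python
from math import erfc, sqrt

def wald_test(rho_hat, std_err):
    """Wald chi-square statistic and p-value (1 degree of freedom)."""
    w = (rho_hat / std_err) ** 2
    # P(chi-square with 1 df > w) = 2 * (1 - Phi(sqrt(w))) = erfc(sqrt(w / 2))
    p_value = erfc(sqrt(w / 2.0))
    return w, p_value

w, p = wald_test(0.3, 0.1)  # hypothetical estimate and standard error
print(w, p)
```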

For the LR test, the maximized likelihood under the assumption of zero polyserial correlation is also needed. If $\rho =0$ , then D is independent of X, and the likelihood function reduces to

$L = \prod _{j=1}^{N} f( x_ j, d_ j) = \prod _{j=1}^{N} f(x_ j) \; \prod _{j=1}^{N} P(D=d_ j)$

In this case, the maximum likelihood estimates for all parameters can be derived explicitly. The maximum likelihood estimate for $\mu$ is the sample mean, and the maximum likelihood estimate for $\sigma ^2$ is the sample variance

$\frac{\sum _{j=1}^{N} (x_ j - \bar{x})^{2}}{N}$

In addition, the maximum likelihood estimate for the threshold $\tau _ k$ , $k=1, \ldots , K-1$ , is

$\Phi ^{-1} \left( \frac{\sum _{g=1}^{k} n_ g}{N} \right)$

where $n_ g$ is the number of observations in the $g^{th}$ ordered group of the ordinal variable $D$ , and $N=\sum _{g=1}^{K} n_ g$ is the total number of observations.
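The threshold formula amounts to applying the inverse normal distribution function to the cumulative proportions. A short Python sketch, with hypothetical group counts and an assumed function name:

```python
from itertools import accumulate
from statistics import NormalDist

def null_thresholds(counts):
    """Threshold estimates under rho = 0 from ordered group counts n_1..n_K."""
    n = sum(counts)
    cumulative = list(accumulate(counts))[:-1]   # drop the final total N
    return [NormalDist().inv_cdf(c / n) for c in cumulative]

taus = null_thresholds([20, 50, 30])  # K = 3 groups, N = 100
print(taus)  # approximately [-0.8416, 0.5244]
```

Every group must contain at least one observation; otherwise a cumulative proportion of 0 or 1 would make the inverse function undefined.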

The LR test statistic is computed as

$-2 \; \log \, \left( \frac{L_0}{L_1} \right)$

where $L_1$ is the likelihood function with the maximum likelihood estimates for all parameters, and $L_0$ is the likelihood function with the maximum likelihood estimates for all parameters except the polyserial correlation, which is set to 0. The LR statistic also has an asymptotic chi-square distribution with one degree of freedom.
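Given the two maximized log likelihoods, the LR statistic and its p-value can be sketched in Python as follows. The function name and the log likelihood values are hypothetical; working on the log scale avoids forming the ratio $L_0 / L_1$ directly, which can underflow for large samples.

```python
from math import erfc, sqrt

def lr_test(loglik0, loglik1):
    """LR chi-square statistic and p-value (1 degree of freedom)."""
    # -2 log(L0 / L1) = -2 (log L0 - log L1); clamp tiny negative rounding error.
    stat = max(0.0, -2.0 * (loglik0 - loglik1))
    p_value = erfc(sqrt(stat / 2.0))   # upper tail of chi-square with 1 df
    return stat, p_value

stat, p = lr_test(-105.0, -100.0)  # hypothetical log L0 and log L1
print(stat, p)
```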