

Polyserial correlation measures the correlation between two continuous variables with a bivariate normal distribution, where one variable is observed directly, and the other is unobserved. Information about the unobserved variable is obtained through an observed ordinal variable that is derived from the unobserved variable by classifying its values into a finite set of discrete, ordered values (Olsson, Drasgow, and Dorans, 1982).
Let $X$ be the observed continuous variable from a normal distribution with mean $\mu$ and variance $\sigma^2$, let $Y$ be the unobserved continuous variable, and let $\rho$ be the Pearson correlation between $X$ and $Y$. Furthermore, assume that an observed ordinal variable $D$ is derived from $Y$ as follows:

\[
D = \left\{ \begin{array}{ll}
  d_{(1)} & \mathrm{if} \;\; Y < \tau_1 \\
  d_{(k)} & \mathrm{if} \;\; \tau_{k-1} \leq Y < \tau_k, \quad k = 2, 3, \ldots, K-1 \\
  d_{(K)} & \mathrm{if} \;\; Y \geq \tau_{K-1}
\end{array} \right.
\]

where $d_{(1)} < d_{(2)} < \cdots < d_{(K)}$ are ordered observed values, and $\tau_1 < \tau_2 < \cdots < \tau_{K-1}$ are ordered unknown threshold values.
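For a concrete sense of this classification, here is a minimal Python sketch (an illustration only, with hypothetical thresholds; not part of the CORR procedure) that maps draws of the latent variable $Y$ to the ordinal codes $1, \ldots, K$:

```python
import numpy as np

# Hypothetical thresholds tau_1 < tau_2 < tau_3, giving K = 4 ordered categories
tau = np.array([-1.0, 0.0, 1.2])

rng = np.random.default_rng(1)
y = rng.standard_normal(1000)                   # latent Y (unobserved in practice)

# D = d_(k) when tau_{k-1} <= Y < tau_k; searchsorted(side="right") yields 0..K-1
d = np.searchsorted(tau, y, side="right") + 1   # ordinal codes 1, ..., K
```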
         
The likelihood function for the joint distribution $(X, D)$ from a sample of $n$ observations $(x_j, d_j)$, $j = 1, 2, \ldots, n$, is

\[
L = \prod_{j=1}^{n} \phi(x_j; \mu, \sigma) \, P(D = d_j \mid x_j)
\]

where $\phi(x_j; \mu, \sigma)$ is the normal density function with mean $\mu$ and standard deviation $\sigma$ (Drasgow, 1986).
         
The conditional distribution of $Y$ given $X = x_j$ is normal with mean $\rho z_j$ and variance $1 - \rho^2$, where $z_j = (x_j - \mu)/\sigma$ is a standard normal variate. Without loss of generality, assume that the variable $Y$ has a standard normal distribution. Then if $d_j = d_{(k)}$, the $k$th ordered value in $D$, the resulting conditional density is

\[
P(D = d_{(k)} \mid x_j) = \left\{ \begin{array}{ll}
  \Phi\!\left( \dfrac{\tau_1 - \rho z_j}{\sqrt{1-\rho^2}} \right) & \mathrm{if} \;\; k = 1 \\
  \Phi\!\left( \dfrac{\tau_k - \rho z_j}{\sqrt{1-\rho^2}} \right) - \Phi\!\left( \dfrac{\tau_{k-1} - \rho z_j}{\sqrt{1-\rho^2}} \right) & \mathrm{if} \;\; k = 2, 3, \ldots, K-1 \\
  1 - \Phi\!\left( \dfrac{\tau_{K-1} - \rho z_j}{\sqrt{1-\rho^2}} \right) & \mathrm{if} \;\; k = K
\end{array} \right.
\]

where $\Phi$ is the cumulative normal distribution function.
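These two displays are enough to evaluate the likelihood numerically. The following Python sketch (my own illustration, assuming NumPy and SciPy and the hypothetical coding $d_j \in \{1, \ldots, K\}$ from the snippet above; the function name polyserial_loglik is not from PROC CORR) computes the log likelihood for a given parameter vector:

```python
import numpy as np
from scipy.stats import norm

def polyserial_loglik(params, x, d):
    """Log likelihood of (X, D) for params = (mu, sigma, rho, tau_1, ..., tau_{K-1});
    x is the continuous variable and d the ordinal codes 1, ..., K (integer array)."""
    mu, sigma, rho = params[0], params[1], params[2]
    tau = np.concatenate(([-np.inf], params[3:], [np.inf]))  # add tau_0 = -inf, tau_K = +inf

    z = (x - mu) / sigma                      # standard normal variate z_j
    scale = np.sqrt(1.0 - rho ** 2)

    # P(D = d_j | x_j) = Phi((tau_{d_j} - rho z_j)/scale) - Phi((tau_{d_j - 1} - rho z_j)/scale)
    upper = norm.cdf((tau[d] - rho * z) / scale)
    lower = norm.cdf((tau[d - 1] - rho * z) / scale)

    # log phi(x_j; mu, sigma) + log P(D = d_j | x_j), summed over observations
    return np.sum(norm.logpdf(x, loc=mu, scale=sigma) + np.log(upper - lower))
```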
         
Cox (1974) derives the maximum likelihood estimates for all parameters $\mu$, $\sigma$, $\rho$, and $\tau_1, \ldots, \tau_{K-1}$. The maximum likelihood estimates for $\mu$ and $\sigma^2$ can be derived explicitly. The maximum likelihood estimate for $\mu$ is the sample mean, and the maximum likelihood estimate for $\sigma^2$ is the sample variance:

\[
\hat{\mu} = \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j, \qquad
\hat{\sigma}^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})^2
\]
The maximum likelihood estimates for the remaining parameters, including the polyserial correlation $\rho$ and the thresholds $\tau_1, \ldots, \tau_{K-1}$, can be computed by an iterative process, as described by Cox (1974). The asymptotic standard error of the maximum likelihood estimate of $\rho$ can also be computed after this process.
         
For a vector of parameters, the information matrix is the negative of the Hessian matrix (the matrix of second partial derivatives of the log likelihood function), and is used in the computation of the maximum likelihood estimates of these parameters. The CORR procedure uses the observed information matrix (the information matrix evaluated at the current parameter estimates) in the computation. After the maximum likelihood estimates are derived, the asymptotic covariance matrix for these parameter estimates is computed as the inverse of the observed information matrix (the information matrix evaluated at the maximum likelihood estimates).
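As a rough sketch of this estimation scheme (an illustration under the same hypothetical setup, not the CORR procedure's algorithm), the code below fixes $\mu$ and $\sigma^2$ at their explicit estimates and maximizes the log likelihood over $\rho$ and the thresholds with scipy.optimize.minimize; the BFGS inverse-Hessian approximation is used in place of the inverse observed information matrix, so the resulting standard error is only approximate.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_polyserial(x, d):
    """Sketch of the MLE: explicit estimates for mu and sigma^2, iterative search
    over (rho, tau_1, ..., tau_{K-1}); reuses polyserial_loglik from the earlier sketch."""
    mu_hat = x.mean()                              # sample mean
    sigma_hat = np.sqrt(x.var())                   # square root of the sample variance (divisor n)
    K = int(d.max())

    # Starting values: rho = 0 and thresholds from the cumulative category proportions
    counts = np.bincount(d, minlength=K + 1)[1:]
    tau_start = norm.ppf(np.cumsum(counts[:-1]) / len(d))

    # Optimize over a = atanh(rho) so that rho = tanh(a) stays inside (-1, 1)
    def negloglik(theta):
        params = np.concatenate(([mu_hat, sigma_hat, np.tanh(theta[0])], theta[1:]))
        return -polyserial_loglik(params, x, d)

    res = minimize(negloglik, np.concatenate(([0.0], tau_start)), method="BFGS")
    rho_hat = np.tanh(res.x[0])

    # BFGS's inverse-Hessian approximation stands in for the inverse observed
    # information matrix; the delta method maps the variance of atanh(rho) to rho.
    se_rho = (1.0 - rho_hat ** 2) * np.sqrt(res.hess_inv[0, 0])
    return rho_hat, se_rho, res
```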
The CORR procedure computes two tests of zero polyserial correlation: the Wald test and the likelihood ratio (LR) test.
Given the maximum likelihood estimate of the polyserial correlation $\hat{\rho}$ and its asymptotic standard error $\hat{\sigma}_{\hat{\rho}}$, the Wald chi-square test statistic is computed as

\[
\left( \frac{\hat{\rho}}{\hat{\sigma}_{\hat{\rho}}} \right)^2
\]

The Wald statistic has an asymptotic chi-square distribution with one degree of freedom.
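In the hypothetical Python setting above (rho_hat and se_rho come from the fitting sketch), the Wald test is one line plus a p-value:

```python
from scipy.stats import chi2

wald = (rho_hat / se_rho) ** 2        # Wald chi-square statistic
p_wald = chi2.sf(wald, df=1)          # asymptotic chi-square with 1 degree of freedom
```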
For the LR test, the maximized likelihood function under the assumption of zero polyserial correlation is also needed. If $\rho = 0$, the likelihood function reduces to

\[
L_0 = \prod_{j=1}^{n} \phi(x_j; \mu, \sigma) \, P(D = d_j)
\]
In this case, the maximum likelihood estimates for all parameters can be derived explicitly. The maximum likelihood estimates for $\mu$ and $\sigma^2$ are again the sample mean $\bar{x}$ and the sample variance $\frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})^2$, respectively.
In addition, the maximum likelihood estimate for the threshold $\tau_k$, $k = 1, 2, \ldots, K-1$, is

\[
\hat{\tau}_k = \Phi^{-1}\!\left( \frac{1}{n} \sum_{i=1}^{k} n_i \right)
\]

where $n_i$ is the number of observations in the $i$th ordered group of the ordinal variable $D$, and $n = \sum_{i=1}^{K} n_i$ is the total number of observations.
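These closed-form estimates under $\rho = 0$ are easy to reproduce; continuing the same hypothetical Python setup:

```python
import numpy as np
from scipy.stats import norm

counts = np.bincount(d, minlength=int(d.max()) + 1)[1:]   # group counts n_1, ..., n_K
n = counts.sum()                                          # total number of observations

mu0_hat = x.mean()                                        # sample mean
sigma0_hat = np.sqrt(x.var())                             # root of the sample variance (divisor n)
tau0_hat = norm.ppf(np.cumsum(counts[:-1]) / n)           # tau_k = Phi^{-1}( sum_{i<=k} n_i / n )
```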
            
The LR test statistic is computed as

\[
G^2 = -2 \, \ln \left( \frac{L_0}{L_1} \right)
\]

where $L_1$ is the likelihood function with the maximum likelihood estimates for all parameters, and $L_0$ is the likelihood function with the maximum likelihood estimates for all parameters except the polyserial correlation, which is set to 0. The LR statistic also has an asymptotic chi-square distribution with one degree of freedom.
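Assembled from the earlier hypothetical sketches (polyserial_loglik, fit_polyserial, and the $\rho = 0$ estimates mu0_hat, sigma0_hat, tau0_hat), the LR test looks like this:

```python
import numpy as np
from scipy.stats import chi2

rho_hat, se_rho, res = fit_polyserial(x, d)     # full maximum likelihood fit
tau_hat = res.x[1:]                             # estimated thresholds under the full model

# Maximized log likelihoods with and without the zero-correlation restriction
loglik_1 = polyserial_loglik(np.concatenate(([mu0_hat, sigma0_hat, rho_hat], tau_hat)), x, d)
loglik_0 = polyserial_loglik(np.concatenate(([mu0_hat, sigma0_hat, 0.0], tau0_hat)), x, d)

lr = -2.0 * (loglik_0 - loglik_1)               # G^2 = -2 ln(L_0 / L_1)
p_lr = chi2.sf(lr, df=1)                        # asymptotic chi-square with 1 degree of freedom
```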