The HPCORR Procedure

Pearson Product-Moment Correlation

The Pearson product-moment correlation is a parametric measure of association for two variables. It measures both the strength and the direction of a linear relationship. If one variable X is an exact linear function of another variable Y, the correlation is 1 if the relationship is positive and $-1$ if it is negative. If there is no linear predictability between the two variables, the correlation is 0. If the two variables are jointly normal with a correlation of 0, they are independent. Correlation does not imply causality because, in some cases, an underlying causal relationship might not exist.

The formula for the population Pearson product-moment correlation, denoted ${\rho }_{xy}$, is

\[  {\rho }_{xy}=\frac{\mathrm{Cov}(x,y)}{\sqrt {\mathrm{V}(x) \, \mathrm{V}(y)}} = \frac{\mathrm{E}\left[ (x - \mathrm{E}(x)) (y - \mathrm{E}(y)) \right]}{\sqrt {\mathrm{E}\left[ (x-\mathrm{E}(x))^{2} \right] \,  \mathrm{E}\left[ (y-\mathrm{E}(y))^{2} \right]}}  \]

The sample correlation, such as a Pearson product-moment correlation or weighted product-moment correlation, estimates the population correlation. The formula for the sample Pearson product-moment correlation is as follows, where $\bar{x}$ is the sample mean of $x$ and $\bar{y}$ is the sample mean of $y$:

\[  r_{xy}=\frac{\sum _ i ( \, (x_ i-\bar{x})(y_ i-\bar{y})\, )}{\sqrt {\sum _{i}(x_ i-\bar{x})^{2} \,  \sum _{i}(y_ i-\bar{y})^2}}  \]
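The sample formula can be illustrated with a minimal Python sketch. It is not PROC HPCORR code; the function name `pearson_r` and the sample data are hypothetical and chosen only to mirror the symbols in the formula above.

```python
import math

def pearson_r(x, y):
    """Sample Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the sample means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: exact linear functions of x give correlations of 1 and -1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_increasing = pearson_r(x, [2.0 * v + 1.0 for v in x])
r_decreasing = pearson_r(x, [-3.0 * v + 2.0 for v in x])
print(r_increasing, r_decreasing)
```

The two calls reproduce the limiting cases described earlier: an exact increasing linear function yields a correlation of 1, and an exact decreasing one yields $-1$.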

The formula for a weighted Pearson product-moment correlation is as follows, where $w_ i$ is the weight, $\bar{x}_ w$ is the weighted mean of $x$, and $\bar{y}_ w$ is the weighted mean of $y$:

\[  r_{xy}=\frac{\sum _ i \,  w_ i(x_ i-\bar{x}_ w)(y_ i-\bar{y}_ w)}{\sqrt {\sum _ i w_ i(x_ i-\bar{x}_ w)^2 \,  \sum _ i w_ i(y_ i-\bar{y}_ w)^2}}  \]
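As a hedged illustration of the weighted formula, the following Python sketch computes the weighted means first and then the weighted sums of squares and cross-products. The function name `weighted_pearson_r` and the data are assumptions made for this example. The sketch also assumes nonnegative weights with a positive sum; it does not reproduce how PROC HPCORR treats missing or nonpositive weights.

```python
import math

def weighted_pearson_r(x, y, w):
    """Weighted Pearson correlation; w holds one nonnegative weight per observation."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    w_sum = sum(w)
    if w_sum <= 0:
        raise ValueError("weights must have a positive sum")
    # Weighted means of x and y.
    x_bar_w = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar_w = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    # Weighted cross-products and sums of squares about the weighted means.
    sxy = sum(wi * (xi - x_bar_w) * (yi - y_bar_w) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar_w) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar_w) ** 2 for wi, yi in zip(w, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a weight of 2 on the first observation behaves like
# listing that observation twice with unit weights.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
r_weighted = weighted_pearson_r(x, y, [2.0, 1.0, 1.0, 1.0])
r_replicated = weighted_pearson_r([1.0] + x, [2.0] + y, [1.0] * 5)
print(r_weighted, r_replicated)
```

With all weights equal, the result reduces to the unweighted sample correlation, and multiplying every weight by the same positive constant leaves the result unchanged because the constant cancels between numerator and denominator.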

Probability Values

Probability values for the Pearson correlation are computed by treating the following statistic as if it came from a t distribution with $(n-2)$ degrees of freedom, where $r$ is the sample correlation and $n$ is the number of observations used to compute it:

\[  t \,  = \,  {(n-2)}^{1/2} \,  {\left(\frac{r^{2}}{1-r^{2}}\right)}^{1/2}  \]
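A small Python sketch of the statistic follows. The function name `pearson_t_statistic` and the input values are hypothetical. The sketch stops at the statistic itself, because the Python standard library has no t distribution; as an assumption outside this document, a two-sided probability value could be obtained from a statistics library, for example with SciPy's `scipy.stats.t.sf(t, n - 2) * 2`.

```python
import math

def pearson_t_statistic(r, n):
    """t statistic for a sample correlation r computed from n observations."""
    if n < 3:
        raise ValueError("at least 3 observations are needed for n - 2 > 0")
    if not -1.0 < r < 1.0:
        # The ratio r^2 / (1 - r^2) is unbounded when |r| = 1.
        raise ValueError("r must lie strictly between -1 and 1")
    return math.sqrt(n - 2) * math.sqrt(r * r / (1.0 - r * r))

# Hypothetical inputs: r = 0.6 from n = 10 observations.
t_value = pearson_t_statistic(0.6, 10)
print(t_value)
```

Because the formula uses $r^{2}$, the statistic is nonnegative and is the same for $r$ and $-r$, which is why the sketch suggests a two-sided tail probability.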

The partial variance-covariance matrix is calculated with the variance divisor (specified in the VARDEF= option). PROC HPCORR then uses the standard Pearson correlation formula on the partial variance-covariance matrix to calculate the Pearson partial correlation matrix.

When a correlation matrix is positive definite, the resulting partial correlation between variables $x$ and $y$ after adjusting for a single variable $z$ is identical to that obtained from the following first-order partial correlation formula, where $r_{xy}$, $r_{xz}$, and $r_{yz}$ are the appropriate correlations:

\[  r_{xy.z}=\frac{r_{xy}-r_{xz}r_{yz}}{\sqrt {(1-r^{2}_{xz})(1-r^{2}_{yz})}}  \]
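The first-order formula translates directly into code. The following Python sketch is illustrative only; the function name `partial_r_first_order` and the three correlation values are hypothetical (they do form a positive definite correlation matrix).

```python
import math

def partial_r_first_order(r_xy, r_xz, r_yz):
    """Partial correlation of x and y after adjusting for a single variable z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        # Happens when x or y is perfectly correlated with z.
        raise ValueError("partial correlation is undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical correlations: x and y are each strongly related to z, so most
# of their raw correlation of 0.5 disappears once z is adjusted for.
r_xy_given_z = partial_r_first_order(0.5, 0.6, 0.8)
print(r_xy_given_z)
```

When both $r_{xz}$ and $r_{yz}$ are 0, the formula returns $r_{xy}$ unchanged, since adjusting for an unrelated variable has no effect.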

The formula for higher-order partial correlations is a straightforward extension of the preceding first-order formula. For example, when the correlation matrix is positive definite, the partial correlation between $x$ and $y$ that controls for both $z_1$ and $z_2$ is identical to the following second-order partial correlation formula, where $r_{xy.z_1}$, $r_{xz_2.z_1}$, and $r_{yz_2.z_1}$ are first-order partial correlations among variables $x$, $y$, and $z_2$ given $z_1$:

\[  r_{xy.z_1z_2} = \frac{r_{xy.z_1}-r_{xz_2.z_1}r_{yz_2.z_1}}{\sqrt {(1-r^2_{xz_2.z_1})(1-r^2_{yz_2.z_1})}}  \]
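The second-order formula applies the first-order formula again to first-order partial correlations, as the following Python sketch shows. The function names, the index convention, and the 4 x 4 correlation matrix `R` are all hypothetical and serve only as an illustration. The sketch does not reflect how PROC HPCORR computes partial correlations internally, which the text above describes in terms of the partial variance-covariance matrix.

```python
import math

def first_order(r_ab, r_ac, r_bc):
    """Partial correlation of a and b after adjusting for c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))

def second_order(R, x, y, z1, z2):
    """Partial correlation of x and y controlling for z1 and z2.

    R is a symmetric correlation matrix (list of lists); x, y, z1, and z2
    are row/column indices into R.
    """
    # First-order partial correlations among x, y, and z2, each given z1.
    r_xy_z1 = first_order(R[x][y], R[x][z1], R[y][z1])
    r_xz2_z1 = first_order(R[x][z2], R[x][z1], R[z2][z1])
    r_yz2_z1 = first_order(R[y][z2], R[y][z1], R[z2][z1])
    # Apply the same formula once more to the first-order partials.
    return first_order(r_xy_z1, r_xz2_z1, r_yz2_z1)

# Hypothetical positive definite correlation matrix; order: x, y, z1, z2.
R = [
    [1.0, 0.5, 0.3, 0.2],
    [0.5, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]
r_given_z1_z2 = second_order(R, 0, 1, 2, 3)
r_given_z2_z1 = second_order(R, 0, 1, 3, 2)
print(r_given_z1_z2, r_given_z2_z1)
```

The two calls adjust for the same pair of variables in opposite order and agree up to floating-point rounding, which is a quick consistency check on the recursion.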