The HPPLS Procedure


Note that each extracted PLS factor is defined in terms of different X-variables $\mb{X}_ i$. This makes it difficult to compare scores, weights, and so on across factors. The SIMPLS method of De Jong (1993) overcomes these difficulties by computing each score $\mb{t}_ i=\mb{X}\mb{r}_ i$ in terms of the original (centered and scaled) predictors $\mb{X}$. The SIMPLS X-weight vectors $\mb{r}_ i$ are similar to the eigenvectors of $\mb{S}\mb{S}’ = \mb{X}’\mb{Y}\mb{Y}’\mb{X}$, but they satisfy a different orthogonality condition. The $\mb{r}_1$ vector is just the first eigenvector $\mb{e}_1$ (so that the first SIMPLS score is the same as the first PLS score). However, the second eigenvector maximizes

\[  \mb{e}_1’\bS \bS ’\mb{e}_2~  \mbox{subject to}~ \mb{e}_1’\mb{e}_2=0  \]

whereas the second SIMPLS weight $\mb{r}_2$ maximizes

\[  \mb{r}_1’\bS \bS ’\mb{r}_2~  \mbox{subject to}~ \mb{r}_1’\bX ’\bX \mb{r}_2 = \mb{t}_1’\mb{t}_2 = 0  \]
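The difference between the two orthogonality conditions can be checked numerically. The following NumPy sketch is illustrative only and is not the HPPLS implementation: the data are simulated, all variable names are hypothetical, and the second SIMPLS weight is obtained by the deflation construction described in De Jong (1993), in which $\mb{S}$ is projected orthogonally to the first X-loading before its dominant eigenvector is taken.

```python
import numpy as np

# Simulated, centered and scaled data (an assumption made for illustration).
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 5))
Y = X @ rng.standard_normal((5, 3)) + 0.5 * rng.standard_normal((40, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

S = X.T @ Y  # cross-product matrix S = X'Y

# Eigenvectors of SS' (eigh returns eigenvalues in ascending order).
_, E = np.linalg.eigh(S @ S.T)
e1, e2 = E[:, -1], E[:, -2]

# First SIMPLS weight: the dominant eigenvector itself.
r1 = e1
t1 = X @ r1

# Second SIMPLS weight: dominant eigenvector of SS' after projecting S
# orthogonally to the first X-loading direction p1 = X't1.
p1 = X.T @ t1
v1 = p1 / np.linalg.norm(p1)
S2 = S - np.outer(v1, v1 @ S)
_, E2 = np.linalg.eigh(S2 @ S2.T)
r2 = E2[:, -1]
t2 = X @ r2

weights_dot = float(e1 @ e2)                 # e1'e2: zero for eigenvectors
eig_scores_dot = float((X @ e1) @ (X @ e2))  # scores from e1, e2: not orthogonal in general
simpls_scores_dot = float(t1 @ t2)           # t1't2 = r1'X'X r2: zero for SIMPLS

print(weights_dot, eig_scores_dot, simpls_scores_dot)
```

In this sketch the eigenvectors are orthogonal as weight vectors, whereas the SIMPLS weights are chosen so that the resulting scores are orthogonal.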

The SIMPLS scores are identical to the PLS scores for one response but slightly different for more than one response; see De Jong (1993) for details. The X- and Y-loadings are defined as in PLS, but because the scores are all defined in terms of $\mb{X}$, it is easy to compute the overall model coefficients $\mb{B}$:

\begin{eqnarray*}  \hat{\mb{Y}} &  = &  \sum _ i \mb{t}_ i\mb{c}_ i’ \\ &  = &  \sum _ i \mb{X}\mb{r}_ i\mb{c}_ i’ \\ &  = &  \mb{X}\mb{B},~ \textrm{where}~ \mb{B}~ =~ \mb{R}\mb{C}’ \end{eqnarray*}
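The chain of equalities above can be traced in a short NumPy sketch of the SIMPLS iteration. This is a minimal illustration under stated assumptions, not the code used by the procedure: the function name `simpls` and its arguments are hypothetical, the scores are normalized to unit length, and the Y-loadings are taken as $\mb{c}_ i = \mb{Y}’\mb{t}_ i$ (the regression of $\mb{Y}$ on orthonormal scores), following the deflation scheme in De Jong (1993).

```python
import numpy as np

def simpls(X, Y, n_factors):
    """Sketch of SIMPLS for centered (and scaled) X (n x p) and Y (n x m).

    Returns X-weights R, scores T, Y-loadings C, and coefficients B = R C'.
    """
    n, p = X.shape
    m = Y.shape[1]
    S = X.T @ Y                       # cross-product matrix S = X'Y
    R = np.zeros((p, n_factors))      # X-weights r_i
    T = np.zeros((n, n_factors))      # scores t_i = X r_i
    C = np.zeros((m, n_factors))      # Y-loadings c_i
    V = np.zeros((p, n_factors))      # orthonormal basis of the X-loadings
    for a in range(n_factors):
        # Dominant eigenvector of SS' = dominant left singular vector of S.
        U, _, _ = np.linalg.svd(S, full_matrices=False)
        r = U[:, 0]
        t = X @ r
        norm_t = np.linalg.norm(t)
        r, t = r / norm_t, t / norm_t  # scale so that t't = 1
        p_load = X.T @ t               # X-loading
        c = Y.T @ t                    # Y-loading (t has unit length)
        # Gram-Schmidt step: orthogonalize the new loading against earlier ones.
        v = p_load - V[:, :a] @ (V[:, :a].T @ p_load)
        v = v / np.linalg.norm(v)
        # Deflate S so that later weights give scores orthogonal to t.
        S = S - np.outer(v, v @ S)
        R[:, a], T[:, a], C[:, a], V[:, a] = r, t, c, v
    B = R @ C.T
    return R, T, C, B

# Usage on simulated data (an assumption made for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
Y = rng.standard_normal((50, 2))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

R, T, C, B = simpls(X, Y, n_factors=3)
Y_hat_sum = sum(np.outer(T[:, i], C[:, i]) for i in range(3))  # sum_i t_i c_i'
Y_hat_coef = X @ B                                             # X B with B = R C'
print(np.allclose(Y_hat_sum, Y_hat_coef))
```

Because every score is computed as $\mb{X}\mb{r}_ i$ from the same matrix $\mb{X}$, the weights and loadings collect directly into $\mb{R}$ and $\mb{C}$, and no back-transformation through deflated predictor matrices is needed to obtain $\mb{B}$.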