Organization of the Input Data Set

The MVPMODEL procedure treats each observation in the DATA= data set as an individual multivariate observation. The observations do not need to be identified or sorted by time, because the sequence of the data is not used to build the principal components model. However, you should provide a time variable in the input data set: this variable is preserved in the OUT= data set, and the MVPMONITOR procedure subsequently requires it to create control charts.

In basic applications of the MVPMODEL procedure, the observations in the DATA= data set represent measurements from a single process. You can build different principal components models for two or more processes by grouping their measurements in the DATA= data set and processing them as BY groups.
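
Conceptually, BY-group processing partitions the observations by the grouping variable and handles each partition independently. The following Python sketch illustrates only that partitioning idea. It is not SAS code and it does not fit a principal components model: the per-group column means stand in for whatever modeling would be done on each group. The names process, x1, and x2 and all data values are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (process identifier, x1, x2). Values are made up.
rows = [
    ("P1", 1.0, 10.0),
    ("P2", 5.0, 50.0),
    ("P1", 3.0, 14.0),
    ("P2", 7.0, 54.0),
]

# Partition the observations by process, as BY-group processing would.
by_group = defaultdict(list)
for process, *measurements in rows:
    by_group[process].append(measurements)


def column_means(observations):
    """Mean of each measurement variable (a placeholder for model fitting)."""
    return [sum(column) / len(column) for column in zip(*observations)]


# Each group is summarized separately; no group influences another.
group_means = {
    process: column_means(observations)
    for process, observations in sorted(by_group.items())
}
```

Because each group is processed on its own, the result for one process is unaffected by the measurements of another, which is the opposite of the pooled approach used for a common model.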

In some applications, it is desirable to combine the data from two or more processes and build a common principal components model. This might be the case with processes that are peers in the sense that they are believed to share the same pattern of common cause variation. When you provide the MVPMONITOR procedure with a common model for a set of peer processes, it uses the model to construct identical control limits for each process. This enables you to decide whether a particular process exhibits unusual variation relative to the behavior of its peers.

To build a common model from measurements for multiple peer processes, your input data set should include a variable that identifies the peer processes and a time variable, along with the process measurement variables that are specified in the VAR statement. You should sort the observations so that they are ordered first by time and then by peer. This enables you to specify the time variable with the TIMEGROUP= option, which requests that the mean and variance of the SPE statistic be computed for each time value. PROC MVPMONITOR then uses these statistics to compute the control limits for SPE charts, as discussed in Example 10.1. A different method is used to compute the control limits when each time group contains only a single observation.
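
The ordering and the per-time summary described above can be illustrated with a short Python sketch. It is not SAS code and does not reproduce the procedure's internal computations. The record layout (time, peer, spe) and the SPE values are hypothetical, and the use of the sample variance (divisor n - 1) is an assumption made only for illustration.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean, variance

# Hypothetical records: (time, peer, spe). SPE values are made up.
records = [
    (2, "B", 1.4), (1, "A", 0.8), (1, "C", 1.0),
    (2, "A", 1.0), (1, "B", 1.2), (2, "C", 1.8),
]

# Order the observations first by time and then by peer.
ordered = sorted(records, key=itemgetter(0, 1))

# For each time value, compute the mean and the sample variance of SPE
# across the peer processes (assumption: divisor n - 1).
spe_stats = {}
for time, rows in groupby(ordered, key=itemgetter(0)):
    values = [spe for _, _, spe in rows]
    spe_stats[time] = (mean(values), variance(values))

for time, (spe_mean, spe_var) in spe_stats.items():
    print(f"time={time} mean={spe_mean:.2f} variance={spe_var:.2f}")
```

Sorting by time first places all peers for a given time value in adjacent rows, so each time group can be summarized in a single pass. A time group with only one observation has no sample variance, which is consistent with a different method being needed in that case.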


Note: This procedure is experimental.