The MVPMONITOR procedure accepts a single primary input data set of one of three types.
A DATA= data set contains new process data to be analyzed by using an existing PCA model (Phase II analysis).
A HISTORY= data set contains process data and the accompanying scores, residuals, and statistics produced by applying a PCA model. The process data can be the original data that was used to create the model (Phase I analysis) or subsequent data that was analyzed by using a previously created model (Phase II analysis).
A TABLE= data set contains a summary of an SPE or T² control chart, consisting of the statistics, control limits, and other information.
These options are mutually exclusive. If you do not specify an option identifying a primary input data set, PROC MVPMONITOR uses the most recently created SAS data set as a DATA= data set.
When you specify a DATA= data set, you must also specify a LOADINGS= data set that contains loadings and other information describing the PCA model. When you specify a HISTORY= data set, you must also specify a LOADINGS= data set if you specify the CONTRIBUTIONS option in a TSQUARECHART statement.
A DATA= data set provides the process measurement data for a Phase II analysis. In addition to the process variables, a DATA= data set can include the following:
When you specify a DATA= data set, you must also specify a LOADINGS= data set that contains the loadings for the principal component model that describes the variation of the process. These loadings are used to score the new data from the DATA= data set. The process variables in the LOADINGS= data set must have the same names as those in the DATA= data set.
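To make the scoring step concrete, the following Python sketch applies a PCA model to one new observation. It is an illustration only, not SAS code and not the procedure's implementation. It assumes the conventional PCA monitoring definitions: scores are projections of the centered and scaled values onto the loadings, residuals are the part the retained components do not reconstruct, SPE is the sum of squared residuals, and T² is the sum of squared scores divided by the corresponding eigenvalues. The function name `score_observation` and all numbers are hypothetical.

```python
def score_observation(x, means, stds, loadings, eigenvalues):
    """Apply a PCA model to one observation (illustrative sketch only)."""
    # Center and scale with the model's means and standard deviations.
    z = [(xi - m) / s for xi, m, s in zip(x, means, stds)]
    # Each score is the projection of z onto one loading vector.
    scores = [sum(p * zi for p, zi in zip(load, z)) for load in loadings]
    # Reconstruct z from the retained components; the remainder is the residual.
    z_hat = [sum(t * load[k] for t, load in zip(scores, loadings))
             for k in range(len(z))]
    residuals = [zi - zh for zi, zh in zip(z, z_hat)]
    # Assumed definitions: SPE = sum of squared residuals,
    # T-square = sum of squared scores divided by their eigenvalues.
    spe = sum(r * r for r in residuals)
    tsquare = sum(t * t / ev for t, ev in zip(scores, eigenvalues))
    return scores, residuals, spe, tsquare


# Hypothetical two-variable model with one retained principal component.
means, stds = [10.0, 20.0], [2.0, 4.0]
loadings = [[0.6, 0.8]]   # one unit-length loading vector
eigenvalues = [1.5]
scores, residuals, spe, tsquare = score_observation(
    [12.0, 28.0], means, stds, loadings, eigenvalues)
```

Because the loadings are matched to the data by variable name, the sketch relies on `x`, `means`, `stds`, and each loading vector listing the process variables in the same order.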
A HISTORY= data set provides the input data set for a Phase I or Phase II analysis. In addition to the original process variables, it contains principal component scores, residuals, SPE and T² statistics, and a count of the observations that were used to construct the principal component model, as summarized in Table 13.4.
Table 13.4: Variables in the HISTORY= Data Set
| Variable | Description |
|---|---|
| Prin1–Prinj | Principal component scores |
| R_var1–R_varp | Residuals, one for each process variable |
| _NOBS_ | Number of observations used to build the principal component model |
| _SPE_ | Squared prediction error (SPE) |
| _TSQUARE_ | T² statistic computed from principal component scores |
A HISTORY= data set must include variables that contain principal component scores. The score variable names must consist of a common prefix followed by the numbers 1, 2, …, j, where j is the number of principal components. By default, the common prefix is Prin. You can use the PREFIX= option to specify another prefix for score variables.
If the number of principal components is less than the total number of process variables, the HISTORY= data set should also contain residual variables. A residual variable name consists of a common prefix followed by the corresponding process variable name. The default residual variable prefix is R_. For example, if the process variables are A, B, and C, the default residual variable names are R_A, R_B, and R_C. You can use the RPREFIX= option to specify a different residual variable prefix.
Note: Usually you create a HISTORY= data set by specifying the PROC MVPMODEL OUT= option or the PROC MVPMONITOR OUTHISTORY= option. If the PREFIX= or RPREFIX= option is used when such an output data set is created, you must specify the same prefixes to identify the score and residual variables when you read it as a HISTORY= data set.
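The naming rules for score and residual variables can be summarized in a short Python sketch. It only illustrates the convention described above; the helper name `expected_history_names` and its parameters are hypothetical and are not part of SAS.

```python
def expected_history_names(process_vars, n_components,
                           prefix="Prin", rprefix="R_"):
    """Return the score and residual variable names a HISTORY= data set
    is expected to contain (illustrative sketch of the naming rules)."""
    # Score variables: common prefix followed by 1, 2, ..., j.
    scores = [f"{prefix}{i}" for i in range(1, n_components + 1)]
    # Residual variables are expected only when fewer components than
    # process variables are retained: prefix + process variable name.
    if n_components < len(process_vars):
        residuals = [f"{rprefix}{name}" for name in process_vars]
    else:
        residuals = []
    return scores, residuals


score_names, residual_names = expected_history_names(["A", "B", "C"], 2)
```

With the defaults, this gives the score names Prin1 and Prin2 and the residual names R_A, R_B, and R_C. Passing different `prefix` and `rprefix` values mirrors what the PREFIX= and RPREFIX= options do.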
The LOADINGS= data set contains the following information about the principal component model:
eigenvalues of the correlation or covariance matrix used to construct the model
principal component loadings
process variable means used to center the variable values
process variable standard deviations used to scale the variable values
You can produce a LOADINGS= data set by using the PROC MVPMODEL OUTLOADINGS= option. Table 13.5 lists the variables that are required in a LOADINGS= data set.
Table 13.5: Variables in the LOADINGS= Data Set
| Variable | Description |
|---|---|
| _VALUE_ | Type of value contained in the process variables for a given observation |
| _NOBS_ | Number of observations used to build the principal component model |
| _PC_ | Principal component number; 0 for the observation that contains eigenvalues |
| process variables | Values associated with the process variables |
Valid values for the _VALUE_ variable are as follows:
EIGEN: eigenvalues from the principal component analysis
LOADING: principal component loadings
MEAN: process variable means
STD: process variable standard deviations
The LOADINGS= data set contains one EIGEN observation and j LOADING observations, where j is the number of principal components in the model. The presence of a MEAN observation indicates that the process variables were centered when the principal component model was constructed, and the presence of a STD observation indicates that the process variables were scaled when the principal component model was constructed. The means and standard deviations are used to center and scale new data in a Phase II analysis.
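As an illustration of this layout, the following Python sketch splits LOADINGS=-style rows into the pieces of the model. It represents each observation as a dictionary keyed by variable name. The function name `split_loadings`, the sample values, and the omission of `_PC_` from the MEAN row are assumptions made for the example. How eigenvalues are arranged across the process variable columns of the EIGEN observation is not described here, so the sketch returns that row unchanged.

```python
def split_loadings(rows, process_vars):
    """Split LOADINGS=-style rows into model pieces (illustrative sketch)."""
    by_type = {"EIGEN": [], "LOADING": [], "MEAN": [], "STD": []}
    for row in rows:
        # An unrecognized _VALUE_ raises KeyError, which is intentional here.
        by_type[row["_VALUE_"]].append(row)
    if len(by_type["EIGEN"]) != 1:
        raise ValueError("expected exactly one EIGEN observation")

    def values(row):
        return [row[name] for name in process_vars]

    # Order the loading vectors by principal component number.
    loading_rows = sorted(by_type["LOADING"], key=lambda r: r["_PC_"])
    return {
        "eigen": values(by_type["EIGEN"][0]),
        "loadings": [values(r) for r in loading_rows],
        # None means the variables were not centered / not scaled.
        "means": values(by_type["MEAN"][0]) if by_type["MEAN"] else None,
        "stds": values(by_type["STD"][0]) if by_type["STD"] else None,
    }


# Hypothetical rows for a model with one component and no scaling.
rows = [
    {"_VALUE_": "EIGEN", "_PC_": 0, "A": 1.5, "B": 0.5},
    {"_VALUE_": "LOADING", "_PC_": 1, "A": 0.6, "B": 0.8},
    {"_VALUE_": "MEAN", "A": 10.0, "B": 20.0},
]
model = split_loadings(rows, ["A", "B"])
```

In this example `model["stds"]` is `None` because no STD observation is present, which corresponds to a model built without scaling.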
A TABLE= data set contains a summary of an SPE or T² control chart. Usually, you create a TABLE= data set by specifying the OUTTABLE= option in an SPECHART or TSQUARECHART statement. A TABLE= data set contains the variables listed in Table 13.6.
Table 13.6: Variables in an OUTTABLE= Data Set
| Variable | Description |
|---|---|
| | Probability (α) of exceeding control limits |
| | Flag to indicate control limit was exceeded |
| _LCL_ | Lower control limit |
| | Center line |
| _SPE_ | Squared prediction error (SPE) statistic (SPECHART statement only) |
| time | Optional TIME variable |
| _TSQUARE_ | T² statistic (TSQUARECHART statement only) |
| _UCL_ | Upper control limit |
A TABLE= data set must contain either an _SPE_ or _TSQUARE_ variable but not both. When you use a TABLE= input data set, you can specify only chart statements that correspond to the statistic in the data set.
You can use a TABLE= data set to display a previously created control chart or to specify custom control limits by computing your own _LCL_ and _UCL_ values.
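The idea of supplying your own limits can be sketched in Python as follows. This is only an illustration of attaching custom _LCL_ and _UCL_ values to a series of SPE statistics; the function name `with_custom_limits` and the numbers are hypothetical, and a real TABLE= data set also carries the other variables listed in Table 13.6.

```python
def with_custom_limits(spe_values, lcl, ucl):
    """Build TABLE=-style rows with user-chosen control limits and report
    which observations fall outside them (illustrative sketch)."""
    if lcl > ucl:
        raise ValueError("lower limit must not exceed upper limit")
    rows = [{"_SPE_": v, "_LCL_": lcl, "_UCL_": ucl} for v in spe_values]
    # Positions (0-based) of statistics outside the custom limits.
    outside = [i for i, v in enumerate(spe_values) if v < lcl or v > ucl]
    return rows, outside


# Hypothetical SPE values and limits chosen by the analyst.
table_rows, outside = with_custom_limits([0.4, 0.9, 1.7], lcl=0.0, ucl=1.0)
```

Each row holds only an _SPE_ value and no _TSQUARE_ value, consistent with the rule that a TABLE= data set contains one statistic or the other but not both.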