MANOVA Statement
If the MODEL statement includes more than one dependent variable, you can perform multivariate analysis of variance with the MANOVA statement. The test-options define which effects to test, while the detail-options specify how to execute the tests and what results to display.
When a MANOVA statement appears before the first RUN statement, PROC ANOVA enters a multivariate mode with respect to the handling of missing values; in addition to observations with missing independent variables, observations with any missing dependent variables are excluded from the analysis. If you want to use this mode of handling missing values but do not need any multivariate analyses, specify the MANOVA option in the PROC ANOVA statement.
You can specify the following options in the MANOVA statement as test-options in order to define which multivariate tests to perform.
H=effects | INTERCEPT | _ALL_
specifies effects in the preceding model to use as hypothesis matrices. For each SSCP matrix H associated with an effect, the H= specification computes an analysis based on the characteristic roots of E^{-1}H, where E is the matrix associated with the error effect. The characteristic roots and vectors are displayed, along with the Hotelling-Lawley trace, Pillai’s trace, Wilks’ lambda, and Roy’s greatest root. By default, these statistics are tested with approximations based on the F distribution. To test them with exact (but computationally intensive) calculations, use the MSTAT=EXACT option.
Use the keyword INTERCEPT to produce tests for the intercept. To produce tests for all effects listed in the MODEL statement, use the keyword _ALL_ in place of a list of effects.
For background and further details, see the section Multivariate Analysis of Variance in Chapter 41, The GLM Procedure.
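As a reminder of how the four statistics relate to the characteristic roots, the following block gives their standard textbook definitions. The notation is introduced here for illustration: H is the hypothesis SSCP matrix, E is the error SSCP matrix, p is the number of dependent variables, and the roots of E^{-1}H are ordered so that \lambda_1 is the largest.

```latex
\begin{align*}
\text{Wilks' lambda:} \quad
  & \Lambda = \frac{\det(\mathbf{E})}{\det(\mathbf{H}+\mathbf{E})}
            = \prod_{i=1}^{p} \frac{1}{1+\lambda_i} \\
\text{Pillai's trace:} \quad
  & V = \operatorname{trace}\!\left(\mathbf{H}(\mathbf{H}+\mathbf{E})^{-1}\right)
      = \sum_{i=1}^{p} \frac{\lambda_i}{1+\lambda_i} \\
\text{Hotelling-Lawley trace:} \quad
  & U = \operatorname{trace}\!\left(\mathbf{E}^{-1}\mathbf{H}\right)
      = \sum_{i=1}^{p} \lambda_i \\
\text{Roy's greatest root:} \quad
  & \Theta = \lambda_1
\end{align*}
```

Small values of Wilks' lambda and large values of the other three statistics indicate evidence against the null hypothesis.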
E=effect
specifies the error effect. If you omit the E= specification, the ANOVA procedure uses the error SSCP (residual) matrix from the analysis.
M=equation,...,equation | (row-of-matrix,...,row-of-matrix)
specifies a transformation matrix for the dependent variables listed in the MODEL statement. The equations in the M= specification are of the form
   c1*dependent-variable ± c2*dependent-variable ... ± cn*dependent-variable
where the ci values are coefficients for the various dependent variables. If the value of a given ci is 1, it can be omitted; in other words, 1*Y is the same as Y. Equations should involve two or more dependent variables. For sample syntax, see the section Examples.
Alternatively, you can input the transformation matrix directly by entering the elements of the matrix with commas separating the rows and parentheses surrounding the matrix. When this alternate form of input is used, the number of elements in each row must equal the number of dependent variables. Although these combinations actually represent the columns of the M matrix, they are displayed by rows.
When you include an M= specification, the analysis requested in the MANOVA statement is carried out for the variables defined by the equations in the specification, not the original dependent variables. If you omit the M= option, the analysis is performed for the original dependent variables in the MODEL statement.
If an M= specification is included without either the MNAMES= or the PREFIX= option, the variables are labeled MVAR1, MVAR2, and so forth by default.
For further information, see the section Multivariate Analysis of Variance in Chapter 41, The GLM Procedure.
MNAMES=names
provides names for the variables defined by the equations in the M= specification. Names in the list correspond to the M= equations or the rows of the M matrix (as it is entered).
PREFIX=name
is an alternative means of identifying the transformed variables defined by the M= specification. For example, if you specify PREFIX=DIFF, the transformed variables are labeled DIFF1, DIFF2, and so forth.
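As a minimal sketch of the two naming options, the following step applies the same M= transformation twice, once with MNAMES= and once with PREFIX=. The data set work.demo, the variables group and y1-y3, and the names D12 and D23 are hypothetical and are introduced only for illustration.

```sas
/* Hypothetical balanced data: 2 groups, 3 observations each */
data work.demo;
   input group $ y1 y2 y3;
   datalines;
a 10 12 15
a 11 14 16
a  9 13 17
b 14 15 19
b 13 17 18
b 15 16 21
;
run;

proc anova data=work.demo;
   class group;
   model y1-y3 = group / nouni;
   /* explicit names: the transformed variables are labeled D12 and D23 */
   manova h=group m=y1-y2, y2-y3 mnames=d12 d23;
   /* common prefix: the transformed variables are labeled DIFF1 and DIFF2 */
   manova h=group m=y1-y2, y2-y3 prefix=diff;
run;
quit;
```

If neither option were given, the same two variables would be labeled MVAR1 and MVAR2.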
You can specify the following options in the MANOVA statement after a slash as detail-options:
CANONICAL
produces a canonical analysis of the H and E matrices (transformed by the M matrix, if specified) instead of the default display of characteristic roots and vectors.
MSTAT=FAPPROX | EXACT
specifies the method of evaluating the multivariate test statistics. The default is MSTAT=FAPPROX, which specifies that the multivariate tests are evaluated by using the usual approximations based on the F distribution, as discussed in the "Multivariate Tests" section in Chapter 4, Introduction to Regression Procedures. Alternatively, you can specify MSTAT=EXACT to compute exact p-values for three of the four tests (Wilks’ lambda, the Hotelling-Lawley trace, and Roy’s greatest root) and an improved F-approximation for the fourth (Pillai’s trace). While MSTAT=EXACT provides better control of the significance probability for the tests, especially for Roy’s greatest root, computations for the exact p-values can be appreciably more demanding, and are in fact infeasible for large problems (many dependent variables). Thus, although MSTAT=EXACT is more accurate for most data, it is not the default method. For more information about the results of MSTAT=EXACT, see the section Multivariate Analysis of Variance in Chapter 41, The GLM Procedure.
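The following sketch requests the same multivariate test under both settings so that the two sets of p-values can be compared side by side. The data set work.demo3 and its variables are hypothetical, and the problem is deliberately small (two dependent variables) so that the exact computation stays cheap.

```sas
/* Hypothetical balanced data: 3 groups, 3 observations each */
data work.demo3;
   input group $ y1 y2;
   datalines;
a 10 12
a 11 14
a  9 13
b 14 15
b 13 17
b 15 16
c 12 20
c 11 18
c 13 19
;
run;

proc anova data=work.demo3;
   class group;
   model y1 y2 = group / nouni;
   /* default evaluation: F approximations (MSTAT=FAPPROX) */
   manova h=group;
   /* exact p-values for Wilks' lambda, the Hotelling-Lawley trace, and
      Roy's greatest root; improved F approximation for Pillai's trace */
   manova h=group / mstat=exact;
run;
quit;
```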
ORTH
requests that the transformation matrix in the M= specification of the MANOVA statement be orthonormalized by rows before the analysis.
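To illustrate what orthonormalizing by rows means, the following worked example applies Gram-Schmidt to the two rows of a successive-difference matrix for three dependent variables. This is only a sketch: the text above does not say which algorithm the procedure uses, so Gram-Schmidt is an assumption, and the vectors a_1, a_2, u_1, u_2 are notation introduced here.

```latex
\begin{align*}
\mathbf{a}_1 &= (1,\,-1,\,0), \qquad \mathbf{a}_2 = (0,\,1,\,-1) \\
\mathbf{u}_1 &= \frac{\mathbf{a}_1}{\lVert \mathbf{a}_1 \rVert}
              = \tfrac{1}{\sqrt{2}}\,(1,\,-1,\,0) \\
\mathbf{v}_2 &= \mathbf{a}_2 - (\mathbf{a}_2 \cdot \mathbf{u}_1)\,\mathbf{u}_1
              = (0,\,1,\,-1) + \tfrac{1}{2}\,(1,\,-1,\,0)
              = \left(\tfrac{1}{2},\,\tfrac{1}{2},\,-1\right) \\
\mathbf{u}_2 &= \frac{\mathbf{v}_2}{\lVert \mathbf{v}_2 \rVert}
              = \tfrac{1}{\sqrt{6}}\,(1,\,1,\,-2)
\end{align*}
```

The resulting rows have unit length and are mutually orthogonal, and they span the same space as the original rows.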
PRINTE
displays the error SSCP matrix E. If the E matrix is the error SSCP (residual) matrix from the analysis, the partial correlations of the dependent variables given the independent variables are also produced.
For example, the statement
manova / printe;
displays the error SSCP matrix and the partial correlation matrix computed from the error SSCP matrix.
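The partial correlations are obtained by rescaling the error SSCP matrix in the usual way. In the following sketch, e_{ij} denotes the (i, j) element of E and r_{ij} the partial correlation between dependent variables i and j; both symbols are introduced here for illustration.

```latex
\[
r_{ij} \;=\; \frac{e_{ij}}{\sqrt{e_{ii}\, e_{jj}}}
\]
```

Because the error degrees of freedom would cancel in this ratio, the same values result whether the SSCP matrix or the corresponding error covariance matrix is rescaled.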
PRINTH
displays the hypothesis SSCP matrix H associated with each effect specified by the H= specification.
SUMMARY
produces analysis-of-variance tables for each dependent variable. When no M matrix is specified, a table is produced for each original dependent variable from the MODEL statement; with an M matrix other than the identity, a table is produced for each transformed variable defined by the M matrix.
The following statements give several examples of using a MANOVA statement.
proc anova;
   class A B;
   model Y1-Y5=A B(A);
   manova h=A e=B(A) / printh printe;
   manova h=B(A) / printe;
   manova h=A e=B(A) m=Y1-Y2,Y2-Y3,Y3-Y4,Y4-Y5
          prefix=diff;
   manova h=A e=B(A) m=(1 -1  0  0  0,
                        0  1 -1  0  0,
                        0  0  1 -1  0,
                        0  0  0  1 -1) prefix=diff;
run;
The first MANOVA statement specifies A as the hypothesis effect and B(A) as the error effect. As a result of the PRINTH option, the procedure displays the hypothesis SSCP matrix associated with the A effect; and, as a result of the PRINTE option, the procedure displays the error SSCP matrix associated with the B(A) effect.
The second MANOVA statement specifies B(A) as the hypothesis effect. Since no error effect is specified, PROC ANOVA uses the error SSCP matrix from the analysis as the E matrix. The PRINTE option displays this E matrix. Since the E matrix is the error SSCP matrix from the analysis, the partial correlation matrix computed from this matrix is also produced.
The third MANOVA statement requests the same analysis as the first MANOVA statement, but the analysis is carried out for variables transformed to be successive differences between the original dependent variables. The PREFIX=DIFF specification labels the transformed variables as DIFF1, DIFF2, DIFF3, and DIFF4.
Finally, the fourth MANOVA statement has the same effect as the third, but it uses an alternative form of the M= specification. Instead of specifying a set of equations, the fourth MANOVA statement specifies rows of a matrix of coefficients for the five dependent variables.
As a second example of the use of the M= specification, consider the following:
proc anova;
   class group;
   model dose1-dose4=group / nouni;
   manova h = group
          m = -3*dose1 -   dose2 +   dose3 + 3*dose4,
                 dose1 -   dose2 -   dose3 +   dose4,
                -dose1 + 3*dose2 - 3*dose3 +   dose4
          mnames = Linear Quadratic Cubic
          / printe;
run;
The M= specification gives a transformation of the dependent variables dose1 through dose4 into orthogonal polynomial components, and the MNAMES= option labels the transformed variables as LINEAR, QUADRATIC, and CUBIC, respectively. Since the PRINTE option is specified and the default residual matrix is used as an error term, the partial correlation matrix of the orthogonal polynomial components is also produced.
For further information, see the section Multivariate Analysis of Variance in Chapter 41, The GLM Procedure.