Type III and IV SS and Estimable Functions
When an effect is contained in another effect, the Type II hypotheses for that effect are dependent on the cell frequencies. The philosophy behind both the Type III and Type IV hypotheses is that the hypotheses tested for any given effect should be the same for all designs with the same general form of estimable functions.
To demonstrate this concept, recall the hypotheses being tested by the Type II SS in the balanced 2×2 factorial shown in Table 15.6. Those hypotheses are precisely the ones that the Type III and Type IV hypotheses employ for all 2×2 factorials that have at least one observation per cell. The Type III and Type IV hypotheses for a design without missing cells usually differ from the hypotheses employed for the same design with missing cells, because the general form of estimable functions usually differs.
Many SAS/STAT procedures can perform tests of Type III hypotheses, but only PROC GLM offers Type IV tests as well.
Type III hypotheses are constructed by working directly with the general form of estimable functions. The following steps are used to construct a hypothesis for an effect F1:
1. For every effect in the model except F1 and those effects that contain F1, equate the coefficients in the general form of estimable functions to zero.
   If F1 is not contained in any other effect, this step defines the Type III hypothesis (as well as the Type II and Type IV hypotheses). If F1 is contained in other effects, go on to step 2. (See the section Type II SS and Estimable Functions for a definition of when effect F1 is contained in another effect F2.)
2. If necessary, equate new symbols to compound expressions in the F1 block in order to obtain the simplest form for the F1 coefficients.
3. Equate all symbolic coefficients outside the F1 block to a linear function of the symbols in the F1 block in order to make the F1 hypothesis orthogonal to hypotheses associated with effects that contain F1.
By once again observing the Type II hypotheses being tested in the balanced 2×2 factorial, it is possible to verify that the A and A*B hypotheses are orthogonal and also that the B and A*B hypotheses are orthogonal. This principle of orthogonality between an effect and any effect that contains it holds for all balanced designs. Thus, construction of Type III hypotheses for any design is a logical extension of a process that is used for balanced designs.
The Type III hypotheses are precisely the hypotheses being tested by programs that reparameterize using the usual assumptions (for example, constraining all parameters for an effect to sum to zero). When no missing cells exist in a factorial model, Type III SS coincide with Yates’ weighted squares-of-means technique. When cells are missing in factorial models, the Type III SS coincide with those discussed in Harvey (1960) and Henderson (1953).
The following discussion illustrates the construction of Type III estimable functions for a 2×2 factorial with no missing cells.
To obtain the A*B interaction hypothesis, start with the general form and equate the coefficients for effects μ, A, and B to zero, as shown in Table 15.8.
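Table 15.8: Type III Hypothesis for A*B Interaction

| Effect | General Form | L1 = L2 = L4 = 0 |
|---|---|---|
| μ | L1 | 0 |
| A1 | L2 | 0 |
| A2 | L1 − L2 | 0 |
| B1 | L4 | 0 |
| B2 | L1 − L4 | 0 |
| AB11 | L6 | L6 |
| AB12 | L2 − L6 | −L6 |
| AB21 | L4 − L6 | −L6 |
| AB22 | L1 − L2 − L4 + L6 | L6 |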
The last column in Table 15.8 represents the form of the MRH for A*B.
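The construction above can be sketched in a few lines of Python. This is an illustration only, not a procedure taken from SAS: the coefficient names L1, L2, L4, and L6, the function names, and the ordering of the parameters are assumptions of the sketch. It relies on the fact that, when every cell has at least one observation, any estimable function is a linear combination of the four cell-mean rows, which is why the general form has four free coefficients.

```python
def general_form(l1, l2, l4, l6):
    """General form for a 2x2 two-way model with interaction.

    Coefficients are returned in the order
    mu, A1, A2, B1, B2, AB11, AB12, AB21, AB22.
    """
    return [
        l1,
        l2, l1 - l2,
        l4, l1 - l4,
        l6, l2 - l6, l4 - l6, l1 - l2 - l4 + l6,
    ]


def from_cells(c11, c12, c21, c22):
    """Same coefficient vector, built as a weighted sum of the cell-mean rows.

    The cell (i, j) contributes its weight to mu, Ai, Bj, and ABij.
    """
    return [
        c11 + c12 + c21 + c22,   # mu
        c11 + c12, c21 + c22,    # A1, A2
        c11 + c21, c12 + c22,    # B1, B2
        c11, c12, c21, c22,      # AB11, AB12, AB21, AB22
    ]


# Step 1 for A*B: zero the coefficients of mu, A, and B (L1 = L2 = L4 = 0),
# leaving L6 as the only free coefficient (set to 1 here).
ab_hypothesis = general_form(0, 0, 0, 1)
print(ab_hypothesis)
```

With L6 set to 1, the A*B block carries the pattern 1, −1, −1, 1, which is the usual interaction contrast among the four cell means.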
To obtain the Type III hypothesis for A, first start with the general form and equate the coefficients for effects μ and B to zero (let L1 = L4 = 0). Next let L6 = K*L2, and find the value of K that makes the A hypothesis orthogonal to the A*B hypothesis. In this case, K = 0.5. Each of these steps is shown in Table 15.9.
In Table 15.9, the fourth column (under L6 = K*L2) represents the form of all estimable functions not involving μ, B1, or B2. The main difference between the Type II and Type III hypotheses for A is the way K is determined. Type II chooses K as a function of the cell frequencies, whereas Type III chooses K such that the estimable functions for A are orthogonal to the estimable functions for A*B.
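Table 15.9: Type III Hypothesis for A

| Effect | General Form | L1 = L4 = 0 | L6 = K*L2 | K = 0.5 |
|---|---|---|---|---|
| μ | L1 | 0 | 0 | 0 |
| A1 | L2 | L2 | L2 | L2 |
| A2 | L1 − L2 | −L2 | −L2 | −L2 |
| B1 | L4 | 0 | 0 | 0 |
| B2 | L1 − L4 | 0 | 0 | 0 |
| AB11 | L6 | L6 | K*L2 | 0.5*L2 |
| AB12 | L2 − L6 | L2 − L6 | (1 − K)*L2 | 0.5*L2 |
| AB21 | L4 − L6 | −L6 | −K*L2 | −0.5*L2 |
| AB22 | L1 − L2 − L4 + L6 | −L2 + L6 | (K − 1)*L2 | −0.5*L2 |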
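The choice of K can be checked numerically. The following sketch assumes that "orthogonal" means a zero dot product between the coefficient vectors of the A function and the A*B function; the names `a_function`, `AB_CONTRAST`, and `k_type3` are illustrative and do not come from SAS. Because the dot product is an affine function of K, two evaluations are enough to locate its root.

```python
from fractions import Fraction


def a_function(k):
    """Estimable function for A in a 2x2 factorial with L2 = 1 and L6 = k*L2.

    Order: mu, A1, A2, B1, B2, AB11, AB12, AB21, AB22.
    """
    return [0, 1, -1, 0, 0, k, 1 - k, -k, k - 1]


# A*B function with its single free coefficient set to 1.
AB_CONTRAST = [0, 0, 0, 0, 0, 1, -1, -1, 1]


def dot(u, v):
    """Plain dot product of two coefficient vectors."""
    return sum(x * y for x, y in zip(u, v))


# The dot product is affine in k, so its value at k = 0 and k = 1
# determines the unique k at which it vanishes.
f0 = dot(a_function(Fraction(0)), AB_CONTRAST)
f1 = dot(a_function(Fraction(1)), AB_CONTRAST)
k_type3 = -f0 / (f1 - f0)
print(k_type3)
```

Under these assumptions the dot product works out to 4K − 2, so the root is K = 1/2 regardless of the cell frequencies, which is the property that distinguishes the Type III choice from the Type II choice.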
An example of Type III estimable functions in a 3×3 factorial with unequal cell frequencies and missing diagonals is given in Table 15.10 (N1 through N6 represent the nonzero cell frequencies).
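Table 15.10: 3×3 Factorial Design with Unequal Cell Frequencies and Missing Diagonals

|   | B = 1 | B = 2 | B = 3 |
|---|---|---|---|
| A = 1 |  | N1 | N2 |
| A = 2 | N3 |  | N4 |
| A = 3 | N5 | N6 |  |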
For any nonzero values of N1 through N6, the Type III estimable functions for each effect are shown in Table 15.11.
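Table 15.11: Type III Estimable Functions for the 3×3 Factorial with Missing Diagonals

| Effect | A | B | A*B |
|---|---|---|---|
| μ | 0 | 0 | 0 |
| A1 | L2 | 0 | 0 |
| A2 | L3 | 0 | 0 |
| A3 | −L2 − L3 | 0 | 0 |
| B1 | 0 | L5 | 0 |
| B2 | 0 | L6 | 0 |
| B3 | 0 | −L5 − L6 | 0 |
| AB12 | 0.667*L2 + 0.333*L3 | 0.333*L5 + 0.667*L6 | L8 |
| AB13 | 0.333*L2 − 0.333*L3 | −0.333*L5 − 0.667*L6 | −L8 |
| AB21 | 0.333*L2 + 0.667*L3 | 0.667*L5 + 0.333*L6 | −L8 |
| AB23 | −0.333*L2 + 0.333*L3 | −0.667*L5 − 0.333*L6 | L8 |
| AB31 | −0.333*L2 − 0.667*L3 | 0.333*L5 − 0.333*L6 | L8 |
| AB32 | −0.667*L2 − 0.333*L3 | −0.333*L5 + 0.333*L6 | −L8 |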
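The Type III functions for the missing-diagonal design can be checked against the two properties that define them. The sketch below is an illustration under stated assumptions: the free coefficients of A are called L2 and L3, the coefficient values are written as exact thirds (2/3 and 1/3 in place of rounded decimals), and orthogonality is taken to mean a zero dot product. It also relies on the condition that, once the μ and B coefficients are zero, the A*B coefficients of an estimable function must sum within each level of A to that level's coefficient and must sum to zero within each level of B. All function and variable names are hypothetical.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)


def a_block(l2, l3):
    """A*B coefficients of the Type III function for A, keyed by (A level, B level).

    Only the six observed (off-diagonal) cells appear.
    """
    return {
        (1, 2): 2 * THIRD * l2 + THIRD * l3,
        (1, 3): THIRD * l2 - THIRD * l3,
        (2, 1): THIRD * l2 + 2 * THIRD * l3,
        (2, 3): -THIRD * l2 + THIRD * l3,
        (3, 1): -THIRD * l2 - 2 * THIRD * l3,
        (3, 2): -2 * THIRD * l2 - THIRD * l3,
    }


# A*B interaction function with its single free coefficient set to 1.
AB_CONTRAST = {(1, 2): 1, (1, 3): -1, (2, 1): -1, (2, 3): 1, (3, 1): 1, (3, 2): -1}


def check(l2, l3):
    """Return (estimable, orthogonal) for the A function built from l2 and l3."""
    block = a_block(l2, l3)
    a_coefs = {1: l2, 2: l3, 3: -l2 - l3}
    # Sums over B within each A level must reproduce the A coefficients.
    rows_match = all(
        sum(v for (i, _), v in block.items() if i == level) == a_coefs[level]
        for level in (1, 2, 3)
    )
    # Sums over A within each B level must vanish, because B is zeroed out.
    cols_vanish = all(
        sum(v for (_, j), v in block.items() if j == level) == 0
        for level in (1, 2, 3)
    )
    orthogonal = sum(block[c] * AB_CONTRAST[c] for c in block) == 0
    return rows_match and cols_vanish, orthogonal


print(check(1, 0), check(0, 1))
```

Because the cell frequencies never enter the calculation, the same coefficients apply for any nonzero frequencies in the six observed cells.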
By once again looking at the Type II hypotheses being tested in the balanced 2×2 factorial (see Table 15.6), you can see another characteristic of the hypotheses employed for balanced designs: the coefficients of lower-order effects are averaged across each higher-level effect involving the same subscripts. For example, in the A hypothesis, the coefficients of AB11 and AB12 are equal to one-half the coefficient of A1, and the coefficients of AB21 and AB22 are equal to one-half the coefficient of A2. With this in mind, the basic concept used to construct Type IV hypotheses is that the coefficients of any effect, say F1, are distributed equitably across higher-level effects that contain F1. When missing cells occur, this same general philosophy is adhered to, but care must be taken in the way the distributive concept is applied.
Construction of Type IV hypotheses begins as does the construction of the Type III hypotheses. That is, for an effect F1, equate to zero all coefficients in the general form that do not belong to F1 or to any other effect containing F1. If F1 is not contained in any other effect, then the Type IV hypothesis (and Type II and III) has been found. If F1 is contained in other effects, then simplify, if necessary, the coefficients associated with F1 so that they are all free coefficients or functions of other free coefficients in the F1 block.
To illustrate the method of resolving the free coefficients outside the F1 block, suppose that you are interested in the estimable functions for an effect A and that A is contained in A*B, A*C, and A*B*C. (In other words, the main effects in the model are A, B, and C.)
With missing cells, the coefficients of intermediate effects (here they are A*B and A*C) do not always have an equal distribution of the lower-order coefficients, so the coefficients of the highest-order effects are determined first (here it is A*B*C). Once the highest-order coefficients are determined, the coefficients of intermediate effects are automatically determined.
The following process is performed for each free coefficient of A in turn. The resulting symbolic vectors are then added together to give the Type IV estimable functions for A.
1. Select a free coefficient of A, and set all other free coefficients of A to zero.
2. If any of the levels of A have zero as a coefficient, equate all of the coefficients of higher-level effects involving that level of A to zero. This step alone usually resolves most of the free coefficients remaining.
3. Check to see if any higher-level coefficients are now zero when the coefficient of the associated level of A is not zero. If this situation occurs, the Type IV estimable functions for A are not unique.
4. For each level of A in turn, if the coefficient for that level is nonzero, count the number of times that level occurs in the higher-level effect. Then equate each of the higher-level coefficients to the coefficient of that level of A divided by the count.
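The distribution rule in the last step can be sketched for the simplest case, a two-factor model in which A*B is the only effect that contains A. This is a simplified illustration, not the algorithm that PROC GLM implements: it covers only the zeroing and equal-division steps, it omits the uniqueness check, and the function name, the argument layout, and the example cell pattern are all hypothetical.

```python
from fractions import Fraction


def distribute(level_coefs, cells):
    """Spread each level coefficient of A equally over that level's observed cells.

    level_coefs: dict mapping each level of A to its coefficient
    cells: iterable of (a_level, b_level) pairs that contain observations
    Returns a dict mapping each observed cell to its A*B coefficient.
    """
    cells = list(cells)
    result = {}
    for a_level, b_level in cells:
        coef = level_coefs[a_level]
        if coef == 0:
            # A level with a zero coefficient forces zeros in every cell of that level.
            result[(a_level, b_level)] = Fraction(0)
        else:
            # Count how often this level occurs among the observed cells,
            # then divide the level coefficient by that count.
            count = sum(1 for a2, _ in cells if a2 == a_level)
            result[(a_level, b_level)] = Fraction(coef) / count
    return result


# Hypothetical 3x3 layout in which only five cells are observed.
observed = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)]
# One free coefficient set to 1: A1 gets -1, A2 gets 1, A3 gets 0.
ab_coefs = distribute({1: -1, 2: 1, 3: 0}, observed)
print(ab_coefs)
```

In this layout levels 1 and 2 of A each occur in two observed cells, so their coefficients are halved, and the single cell in level 3 receives zero. In models with three or more factors the same division would be applied to the highest-order effect first, as the text describes.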
An example of a 3×3 factorial with four missing cells (N1 through N5 represent positive cell frequencies) is shown in Table 15.12.
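Table 15.12: 3×3 Factorial Design with Four Missing Cells

|   | B = 1 | B = 2 | B = 3 |
|---|---|---|---|
| A = 1 | N1 | N2 |  |
| A = 2 | N3 | N4 |  |
| A = 3 |  |  | N5 |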
The Type IV estimable functions are shown in Table 15.13.
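Table 15.13: Type IV Estimable Functions for the 3×3 Factorial with Four Missing Cells

| Effect | A | B | A*B |
|---|---|---|---|
| μ | 0 | 0 | 0 |
| A1 | −L3 | 0 | 0 |
| A2 | L3 | 0 | 0 |
| A3 | 0 | 0 | 0 |
| B1 | 0 | L5 | 0 |
| B2 | 0 | −L5 | 0 |
| B3 | 0 | 0 | 0 |
| AB11 | −0.5*L3 | 0.5*L5 | L8 |
| AB12 | −0.5*L3 | −0.5*L5 | −L8 |
| AB21 | 0.5*L3 | 0.5*L5 | −L8 |
| AB22 | 0.5*L3 | −0.5*L5 | L8 |
| AB33 | 0 | 0 | 0 |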
For the vast majority of designs, Type III and Type IV hypotheses for a given effect are the same. Specifically, they are the same for any effect that is not contained in other effects for any design (with or without missing cells). For factorial designs with no missing cells, the Type III and Type IV hypotheses coincide for all effects. When there are missing cells, the hypotheses can differ. By using the GLM procedure, you can study the differences in the hypotheses and then decide on the appropriateness of the hypotheses for a particular model.
The Type III hypotheses for three-factor and higher completely nested designs with unequal Ns in the lowest level differ from the Type II hypotheses; however, the Type IV hypotheses do correspond to the Type II hypotheses in this case.
When missing cells occur in a design, the Type IV hypotheses might not be unique. If this occurs in PROC GLM, you are notified, and you might need to consider defining your own specific comparisons.