The CALIS Procedure

The COSAN Model

The original COSAN (covariance structure analysis) model was proposed by McDonald (1978, 1980) for analyzing general covariance structure models. PROC CALIS enables you to analyze a generalized form of the original COSAN model. The generalized COSAN model extends the original COSAN model by including additional terms in the covariance structure formula and in the associated mean structure formula.

The covariance structure formula of the generalized COSAN model is

\[ \bSigma = \mb{F}_1 \mb{P}_1 \mb{F}_1^{\prime } + \cdots + \mb{F}_ m \mb{P}_ m \mb{F}_ m^{\prime } \]

and the corresponding mean structure formula of the generalized COSAN model is

\[ \bmu = \mb{F}_1 \mb{v}_1 + \cdots + \mb{F}_ m \mb{v}_ m \]

where $\bSigma $ is a symmetric correlation or covariance matrix for the observed variables, $\bmu $ is a vector for the observed variable means, each $\mb{P}_ k$ is a symmetric matrix, each $\mb{v}_ k$ is a mean vector, and each $\mb{F}_ k$ ($k=1,\ldots ,m$) is the product of $n(k)$ matrices $\mb{F}_{k_1},\ldots ,\mb{F}_{k_{n(k)}}$; that is,

\[ \mb{F}_ k = \mb{F}_{k_{1}} \cdots \mb{F}_{k_{n(k)}}, \quad k=1,\ldots ,m \]

Each of the matrices $\mb{F}_{k_ j}$ and $\mb{P}_ k$ in the model can take one of the following forms:

\[ \mb{F}_{k_{j}} = \left\{ \begin{matrix} \mb{G}_{k_{j}} \\ \mb{G}^{-1}_{k_{j}} \\ (\mb{I} - \mb{G}_{k_{j}})^{-1} \\ \end{matrix} \quad j = 1, \ldots , n(k) \qquad \mbox{ and } \quad \right. \mb{P}_ k = \left\{ \begin{matrix} \mb{Q}_{k} \\ \mb{Q}^{-1}_ k \end{matrix} \right. \]

where $\mb{G}_{k_ j}$ and $\mb{Q}_ k$ are basic model matrices that are not expressed as functions of other matrices.

The COSAN model matrices and vectors are $\mb{G}_{k_ j}$, $\mb{Q}_ k$, and $\mb{v}_ k$ (when the mean structures are analyzed). The elements of these model matrices and vectors are either parameters (free or constrained) or fixed values. Matrix $\mb{P}_ k$ is referred to as the central covariance matrix for the kth term in the covariance structure formula.

Essentially, the COSAN modeling language enables you to define the covariance and mean structure formulas of the generalized COSAN model, the basic COSAN model matrices $\mb{G}_{k_ j}$, $\mb{Q}_ k$, and $\mb{v}_ k$, and the parameters and fixed values in the model matrices.

You can also specify a generalized COSAN model without using an explicit central covariance matrix in any term. For example, you can define the kth term in the covariance structure formula as

\[ \mb{F}_ k \mb{F}_ k^{\prime } = \mb{F}_{k_1} \ldots \mb{F}_{k_{n-1}} \mb{F}_{k_ n} \mb{F}_{k_ n}^{\prime } \mb{F}_{k_{n-1}}^{\prime } \ldots \mb{F}_{k_1}^{\prime } \]

The corresponding term for the mean structure becomes

\[ \mb{F}_{k_1} \ldots \mb{F}_{k_{n-1}} \mb{v}_ k \]

In this term of the covariance structure formula, $\mb{F}_{k_ n} \mb{F}_{k_ n}^{\prime }$ serves as an implicit central covariance matrix. Because of this, $\mb{F}_{k_ n}$ does not appear in the corresponding term of the mean structure formula.
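
As a small illustration, consider a hypothetical model (not taken from the original text) with a single term ($m=1$) that is the product of two matrices ($n=2$) and has no explicit central covariance matrix. Under the rule described above, its covariance and mean structures would be

\[ \bSigma = \mb{F}_{1_1} \mb{F}_{1_2} \mb{F}_{1_2}^{\prime } \mb{F}_{1_1}^{\prime }, \qquad \bmu = \mb{F}_{1_1} \mb{v}_1 \]

Here $\mb{F}_{1_2} \mb{F}_{1_2}^{\prime }$ plays the role of the central covariance matrix. Because it is a matrix multiplied by its own transpose, it is symmetric and positive semidefinite by construction.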

To take advantage of the modeling flexibility of the COSAN model specifications, you are required to provide the correct covariance and mean structure formulas for the analysis problem. If you are not familiar with the mathematical formulations of structural equation models, you can consider using simpler modeling languages such as PATH or LINEQS.

An Example: Specifying a Second-Order Factor Model

This example illustrates how to specify the covariance structures in the COSAN statement. Consider a second-order factor analysis model with the following formula for the covariance structures of observed variables v1-v9:

\[ \bSigma = \mb{F}_1 ( \mb{F}_2 \mb{P}_2 \mb{F}_2^{\prime } + \mb{U}_2 ) \mb{F}_1^{\prime } + \mb{U}_1 \]

where $\mb{F}_1$ is a $9 \times 3$ first-order factor matrix, $\mb{F}_2$ is a $3 \times 2$ second-order factor matrix, $\mb{P}_2$ is a $2 \times 2$ covariance matrix for the second-order factors, $\mb{U}_2$ is a $3 \times 3$ diagonal matrix for the unique variances of the first-order factors, and $\mb{U}_1$ is a $9 \times 9$ diagonal matrix for the unique variances of the observed variables.

To fit this covariance structure model, you first rewrite the covariance structure formula in the form of the generalized COSAN model as

\[ \bSigma = \mb{F}_1 \mb{F}_2 \mb{P}_2 \mb{F}_2^{\prime } \mb{F}_1^{\prime } + \mb{F}_1 \mb{U}_2 \mb{F}_1^{\prime } + \mb{U}_1 \]
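
The rewriting step above only distributes $\mb{F}_1$ and $\mb{F}_1^{\prime }$ over the sum inside the parentheses. The following Python sketch checks that identity numerically. It is not PROC CALIS code; the helper functions and all matrix entries are arbitrary assumptions chosen for illustration, with integer values so that the comparison is exact.

```python
# Numeric check (illustrative values only) that the nested second-order
# factor formula equals its expanded three-term COSAN form.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def add(a, b):
    """Add two matrices of the same shape element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def diag(values):
    """Build a diagonal matrix from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical matrices with the dimensions stated in the text.
F1 = [[(i % 3 + 1) if i // 3 == j else 0 for j in range(3)] for i in range(9)]  # 9 x 3
F2 = [[1, 0], [2, 1], [0, 3]]                                                  # 3 x 2
P2 = [[2, 1], [1, 3]]                                                          # 2 x 2, symmetric
U2 = diag([1, 2, 3])                                                           # 3 x 3, diagonal
U1 = diag(list(range(1, 10)))                                                  # 9 x 9, diagonal

# Nested form: F1 (F2 P2 F2' + U2) F1' + U1
inner = add(matmul(matmul(F2, P2), transpose(F2)), U2)
nested = add(matmul(matmul(F1, inner), transpose(F1)), U1)

# Expanded form: F1 F2 P2 F2' F1' + F1 U2 F1' + U1
F1F2 = matmul(F1, F2)
term1 = matmul(matmul(F1F2, P2), transpose(F1F2))
term2 = matmul(matmul(F1, U2), transpose(F1))
expanded = add(add(term1, term2), U1)

print(nested == expanded)  # True
```

Each of the three summands in the expanded form has the symmetric-product shape that a single COSAN term requires.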

You can specify the list of observed variables and the three terms for the covariance structure formula in the following COSAN statement:

cosan var= v1-v9,
      F1(3) * F2(2) * P2(2,SYM) + F1(3) * U2(3,DIA) + U1(9,DIA);

The VAR= option specifies the nine observed variables in the model. Next, the three terms of the covariance structure formula are specified. Because each term in the covariance structure formula is a symmetric product, you only need to specify each term up to the central covariance matrix. For example, although the first term in the covariance structure formula is $\mb{F}_1 \mb{F}_2 \mb{P}_2 \mb{F}_2^{\prime } \mb{F}_1^{\prime }$, you only need to specify F1(3) * F2(2) * P2(2,SYM). PROC CALIS generates the redundant information for the term. Similarly, you specify the other two terms of the covariance structure formula.

In each matrix specification of the COSAN statement, you can specify the following three matrix properties as the arguments in the trailing parentheses: the number of columns, the matrix type, and the transformation of the matrix. For example, F1(3) means that the number of columns of F1 is 3 (while the number of rows is 9 because this number has to match the number of observed variables specified in the VAR= option), and F2(2) means that the number of columns of F2 is 2 (while the number of rows is 3 because this number has to match the number of columns of the preceding matrix, F1). You can specify the type of the matrix in the second argument. For example, P2(2,SYM) means that P2 is a symmetric (SYM) matrix and U2(3,DIA) means that U2 is a diagonal (DIA) matrix. You can also specify the transformation of the matrix in the third argument. Because no transformation is needed in the current second-order factor model, this argument is omitted in the specification. See the COSAN statement for details about the matrix types and transformations that are supported by the COSAN modeling language.
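
The row-matching rule described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not how PROC CALIS is implemented; the function name `infer_dims` is a hypothetical helper introduced here.

```python
def infer_dims(n_observed, column_counts):
    """Return (rows, cols) for each matrix in one term, from left to right.

    n_observed    -- number of variables in the VAR= list
    column_counts -- first argument of each matrix specification in the term
    """
    dims = []
    rows = n_observed          # rows of the first matrix match the VAR= list
    for cols in column_counts:
        dims.append((rows, cols))
        rows = cols            # rows of the next matrix match these columns
    return dims

term1 = infer_dims(9, [3, 2, 2])  # F1(3) * F2(2) * P2(2,SYM)
term2 = infer_dims(9, [3, 3])     # F1(3) * U2(3,DIA)
term3 = infer_dims(9, [9])        # U1(9,DIA)

print(term1)  # [(9, 3), (3, 2), (2, 2)]
print(term2)  # [(9, 3), (3, 3)]
print(term3)  # [(9, 9)]
```

The sketch shows why only the number of columns needs to be given: the number of rows of every matrix is already determined by what precedes it.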

Suppose now you also want to analyze the mean structures of the second-order factor model. The corresponding mean structure formula is

\[ \bmu = \mb{F}_1 \mb{F}_2 \mb{v} + \mb{u} \]

where $\mb{v}$ is a $2 \times 1$ mean vector for the second-order factors and $\mb{u}$ is a $9 \times 1$ vector for the intercepts of the observed variables. To analyze the mean and covariance structures simultaneously, you can use the following COSAN statement:

cosan var= v1-v9,
      F1(3) * F2(2) * P2(2,SYM) [mean = v] + F1(3) * U2(3,DIA)
      + U1(9,DIA) [mean = u];

In addition to the covariance structure specified, you now add the trailing MEAN= options in the first and the third terms. PROC CALIS then generates the mean structure formula by the following steps:

  • Remove the last matrix (that is, the central covariance matrix) in each term of the covariance structure formula.

  • Append to each term the vector that is specified in the MEAN= option of the term, or if no MEAN= option is specified in a term, that term becomes a zero vector in the mean structure formula.

Following these steps, the mean structure formula generated for the second-order factor model is

\[ \bmu = \mb{F}_1 \mb{F}_2 \mb{v} + \mb{0} + \mb{u} \]

which is what you expect for the mean structures of the second-order factor model. To complete the COSAN model specification, you can use MATRIX statements to specify the parameters and fixed values in the COSAN model matrices. See Example 29.29 for a complete example.
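
The two generation steps listed above can be mimicked symbolically. The following Python sketch is an assumption-laden illustration: the function `mean_structure` and the list-of-names representation are hypothetical and do not reflect PROC CALIS internals.

```python
def mean_structure(terms):
    """Build the mean structure formula as a string.

    terms -- one (matrix_names, mean_vector_or_None) pair per covariance term,
             where the last matrix name is the central covariance matrix
    """
    parts = []
    for matrices, mean in terms:
        if mean is None:
            parts.append("0")        # no MEAN= option: the term is a zero vector
        else:
            leading = matrices[:-1]  # step 1: remove the central covariance matrix
            parts.append(" * ".join(leading + [mean]))  # step 2: append the mean vector
    return " + ".join(parts)

# The three terms of the second-order factor model with their MEAN= options.
terms = [
    (["F1", "F2", "P2"], "v"),
    (["F1", "U2"], None),
    (["U1"], "u"),
]

formula = mean_structure(terms)
print(formula)  # F1 * F2 * v + 0 + u
```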

Special Cases of the Generalized COSAN Model

It is illustrative to see how you can view different types of models as special cases of the generalized COSAN model. This section describes two such special cases.

The Original COSAN Model

The original COSAN (covariance structure analysis) model (McDonald 1978, 1980) specifies the following covariance structures:

\[ \bSigma = \mb{F}_1 \cdots \mb{F}_ n \mb{P} \mb{F}_ n^{\prime } \cdots \mb{F}_1^{\prime } \]

This is the generalized COSAN model with only one term in the covariance structure formula. Hence, using the COSAN statement to specify the original COSAN model is straightforward.

Reticular Action Model

The RAM (McArdle 1980; McArdle and McDonald 1984) model fits the covariance structures

\[ \bSigma _ a = (\mb{I} - \mb{A})^{-1} \mb{P} (\mb{I} - \mb{A})^{-1 \prime } \]

where $\bSigma _ a$ is the symmetric covariance matrix for all latent and observed variables in the RAM model, $\mb{A}$ is a square matrix for path coefficients, $\mb{I}$ is an identity matrix with the same dimensions as $\mb{A}$, and $\mb{P}$ is a symmetric covariance matrix. For details about the RAM model, see the section The RAM Model.

Correspondingly, the RAM model fits the mean structure formula

\[ \bmu _ a = (\mb{I} - \mb{A})^{-1} \mb{w} \]

where $\bmu _ a$ is the mean vector for all latent and observed variables in the RAM model and $\mb{w}$ is a vector for the means or intercepts of the variables.

To extract the covariance and mean structures for the observed variables, a selection matrix $\mb{G}$ is used. The selection matrix $\mb{G}$ contains zeros and ones as its elements. Each row of $\mb{G}$ has exactly one nonzero element at the position that corresponds to the location of a manifest row variable in $\bSigma _ a$ or $\bmu _ a$. The covariance structure formula for the observed variables in the RAM model becomes

\[ \bSigma = \mb{G}(\mb{I} - \mb{A})^{-1} \mb{P} (\mb{I} - \mb{A})^{-1 \prime } \mb{G}^{\prime } \]

The mean structure formula for the observed variables in the RAM model becomes

\[ \bmu = \mb{G}(\mb{I} - \mb{A})^{-1} \mb{w} \]

These formulas suggest that the RAM model is a special case of the generalized COSAN model with one term. For example, suppose that there are 10 observed variables (v1-v10) and 3 latent variables in a RAM model. The following COSAN statement represents the RAM model:

cosan var= v1-v10,
      G(13,GEN) * A(13,GEN,IMI) * P(13,SYM) [Mean = w];

In the COSAN statement, you define the 10 variables in the VAR= option. Next, you provide the formulas for the mean and covariance structures. $\mb{G}$ is a $10 \times 13$ general matrix (GEN), $\mb{A}$ is a $13 \times 13$ general matrix with the IMI transformation (that is, $(\mb{I} - \mb{A})^{-1}$), $\mb{P}$ is a $13 \times 13$ symmetric matrix (SYM), and $\mb{w}$ is a $13 \times 1$ vector. With these COSAN statement specifications, your mean and covariance structure formulas represent exactly those of the RAM model. To complete the entire model specification, your next step is to use the MATRIX statements to specify the parameters and fixed values in the model matrices $\mb{G}$, $\mb{A}$, $\mb{P}$, and $\mb{w}$.
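
To make the role of the selection matrix concrete, the following Python sketch evaluates the RAM formulas for a much smaller hypothetical model with two observed variables and one latent factor (3 variables in total instead of 13). All numeric values, the variable ordering, and the helper functions are assumptions made for illustration; the sketch is not PROC CALIS code.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def inverse(m):
    """Gauss-Jordan inverse with exact rational arithmetic (m must be nonsingular)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Variable order (assumed): y1, y2 (observed), f (latent).
A = [[0, 0, 1],   # y1 = 1*f + e1
     [0, 0, 2],   # y2 = 2*f + e2
     [0, 0, 0]]   # f is exogenous
P = [[1, 0, 0],   # Var(e1)
     [0, 1, 0],   # Var(e2)
     [0, 0, 4]]   # Var(f)
w = [0, 0, 3]     # intercepts of y1 and y2, mean of f
G = [[1, 0, 0],   # selects y1
     [0, 1, 0]]   # selects y2

# IMI transformation: (I - A)^{-1}
B = inverse([[int(i == j) - A[i][j] for j in range(3)] for i in range(3)])

Sigma_a = matmul(matmul(B, P), transpose(B))          # all variables
Sigma = matmul(matmul(G, Sigma_a), transpose(G))      # observed variables only
mu = matmul(matmul(G, B), [[x] for x in w])           # observed means

print([[int(x) for x in row] for row in Sigma])  # [[5, 8], [8, 17]]
print([int(row[0]) for row in mu])               # [3, 6]
```

In this sketch, multiplying by $\mb{G}$ and $\mb{G}^{\prime }$ simply keeps the rows and columns of $\bSigma _ a$ that belong to the observed variables.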

Similarly, it is possible to use the COSAN modeling language to represent other model types, such as those defined by the FACTOR, LINEQS, LISMOD, MSTRUCT, PATH, and RAM statements. However, this does not mean that the COSAN modeling language is recommended in all situations. When an analysis can be specified by either the COSAN or a more specific modeling language (for example, PATH), you should consider using the specific modeling language because it can exploit specific model features so that it does the following:

  • enables more supplemental analysis (effect analysis, standardized solutions, and so on), which COSAN has no general way to display

  • supports better initial estimation methods (the COSAN model can only set initial estimates to certain default or random values)

  • leads to more efficient computations due to the availability of more specific formulas and algorithms

Certainly, the COSAN modeling language is still very useful when you fit some nonstandard model structures that cannot be handled otherwise by the more specific modeling languages.

Naming Variables in the COSAN Model

Although you can define the list of observed (manifest) variables in the VAR= option of the COSAN statement, the COSAN modeling language does not support a direct specification of the latent or error variables in the model. In the COSAN statement, you define the model matrices and how they multiply together to form the covariance and mean structures. Except for the row variables of the first matrix in each term, you do not need to identify the row and column variables of the matrices. You can, however, use the VARNAMES statement to label the column variables of the matrices. The names in the VARNAMES statement follow the general naming rules of the SAS system: they should not contain special characters and cannot be longer than 32 characters. Unlike names in the LINEQS modeling language, they do not need to use specific prefixes. It is important to realize that the VARNAMES statement only labels, but does not identify, the column variables (and the row variables, by propagation). This means that, all other things being equal, changing the names in the VARNAMES statement does not change the mathematical model or the estimation of the model. For example, you can label all columns of a COSAN matrix with the same name, but this does not mean that these columns refer to the same variable in the model. See the section Naming Variables and Parameters for the general rules about naming variables and parameters.

Default Parameters in the COSAN Model

The default parameters of the COSAN model matrices depend on the types of the matrices. Each element of an IDE or ZID matrix (an identity matrix without or with an additional zero matrix, respectively) is either a fixed one or a fixed zero. You cannot override the default parameter values of these fixed matrices. For COSAN model matrices with types other than IDE or ZID, all elements are fixed zeros by default. You can override these default zeros by specifying the elements explicitly in the MATRIX statements.