The CALIS procedure uses a variety of modeling languages to fit structural equation models. This chapter provides documentation for all of them. Additionally, some sections provide introductions to the model specification, the theory behind the software, and other technical details. While some introductory material and examples are provided, this chapter is not a textbook for structural equation modeling and related topics. For didactic treatment of structural equation models with latent variables, see Bollen (1989b) and Loehlin (2004).
Reading this chapter sequentially is not a good strategy for learning about PROC CALIS. This section provides a guide or "road map" to the rest of the PROC CALIS chapter, starting with the basics and continuing through more advanced topics. Many sections assume that you already have a basic understanding of structural equation modeling.
The following table shows three different skill levels of using the CALIS procedure (basic, intermediate, and advanced) and their milestones.

| Level | Milestone | Starting Section |
|---|---|---|
| Basic | You are able to specify simple models, but might make mistakes. | |
| Intermediate | You are able to specify more sophisticated models with few syntactic and semantic mistakes. | |
| Advanced | You are able to use the advanced options provided by PROC CALIS. | |
In the next three sections, each skill level is discussed, followed by an introductory section of the reference topics that are not covered in any of the skill levels.
The section Overview: CALIS Procedure gives you an overall picture of the CALIS procedure but without the details.
The structural equation example in the section Getting Started: CALIS Procedure provides the starting point to learn the basic model specification. You learn how to represent your theory by using a path diagram and then translate the diagram into the PATH model for PROC CALIS to analyze. Because the PATH modeling language is new, this example is useful whether or not you have previous experience with PROC CALIS. The PATH model is specified in the section PATH Model. The corresponding results are shown and discussed in Example 29.17.
After you learn about the PATH modeling language and an example of its application, you can do either of the following:
You can continue to learn more modeling languages in the section Getting Started: CALIS Procedure.
You can skip to the section Syntax Overview for an overview of the PROC CALIS syntax and learn other modeling languages at a later time.
You do not need to learn all of the modeling languages in PROC CALIS. Any one of the modeling languages (LINEQS, LISMOD, PATH, or RAM) is sufficient for specifying a very wide class of structural equation models. PROC CALIS provides different kinds of modeling languages because different researchers might have previously learned different modeling languages or approaches. To get a general idea about different kinds of modeling languages, the following subsections in the Getting Started: CALIS Procedure section are useful:
LINEQS: Section LINEQS Model
RAM: Section RAM Model
LISMOD: Section LISMOD Model
FACTOR: Section A Factor Model Example
MSTRUCT: Section Direct Covariance Structures Analysis
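To make the idea of equivalent modeling languages concrete, the following minimal sketch fits the same one-predictor model twice, once with the PATH language and once with the LINEQS language. The data set `work.demo`, the variables `x` and `y`, and the parameter name `b1` are hypothetical and exist only for illustration.

```sas
/* Hypothetical data: y depends linearly on x */
data work.demo;
   call streaminit(2024);
   do i = 1 to 200;
      x = rand('normal');
      y = 0.5*x + rand('normal');
      output;
   end;
   drop i;
run;

/* PATH language: the error term of y is implicit */
proc calis data=work.demo;
   path y <=== x;
run;

/* LINEQS language: the error term E1 is written explicitly */
proc calis data=work.demo;
   lineqs y = b1 * x + E1;
run;
```

Both specifications describe the same model, so they should lead to equivalent estimates; which one to use is largely a matter of which notation you already know.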
After studying the examples in the Getting Started: CALIS Procedure section, you can strengthen your understanding of the various modeling languages by studying more examples such as those in section Examples: CALIS Procedure. Unlike the examples in the Getting Started: CALIS Procedure section, the examples in the Examples: CALIS Procedure section include the analysis results in addition to the explanations of the model specifications.
You can start with the following two sets of basic examples:
MSTRUCT model examples: The basic MSTRUCT model examples demonstrate the testing of covariance structures directly on the covariance matrices. Although the MSTRUCT model is not the most common structural equation model in applications, these MSTRUCT examples can help you understand the basic form of covariance structures and the corresponding specifications in PROC CALIS.
PATH model examples: The basic PATH model examples demonstrate how you can represent your model by path diagrams and by the PATH modeling language. These examples show the most common applications of structural equation modeling.
The following is a summary of the basic MSTRUCT model examples:
Example 29.1, Estimating Covariances and Correlations shows how you can estimate the covariances and correlations with standard error estimates for the variables in your model. The model you fit is a saturated covariance structure model.
Example 29.2, Estimating Covariances and Means Simultaneously extends Example 29.1 to include the mean structures in the model. The model you fit is a saturated mean and covariance structure model.
Example 29.3, Testing Uncorrelatedness of Variables shows a very basic covariance structure model, in which the covariance structures can be specified directly. The variables in this model are uncorrelated. You learn how to specify the covariance pattern directly.
Example 29.4, Testing Covariance Patterns extends Example 29.3 to include other covariance structures that you can specify directly.
Example 29.5, Testing Some Standard Covariance Pattern Hypotheses illustrates the use of built-in covariance patterns supported by PROC CALIS.
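The kinds of specification that these MSTRUCT examples cover can be sketched as follows. This is an illustrative outline under stated assumptions, not the code of the examples themselves: the data set `work.sales` and the variables `q1`-`q4` are simulated here and are hypothetical.

```sas
/* Hypothetical data with four variables */
data work.sales;
   call streaminit(1);
   do i = 1 to 100;
      q1 = rand('normal', 10, 2);
      q2 = 0.6*q1 + rand('normal', 4, 1);
      q3 = rand('normal', 10, 2);
      q4 = rand('normal', 10, 2);
      output;
   end;
   drop i;
run;

/* Saturated covariance structure model */
proc calis data=work.sales;
   mstruct var=q1-q4;
run;

/* Saturated mean and covariance structure model */
proc calis data=work.sales meanstr;
   mstruct var=q1-q4;
run;

/* Built-in covariance pattern: uncorrelated variables */
proc calis data=work.sales covpattern=uncorr;
run;
```

The third step relies on the COVPATTERN= option, so no model statement is given; its availability might depend on your SAS/STAT release.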
The following is a summary of the basic PATH model examples:
Example 29.6, Linear Regression Model shows how you can fit a linear regression model with the PATH modeling language of PROC CALIS. This example also introduces the path diagram representation of "causal" models. You compare results obtained from PROC CALIS and from the REG procedure, which is designed specifically for regression analysis.
Example 29.7, Multivariate Regression Models extends Example 29.6 in several different ways. You fit covariance structure models with more than one predictor, with direct and indirect effects. This example also discusses how you can choose the "best" model for your data.
Example 29.8, Measurement Error Models explores the case where the predictor in simple linear regression is measured with error. The concept of a latent true score variable is introduced. You use PROC CALIS to fit a simple measurement error model.
Example 29.9, Testing Specific Measurement Error Models extends Example 29.8 to test special measurement error models with constraints. By using PROC CALIS, you can constrain your measurement error models in many different ways. For example, you can constrain the error variances or the intercepts to test specific hypotheses.
Example 29.10, Measurement Error Models with Multiple Predictors extends Example 29.8 to include more predictors in the measurement error models. The measurement errors in the predictors can be correlated in the model.
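As a rough illustration of the regression examples listed above, the following sketch fits a two-predictor regression with the PATH language and then with the REG procedure for comparison. The data set and variable names are hypothetical and the data are simulated.

```sas
/* Hypothetical data: q4 depends on q1 and q2 */
data work.sales;
   call streaminit(36);
   do i = 1 to 150;
      q1 = rand('normal', 10, 2);
      q2 = rand('normal', 10, 2);
      q4 = 1 + 0.5*q1 + 0.3*q2 + rand('normal');
      output;
   end;
   drop i;
run;

/* PATH model: two predictors pointing to one outcome */
proc calis data=work.sales;
   path q4 <=== q1 q2;
run;

/* The same regression with PROC REG, for comparison */
proc reg data=work.sales;
   model q4 = q1 q2;
run;
quit;
```

Because PROC CALIS analyzes covariance structures by default, the PATH run does not estimate an intercept unless mean structures are requested (for example, with the MEANSTR option).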
More elaborate examples about the MSTRUCT and PATH models are listed as follows:
Example 29.17, Path Analysis: Stability of Alienation shows you how to specify a simple PATH model and interpret the basic estimation results. The results are shown in considerable detail. The output and analyses include: a model summary, an initial model specification, an initial estimation method, an optimization history and results, residual analyses, residual graphics, estimation results, squared multiple correlations, and standardized results.
Example 29.19, Fitting Direct Covariance Structures shows you how to fit your covariance structures directly on the covariance matrix by using the MSTRUCT modeling language. You also learn how to use the FITINDEX statement to create a customized model fit summary and how to save the fit summary statistics into an external file.
Example 29.21, Testing Equality of Two Covariance Matrices Using a Multiple-Group Analysis uses the MSTRUCT modeling language to illustrate a simple multiple-group analysis. You also learn how to use the ODS SELECT statement to customize your printed output.
Example 29.22, Testing Equality of Covariance and Mean Matrices between Independent Groups uses the COVPATTERN= and MEANPATTERN= options to show some tests of equality of covariance and mean matrices between independent groups. It also illustrates how you can improve your model fit by the exploratory use of the Lagrange multiplier statistics for releasing equality constraints.
Example 29.24, Testing Competing Path Models for the Career Aspiration Data illustrates how you can fit competing models by using the OUTMODEL= and INMODEL= data sets for transferring and modifying model information from one analysis to another. This example also demonstrates how you can choose the best model among several competing models for the same data.
After studying the PATH and MSTRUCT modeling languages, you are able to specify most commonly used structural equation models by using PROC CALIS. To broaden your scope of structural equation modeling, you can study some basic examples that use the FACTOR and LINEQS modeling languages. These basic examples are listed as follows:
Example 29.11, Measurement Error Models Specified As Linear Equations explores another way to specify measurement error models in PROC CALIS. The LINEQS modeling language is introduced. You learn how to specify linear equations of the measurement error model by using the LINEQS statement. Unlike the PATH modeling language, in the LINEQS modeling language, you need to specify the error terms explicitly in the model specification.
Example 29.12, Confirmatory Factor Models introduces a basic confirmatory factor model for test items. You use the FACTOR modeling language to specify the factor-variable relationships.
Example 29.13, Confirmatory Factor Models: Some Variations extends Example 29.12 to include some variants of the confirmatory factor model. With the flexibility of the FACTOR modeling language, this example shows how you fit models with parallel items, tau-equivalent items, or partially parallel items.
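A confirmatory factor specification in the FACTOR language might look like the following sketch. The factor names `verbal` and `math`, the items `x1`-`x3` and `y1`-`y3`, and the simulated data set `work.scores` are assumptions made for illustration.

```sas
/* Hypothetical item scores generated from two correlated factors */
data work.scores;
   call streaminit(7);
   do i = 1 to 300;
      fv = rand('normal');
      fm = 0.4*fv + rand('normal');
      x1 = fv + rand('normal');
      x2 = 0.8*fv + rand('normal');
      x3 = 0.9*fv + rand('normal');
      y1 = fm + rand('normal');
      y2 = 0.7*fm + rand('normal');
      y3 = 0.8*fm + rand('normal');
      output;
   end;
   keep x1-x3 y1-y3;
run;

proc calis data=work.scores;
   factor
      verbal ===> x1-x3,   /* verbal factor loads on x1-x3 */
      math   ===> y1-y3;   /* math factor loads on y1-y3   */
   pvar
      verbal = 1., math = 1.;   /* fix factor variances for identification */
run;
```

Fixing the factor variances at 1 is one common way to set the scale of the latent factors; fixing one loading per factor is an alternative.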
More advanced examples that use the PATH, LINEQS, and FACTOR modeling languages are listed as follows:
Example 29.14, Residual Diagnostics and Robust Estimation illustrates the use of several graphical residual plots to detect model outliers and leverage observations, to study the departures from the theoretical case-level residual distribution, and to examine the linearity and homoscedasticity of variance. In addition, this example illustrates the use of a robust estimation technique to downweight the outliers and to estimate the model parameters.
Example 29.15, The Full Information Maximum Likelihood Method shows how you can use the full information maximum likelihood (FIML) method to estimate your model when your data contain missing values. It illustrates the analysis of the data coverage of the sample variances, covariances, and means and the analysis of missing patterns and the mean profile. It also shows that the full information maximum likelihood method makes the maximum use of the available information from the data, as compared with the default ML (maximum likelihood) method.
Example 29.16, Comparing the ML and FIML Estimation discusses the similarities and differences between the ML and FIML estimation methods as implemented in PROC CALIS. It uses an empirical example to show how ML and FIML obtain the same estimation results when the data do not contain missing values.
Example 29.18, Simultaneous Equations with Mean Structures and Reciprocal Paths is an econometric example that shows you how to specify models using the LINEQS modeling language. This example also illustrates the specification of reciprocal effects, the simultaneous analysis of the mean and covariance structures, the setting of bounds for parameters, and the definitions of metaparameters by using the PARAMETERS statement and SAS programming statements. You also learn how to shorten your output results by using some global display options such as the PSHORT and NOSTAND options in the PROC CALIS statement.
Example 29.20, Confirmatory Factor Analysis: Cognitive Abilities uses the FACTOR modeling language to illustrate confirmatory factor analysis. In addition, you use the MODIFICATION option in the PROC CALIS statement to compute LM test indices for model modifications.
Example 29.25, Fitting a Latent Growth Curve Model is an advanced example that illustrates the use of structural equation modeling techniques for fitting latent growth curve models. You learn how to specify random intercepts and random slopes by using the LINEQS modeling language. In addition to the modeling of the covariance structures, you also learn how to specify the mean structure parameters.
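For the FIML examples in the preceding list, the essential change is the METHOD= option in the PROC CALIS statement. The following sketch assumes a hypothetical simulated data set `work.miss` in which about 20% of the `y` values are missing.

```sas
/* Hypothetical data with missing values in y */
data work.miss;
   call streaminit(11);
   do i = 1 to 300;
      x = rand('normal');
      y = 0.5*x + rand('normal');
      if rand('uniform') < 0.2 then y = .;   /* set about 20% of y to missing */
      output;
   end;
   drop i;
run;

/* Default ML estimation */
proc calis data=work.miss;
   path y <=== x;
run;

/* FIML: uses all available information, including incomplete cases */
proc calis data=work.miss method=fiml;
   path y <=== x;
run;
```

Comparing the number of observations used in the two runs is a quick way to see how each method treats the incomplete records.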
If you are familiar with the traditional Keesling-Wiley-Jöreskog measurement and structural models (Keesling 1972; Wiley 1973; Jöreskog 1973) or the RAM model (McArdle 1980), you can use the LISMOD or RAM modeling languages to specify structural equation models. The following example shows how to specify these types of models:
Example 29.23, Illustrating Various General Modeling Languages extends Example 29.17, which uses the PATH modeling language, and shows how to use the other general modeling languages: RAM, LINEQS, and LISMOD. These modeling languages enable you to specify the same path model as in Example 29.17 and get equivalent results. This example shows the connections between the general modeling languages supported in PROC CALIS. A good understanding of Example 29.17 is a prerequisite for this example.
Once you are familiar with various modeling languages, you might wonder which modeling language should be used in a given situation. The section Which Modeling Language? provides some guidelines and suggestions.
The section Syntax: CALIS Procedure shows the syntactic structure of PROC CALIS. However, reading the Syntax: CALIS Procedure section sequentially might not be a good strategy. The statements used in PROC CALIS are classified in the section Classes of Statements in PROC CALIS. Understanding this section is a prerequisite for understanding single-group and multiple-group analyses in PROC CALIS. Syntax for single-group analyses is described in the section Single-Group Analysis Syntax, and syntax for multiple-group analyses is described in the section Multiple-Group Multiple-Model Analysis Syntax.
You might also want to get an overview of the options in the PROC CALIS statement. However, you can skip the detailed listing of the available options in the PROC CALIS statement. Most of these details serve as references, so you can consult them only when you need to. You can just read the summary tables for the available options in the PROC CALIS statement in the following subsections:
Several subsections in the section Details: CALIS Procedure can help you gain a deeper understanding of the various types of modeling languages, as shown in the following table:
| Language | Section |
|---|---|
| COSAN | |
| FACTOR | |
| LINEQS | |
| LISMOD | |
| MSTRUCT | |
| PATH | |
| RAM | |
The specification techniques you learn from the examples cover only parts of the modeling languages. These subsections give a more complete treatment of each language, including its mathematical model, model restrictions, and default parameterization. For an overall idea of the default parameterization rules used in PROC CALIS, see the section Default Analysis Type and Default Parameterization. Understanding how PROC CALIS sets default parameters helps you specify your models more efficiently and accurately.
At the intermediate level, you learn to minimize your mistakes in model specification and to establish more sophisticated modeling techniques. The following topics in the Details: CALIS Procedure section or elsewhere can help:
The section Naming Variables and Parameters summarizes the naming rules and conventions for variable and parameter names in specifying models.
The section Setting Constraints on Parameters covers various techniques of constraining parameters in model specifications.
The section Automatic Variable Selection discusses how PROC CALIS treats variables in the models and variables in the data sets. It also discusses situations where the VAR statement specification is deemed necessary.
The section Computational Problems discusses computational problems that occur quite commonly in structural equation modeling. It also discusses some possible remedies for these problems.
The section Missing Values and the Analysis of Missing Patterns describes the default treatment of missing values.
The statements REFMODEL and RENAMEPARM are useful when you need to make references to well-defined models when specifying a "new" model. See Example 29.28 for an application.
Revisit topics and examples covered at the basic level, as needed, to help you better understand the topics at the intermediate level.
You can also study the following more advanced examples:
Example 29.26, Higher-Order and Hierarchical Factor Models is an advanced example for confirmatory factor analysis. It involves the specifications of higher-order and hierarchical factor models. Because higher-order factor models cannot be specified by the FACTOR modeling language, you need to use the LINEQS model specification instead. A second-order factor model and a bifactor model are fit. Linear constraints on parameters are illustrated by using the PARAMETERS statement and SAS programming statements. Relationships between the second-order factor model and the bifactor model are numerically illustrated.
Example 29.27, Linear Relations among Factor Loadings is an advanced example of a first-order confirmatory factor analysis that uses the FACTOR modeling language. In this example, you learn how to use the PARAMETERS statement and SAS programming statements to set up dependent parameters in your model. You also learn how to specify the correlation structures for a specific confirmatory factor model.
Example 29.28, Multiple-Group Model for Purchasing Behavior is a sophisticated example of analyzing a path model. The PATH modeling language is used. In this example, a two-group analysis of mean and covariance structures is conducted. You learn how to use the REFMODEL statement to reference properly defined models and the SIMTESTS statement to test a priori simultaneous hypotheses.
Example 29.29, Fitting the RAM and EQS Models by the COSAN Modeling Language introduces the COSAN modeling language by connecting it with general RAM and EQS models. The model matrices of the RAM or EQS model are described. You specify these model matrices and the associated parameters in the COSAN modeling language.
Example 29.30, Second-Order Confirmatory Factor Analysis constructs the covariance structure model of the second-order confirmatory factor model. You define the model matrices by using the COSAN modeling language.
Example 29.31, Linear Relations among Factor Loadings: COSAN Model Specification shows how you can set linear constraints among model parameters under the COSAN model.
Example 29.32, Ordinal Relations among Factor Loadings shows how you can set ordinal constraints among model parameters under the COSAN model.
Example 29.33, Longitudinal Factor Analysis defines the covariance structures of a longitudinal factor model and shows how you can specify the covariance structure model with the COSAN modeling language.
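The multiple-group setup mentioned for Example 29.28 can be outlined with the GROUP, MODEL, and REFMODEL statements. The sketch below is a simplified illustration with two hypothetical simulated data sets, `work.g1` and `work.g2`; it is not the specification used in that example.

```sas
/* Hypothetical data for two independent groups */
data work.g1 work.g2;
   call streaminit(28);
   do i = 1 to 400;
      x = rand('normal');
      if i <= 200 then do;
         y = 0.5*x + rand('normal');
         output work.g1;
      end;
      else do;
         y = 0.8*x + rand('normal');
         output work.g2;
      end;
   end;
   drop i;
run;

proc calis;
   group 1 / data=work.g1;
   group 2 / data=work.g2;
   model 1 / group=1;
      path y <=== x;
   model 2 / group=2;
      refmodel 1;   /* Model 2 reuses the specification of Model 1 */
run;
```

Here Model 2 refers to Model 1 without modification, which is expected to fit both groups with the same set of parameters. Options of the REFMODEL statement and the RENAMEPARM statement let you relax this by giving selected parameters new names in the referencing model.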
At the advanced level, you learn to use the advanced data analysis and output control tools supported by PROC CALIS.
The following advanced data analysis topics are discussed:
Assessment of fit: The section Assessment of Fit presents the fit indices used in PROC CALIS. However, the more important topics covered in this section are about how model fit indices are organized and used, how residuals can be used to gauge the fitting of individual parts of the model, and how the coefficients of determination are defined for equations. To customize your fit summary table, you can use the options on the FITINDEX statement.
Case-level residual diagnostics: The section Case-Level Residuals, Outliers, Leverage Observations, and Residual Diagnostics describes details about how the residual diagnostics at the individual data level are accomplished in general structural equation modeling, and how they lead to the graphical techniques for detecting outliers and leverage observations, studying residual distributions, and examining linear relationships and heteroscedasticity of error variances.
Control and customization of path diagrams: The section Path Diagrams: Layout Algorithms, Default Settings, and Customization discusses the path diagram layout algorithms that the CALIS procedure uses. It also illustrates useful options that control and customize path diagrams.
Effect partitioning: The section Total, Direct, and Indirect Effects discusses the total, direct, and indirect effects and their computations. The stability coefficient of reciprocal causation is also defined. To customize the effect analysis, you can use the EFFPART statement.
Counting and adjusting degrees of freedom: The section Counting the Degrees of Freedom describes how PROC CALIS computes model fit degrees of freedom and how you can use some options on the PROC CALIS statement to make degrees-of-freedom adjustments. To adjust the model fit degrees of freedom, you can use the DFREDUCE= and NOADJDF options in the PROC CALIS statement.
Standardized solutions: Standardization schemes used in PROC CALIS are described and discussed in the section Standardized Solutions. Standardized solutions are displayed by default. You can turn them off by using the NOSTAND option of the PROC CALIS statement.
Model modifications: In the section Modification Indices, modification indices such as Lagrange multiplier test indices and Wald statistics are defined and discussed. These indices can be used either to enhance your model fit or to make your model more precise. To limit the modification process only to those parameters of interest, you can use the LMTESTS statement to customize the sets of LM tests conducted on potential parameters.
A priori parametric function testing: You can use the TESTFUNC statement to test a priori hypotheses individually. You can use the SIMTESTS statement to test a priori hypotheses simultaneously.
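Several of the tools in the preceding list can be combined in one run. The following sketch assumes a hypothetical mediation-style model (`x` affects `m`, which affects `y`) on simulated data; the parameter names `a` and `b` and the function name `ab` are illustrative only.

```sas
/* Hypothetical data: x -> m -> y */
data work.med;
   call streaminit(5);
   do i = 1 to 300;
      x = rand('normal');
      m = 0.6*x + rand('normal');
      y = 0.5*m + rand('normal');
      output;
   end;
   drop i;
run;

proc calis data=work.med modification;   /* request modification indices */
   path
      m <=== x = a,
      y <=== m = b;
   effpart x ===> y;    /* show only the effects of x on y        */
   testfunc ab;         /* test the parametric function ab        */
   ab = a * b;          /* indirect effect as a product of paths  */
run;
```

The EFFPART statement restricts the effect analysis to the effect of substantive interest, and the TESTFUNC statement tests the function defined by the SAS programming statement that follows it.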
To be more effective in presenting your analysis results, you need to be more sophisticated in controlling your output. Some customization tools have been discussed in the previous section Advanced Data Analysis Tools and might have been mentioned in the examples included in the basic and the intermediate levels. In the following topics, these output control tools are presented in a more organized way so that you can have a systematic study scheme of these tools.
Global output control tools in PROC CALIS: You can control output displays in PROC CALIS either by the global display options or by the individual output printing options. Each global display option typically controls more than one output display, while each individual output display option controls only one output display. The global display options can both enable and suppress output displays, and they can also alter the format of the output. See the ALL, PRINT, PSHORT, PSUMMARY, and NOPRINT options for ways to control the appearance of the output. See the section Global Display Options for details about the global display options and their relationships with the individual output display options. Also see the ORDERALL, ORDERGROUPS, ORDERMODELS, ORDERSPEC, PARMNAME, PRIMAT, NOORDERSPEC, NOPARMNAME, NOSTAND, and NOSE options, which control the output formats.
Customized analysis tools in PROC CALIS: Many individual output displays in PROC CALIS can be customized via specific options or statements. If you do not use these customization tools, the default output will usually contain a large number of displays or displays with very large dimensions. These customized analysis tools are as follows:
The ON=, OFF=, and ON(ONLY)= options in the FITINDEX statement enable you to select individual or groups of model fit indices or modeling information to display. You can still save the information of all fit indices in an external file by using the OUTFIT= option.
The EFFPART statement enables you to customize the effect analysis. You display only those effects of substantive interest.
The LMTESTS statement enables you to customize the sets of LM tests of interest. You test only those potential parameters that are theoretically and substantively possible.
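As an illustration of the FITINDEX customization described above, the following sketch displays only three fit statistics while saving the full fit summary to a data set. The data set names `work.med` and `work.savefit` are hypothetical.

```sas
/* Hypothetical data: x -> m -> y */
data work.med;
   call streaminit(5);
   do i = 1 to 300;
      x = rand('normal');
      m = 0.6*x + rand('normal');
      y = 0.5*m + rand('normal');
      output;
   end;
   drop i;
run;

proc calis data=work.med;
   path
      m <=== x,
      y <=== m;
   /* Display only chi-square, df, and its p-value; save all indices */
   fitindex on(only)=[chisq df probchi] outfit=work.savefit;
run;

/* Inspect the saved fit summary */
proc print data=work.savefit;
run;
```

Keeping the printed table short while saving everything to OUTFIT= is convenient when you compare several models in later processing.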
Output selection and destinations by the ODS system: This kind of output control is not specific to PROC CALIS; it applies to all procedures that support the ODS system. The most common uses include output selection and output destination assignment. You use the ODS SELECT statement together with the ODS table names or graph names to select particular output displays. See the section ODS Table Names for these names in PROC CALIS. The default output destination of PROC CALIS is the listing destination. You can add or change the destinations by using statements such as ods html (for HTML output), ods rtf (for rich text output), and so on. For details, see Chapter 20: Using the Output Delivery System.
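The ODS statements mentioned above might be used as in the following sketch. The output file name `calis_fit.html` and the data set are hypothetical, and the table name `Fit` is assumed to be the ODS name of the fit summary table; check the section ODS Table Names for the names available in your release.

```sas
/* Hypothetical data: x -> m -> y */
data work.med;
   call streaminit(5);
   do i = 1 to 300;
      x = rand('normal');
      m = 0.6*x + rand('normal');
      y = 0.5*m + rand('normal');
      output;
   end;
   drop i;
run;

ods html file='calis_fit.html';   /* route output to an HTML file  */
ods select Fit;                   /* keep only the fit summary     */

proc calis data=work.med;
   path
      m <=== x,
      y <=== m;
run;

ods html close;                   /* close the HTML destination    */
```

The ODS SELECT statement applies to the next procedure step, so it is placed immediately before the PROC CALIS step whose output you want to filter.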
Some topics in the Details: CALIS Procedure section are intended primarily for reference; you consult them only when you encounter specific problems in PROC CALIS modeling or when you need to know fine technical details in certain special situations. Many of these reference topics in the Details: CALIS Procedure section are not required for practical applications of structural equation modeling. The following technical topics are discussed:
Measures of multivariate kurtosis and skewness: This is covered in the section Measures of Multivariate Kurtosis.
Estimation criteria and the mathematical functions for estimation: The section Estimation Criteria presents formulas for various estimation criteria. The relationships among these criteria are shown in the section Relationships among Estimation Criteria. To optimize an estimation criterion, you usually need its gradient and Hessian functions. These functions are detailed in the section Gradient, Hessian, Information Matrix, and Approximate Standard Errors, where you can also find information about the computation of the standard error estimates in PROC CALIS. Unlike other estimation methods, the robust estimation methods do not optimize a discrepancy function themselves. The robust estimation methods that are implemented in PROC CALIS use the iteratively reweighted least squares (IRLS) method to obtain parameter convergence. The robust estimation technique is detailed in the section Robust Estimation.
Initial estimation: Initial estimates are necessary for all kinds of iterative optimization techniques. They are described in the section Initial Estimates.
Use of optimization techniques: Optimization techniques are covered in the section Use of Optimization Techniques. See this section if you need to fine-tune the optimization.
Output displays and control: The output displays in PROC CALIS are listed in the section Displayed Output. General requirements for the displays are also shown. With the ODS system, each table and graph has a name, which can be used on the ODS OUTPUT or ODS SELECT statement. See the section ODS Table Names for the ODS table and graph names.
Input and output files: PROC CALIS supports several input and output data files for data, model information, weight matrices, estimates, fit indices, and estimation and descriptive statistics. The uses and the structures of these input and output data files are described in the sections Input Data Sets and Output Data Sets.