The TCALIS Procedure

Overview: TCALIS Procedure

Structural equation modeling is an important statistical tool in social and behavioral sciences. Structural equations express relationships among a system of variables that can be either observed variables (manifest variables) or unobserved hypothetical variables (latent variables). For an introduction to latent variable models, see Loehlin (2004), Bollen (1989b), Everitt (1984), or Long (1983); and for manifest variables with measurement errors, see Fuller (1987).

In structural models, as opposed to functional models, all variables are taken to be random rather than having fixed levels. For maximum likelihood (default) and generalized least squares estimation in PROC TCALIS, the random variables are assumed to have an approximately multivariate normal distribution. Nonnormality, especially high kurtosis, can produce poor estimates and grossly incorrect standard errors and hypothesis tests, even in large samples. Consequently, the assumption of normality is much more important than in models with nonstochastic exogenous variables. You should remove outliers and consider transformations of nonnormal variables before using PROC TCALIS with maximum likelihood (default) or generalized least squares estimation. If the number of observations is sufficiently large, Browne’s asymptotically distribution-free (ADF) estimation method can be used.
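As a rough illustration of screening a variable for nonnormality before fitting, the following Python sketch computes simple moment-based skewness and excess kurtosis. It is not PROC TCALIS code, the function names are hypothetical, and the plain moment estimators used here might differ from the bias-corrected statistics that the procedure reports.

```python
def central_moment(values, order):
    """Return the sample central moment of the given order (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2). Zero for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3. Near zero for normal data."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical data: a single extreme value inflates the kurtosis sharply.
with_outlier = [0.0] * 9 + [10.0]
print(excess_kurtosis(with_outlier))
```

A large positive excess kurtosis, as in the hypothetical data with one extreme value, is the kind of signal that suggests removing outliers or transforming the variable before using a normal-theory method.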

You can use the TCALIS procedure to estimate parameters and test hypotheses for constrained and unconstrained problems in various situations, including but not limited to the following:

exploratory and confirmatory factor analysis of any order
linear measurement-error models or regression with errors in variables
multiple and multivariate linear regression
multiple-group structural equation modeling with mean and covariance structures
path analysis and causal modeling
simultaneous equation models with reciprocal causation
structured covariance and mean matrices

To specify models in PROC TCALIS, you can use a variety of modeling languages:

FACTOR—supports the input of factor-variable relations
LINEQS—like the EQS program (Bentler 1995), uses equations to describe variable relationships
LISMOD—utilizes LISREL (Jöreskog and Sörbom 1985) model matrices for defining models
MSTRUCT—supports direct parameterization in the mean and covariance matrices
PATH—provides an intuitive causal path specification interface
RAM—utilizes the formulation of the reticular action model (McArdle and McDonald 1984)
REFMODEL—provides a quick way for model referencing and respecification

Various modeling languages are provided to suit researchers' different backgrounds and modeling philosophies. However, in some statistical situations one modeling language is more convenient than the others. See the section Which Modeling Language? for a discussion.

In addition to basic model specification, you can set various parameter constraints in PROC TCALIS. You can impose equality constraints simply by giving the same parameter name in different parts of the model. Boundary, linear, and nonlinear constraints are supported as well. If parameters in the model depend on additional parameters, you can define the dependence by using the PARAMETERS statement and SAS programming statements.

Before the data are analyzed, researchers might be interested in studying some statistical properties of the data. PROC TCALIS can provide the following statistical summary of the data:

covariance and mean matrices and their properties
descriptive statistics like means, standard deviations, univariate skewness, and kurtosis measures
multivariate measures of kurtosis
weight matrix and its descriptive properties

After a model is fitted and accepted by the researcher, PROC TCALIS can provide the following supplementary statistical analysis:

squared multiple correlations and determination coefficients
direct and indirect effects partitioning with standard error estimates
model modification tests such as Lagrange multiplier and Wald tests
fit summary indices
predicted moments of the model
residual analysis
factor rotations
standardized solutions with standard errors
tests of parametric functions, individually or simultaneously

When fitting a model, you need to choose an estimation method. The following estimation methods are supported in the TCALIS procedure:

diagonally weighted least squares (DWLS, with optional weight matrix input)
generalized least squares (GLS, with optional weight matrix input)
maximum likelihood (ML, for multivariate normal data); this is the default method
unweighted least squares (ULS)
weighted least squares or asymptotically distribution-free method (WLS or ADF, with optional weight matrix input)

The estimation methods implemented in PROC TCALIS do not exhaust all alternatives in the field. For example, partial least squares (PLS) is not implemented. The SAS/STAT procedure PROC PLS employs the partial least squares technique, but for a different class of models than those of PROC TCALIS. For general path analysis with latent variables, consider using PROC TCALIS. See the section Estimation Criteria for details about the estimation criteria used in PROC TCALIS.
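To make the idea of an estimation criterion concrete, the following Python sketch evaluates common textbook forms of three discrepancy functions for a sample covariance matrix S and a model-implied matrix Sigma: F_ML = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p, F_ULS = 0.5 tr[(S - Sigma)^2], and F_GLS = 0.5 tr[(S^-1 (S - Sigma))^2]. This is an illustration only, restricted to 2 x 2 matrices to stay self-contained; the helper names are hypothetical, and the exact scaling and weighting used by PROC TCALIS are those given in the section Estimation Criteria, which might differ from this sketch.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a nonsingular 2 x 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace2(m):
    return m[0][0] + m[1][1]


def diff2(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def f_ml(s, sigma):
    """Normal-theory maximum likelihood discrepancy (textbook form, p = 2)."""
    p = 2
    return (math.log(det2(sigma)) - math.log(det2(s))
            + trace2(matmul2(s, inv2(sigma))) - p)


def f_uls(s, sigma):
    """Unweighted least squares discrepancy (textbook form)."""
    d = diff2(s, sigma)
    return 0.5 * trace2(matmul2(d, d))


def f_gls(s, sigma):
    """Generalized least squares discrepancy with weight S^-1 (textbook form)."""
    w = matmul2(inv2(s), diff2(s, sigma))
    return 0.5 * trace2(matmul2(w, w))


# Hypothetical sample covariance matrix and a model that implies
# uncorrelated variables with unit variances.
S = [[2.0, 0.8], [0.8, 1.0]]
SIGMA = [[1.0, 0.0], [0.0, 1.0]]
print(f_ml(S, SIGMA), f_uls(S, SIGMA), f_gls(S, SIGMA))
```

Each function is zero when the model reproduces S exactly and positive otherwise, which is why minimizing any of them pulls the model-implied matrix toward the observed one; the methods differ in how they weight the residuals.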

All estimation methods need starting values for the parameter estimates. You can provide starting values for any parameters. For any parameter that has no starting value, PROC TCALIS determines one by using one or a combination of the following methods:

approximate factor analysis
default initial values
instrumental variable method
matching observed moments of exogenous variables
McDonald’s method (McDonald and Hartmann 1992)
ordinary least squares estimation
random number generation, if a seed is provided
two-stage least squares estimation

Although no method for obtaining initial estimates is completely foolproof, the initial estimation methods provided by PROC TCALIS behave reasonably well in most common applications.

Starting from the initial estimates, PROC TCALIS iteratively updates the parameter values until it reaches the optimum defined by the estimation criterion. This process is known as optimization. Because numerical problems can occur in any optimization process, the TCALIS procedure offers several optimization algorithms so that you can switch to an alternative when the one being used fails. The following optimization algorithms are supported in PROC TCALIS:

Levenberg-Marquardt algorithm (Moré 1978)
trust-region algorithm (Gay 1983)
Newton-Raphson algorithm with line search
ridge-stabilized Newton-Raphson algorithm
various quasi-Newton and dual quasi-Newton algorithms: Broyden-Fletcher-Goldfarb-Shanno and Davidon-Fletcher-Powell, including a sequential quadratic programming algorithm for processing nonlinear equality and inequality constraints
various conjugate gradient algorithms: automatic restart algorithm of Powell (1977), Fletcher-Reeves, Polak-Ribiere, and conjugate descent algorithm of Fletcher (1980)
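The following Python sketch illustrates, on a deliberately tiny problem, what a Newton-Raphson iteration with a step-halving line search does. It is not the procedure's implementation. The toy model assumes all p variables share one variance theta and are uncorrelated, so the parameter-dependent part of the textbook maximum likelihood discrepancy reduces to f(theta) = p ln(theta) + tr(S)/theta, whose minimum is at theta = tr(S)/p. The function name, tolerance, and fallback rule are assumptions made for illustration.

```python
import math


def fit_common_variance(trace_s, p, theta=1.0, tol=1e-10, max_iter=100):
    """Minimize f(theta) = p*log(theta) + trace_s/theta over theta > 0."""

    def f(t):
        return p * math.log(t) + trace_s / t

    for _ in range(max_iter):
        grad = p / theta - trace_s / theta ** 2
        if abs(grad) < tol:
            break  # gradient is numerically zero: converged
        hess = -p / theta ** 2 + 2.0 * trace_s / theta ** 3
        # Use the Newton step where the curvature is positive; otherwise
        # fall back to a plain steepest-descent step.
        step = -grad / hess if hess > 0 else -grad
        # Line search: halve the step until theta stays positive and the
        # objective does not increase.
        alpha = 1.0
        while (theta + alpha * step <= 0
               or f(theta + alpha * step) > f(theta)):
            alpha *= 0.5
        theta += alpha * step
    return theta


# Hypothetical input: two variables whose sample variances sum to 3.0,
# so the closed-form answer is 3.0 / 2 = 1.5.
print(fit_common_variance(3.0, 2))
```

The fallback to a descent step when the curvature is not positive hints at why several algorithms are useful: a pure Newton step can move in the wrong direction far from the optimum, and safeguards such as line searches, ridging, or trust regions address that in different ways.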

In addition to the ability to save output tables as data sets by using the ODS OUTPUT statement, PROC TCALIS supports the following types of output data sets so that you can save your analysis results for later use:

OUTEST= data sets for storing parameter estimates and their covariance estimates
OUTFIT= data sets for storing fit indices and some pertinent modeling information
OUTMODEL= data sets for storing model specifications and final estimates
OUTSTAT= data sets for storing descriptive statistics, residuals, predicted moments, and latent variable score regression coefficients
OUTWGT= data sets for storing the weight matrices used in the modeling

The OUTEST=, OUTMODEL=, and OUTWGT= data sets can be used as input data sets for subsequent analyses. That is, in addition to the input data provided by the DATA= option, PROC TCALIS supports the following input data sets for various purposes in the analysis:

INEST= data sets for providing initial parameter estimates. An INEST= data set could be an OUTEST= data set created from a previous analysis.
INMODEL= data sets for providing model specifications and initial estimates. An INMODEL= data set could be an OUTMODEL= data set created from a previous analysis.
INWGT= data sets for providing the weight matrices. An INWGT= data set could be an OUTWGT= data set created from a previous analysis.

The TCALIS procedure uses ODS Graphics to create graphs as part of its output. High-quality residual histograms are available in PROC TCALIS. See Chapter 21, Statistical Graphics Using ODS, for general information about ODS Graphics. See the section ODS Graphics and the PLOTS= option for specific information about the statistical graphics available with the TCALIS procedure.

Note: This procedure is experimental.
