PROC TCALIS: The PATH Model :: SAS/STAT(R) 9.2 User's Guide, Second Edition

The TCALIS Procedure

The PATH Model

The PATH modeling language is supported in PROC TCALIS as a more intuitive modeling tool. It is designed so that a specification in the PATH modeling language translates directly from the path diagram. For example, given the following simple path diagram:

[Path diagram: arrows from A to B and from C to B]

you can use the PATH statement to specify the paths easily:

   path    A -> B   effect1,
           C -> B   effect2;

In the first entry of the PATH statement, the path A -> B is specified. The associated path coefficient or effect is named effect1. Similarly, in the second entry, the C -> B path is specified with effect2 as the associated effect parameter. In addition to the path coefficients or effects in the path diagram, you can also specify other types of parameters by using the PVAR and PCOV statements. See the section A Structural Equation Example for a more detailed example of the PATH model specification.
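
As a minimal sketch, the PATH statement above could sit inside a complete PROC TCALIS step together with PVAR and PCOV statements. The data set name work.mydata and the parameter names varA, varC, evarB, and covAC are hypothetical and do not come from the original example; the data set is assumed to contain the variables A, B, and C.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables A, B, and C */
proc tcalis data=work.mydata;
   path
      A -> B   effect1,
      C -> B   effect2;
   pvar
      A = varA,       /* variance of the exogenous variable A */
      C = varC,       /* variance of the exogenous variable C */
      B = evarB;      /* error (partial) variance of the endogenous variable B */
   pcov
      A C = covAC;    /* covariance between the exogenous variables A and C */
run;
```

The PVAR and PCOV entries in this sketch only give names to parameters that PROC TCALIS would otherwise generate by default, so leaving both statements out should describe the same model.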

Despite its simple representation of the path diagram, the PATH modeling language is general enough to handle a wide class of structural models that can also be handled by other general modeling languages such as LINEQS, LISMOD, or RAM. For brevity, models specified by the PATH modeling language are called PATH models.

Types of Variables in the PATH Model

When you specify the paths in the PATH model, you typically use arrows (such as <- or ->) to denote "causal" paths. For example, in the preceding path diagram or PATH statement, you specify that B is an outcome variable with predictors A and C, one in each of the two paths. The outcome variable is the variable that the arrow points to in a path specification, and the predictor variable is the variable from which the arrow starts.

Whereas the outcome–predictor relationship describes the roles of variables in each single path, the endogenous–exogenous relationship describes the roles of variables in the entire system of paths. In a system of path specifications, a variable is endogenous if it is pointed to by at least one arrow, that is, if it serves as an outcome variable in at least one path. Otherwise, it is exogenous. In the preceding path diagram, for example, variable B is endogenous and both variables A and C are exogenous. Note that although any variable that serves as an outcome variable in at least one path must be endogenous, an endogenous variable need not serve only as an outcome variable. It might also serve as a predictor variable in another path. For example, variable B in the following PATH statement is an endogenous variable, and it serves as an outcome variable in the first path but as a predictor variable in the second path.

   path    A -> B   effect1,
           B -> C   effect2;

A variable is a manifest or observed variable in the PATH model if it is measured and exists in the input data set. Otherwise, it is a latent variable. Because error variables are not explicitly defined in the PATH modeling language, all latent variables that are named in the PATH model are factors, which are considered to be the systematic source of effects in the model. Any manifest variable in the PATH model can be endogenous or exogenous. The same is true for any latent factor in the PATH model.

Because you do not name error variables in the PATH model, you do not need to specify paths from errors to any endogenous variables. Error terms are implicitly assumed for all endogenous variables in the PATH model. If error variables are not named, how can one specify the error variance parameters? In the PATH model, error variances are expressed equivalently as partial variances for the associated endogenous variables. These partial variances are set by default in the PATH modeling language. Therefore, you do not need to specify error variance parameters explicitly unless constraints on these parameters are required in the model. You can use the PVAR statement to specify the error variance or partial variance parameters explicitly.
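
One case in which error variances need to be specified explicitly is an equality constraint. The following sketch assumes the usual PROC TCALIS convention that parameters sharing the same name are constrained to be equal; that convention is not stated in this section. The data set name work.mydata and the parameter name evar are hypothetical.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables A, B, and C */
proc tcalis data=work.mydata;
   path
      A -> B   effect1,
      B -> C   effect2;
   pvar
      B = evar,    /* B is endogenous, so this is its error (partial) variance */
      C = evar;    /* same name reused: assumed to constrain the two error variances to be equal */
run;
```

Without the PVAR statement, the two error variances would still be in the model, but as separate default parameters.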

Naming Variables in the PATH Model

Manifest variables in the PATH model are referenced in the input data set. Their names must not be longer than 32 characters. There are no further restrictions beyond those required by the SAS system. You use the names of manifest variables directly in the PATH model specification.

Because you do not name error variables in the PATH model, all latent variables named in the PATH model specification are factors (non-errors). Factor names in the PATH model must not be longer than 32 characters, and they should be different from the manifest variables. Unlike the LINEQS model, you do not need to use any specific prefix for the latent factor names in the PATH model.
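
The naming rules can be illustrated with a small one-factor sketch. The data set work.scores, its variables test1, test2, and test3, the factor name Ability, and the parameter names are all hypothetical. Because Ability is assumed not to exist in the data set, it would be treated as a latent factor, and it carries no special prefix.

```sas
/* Sketch only: work.scores is a hypothetical data set containing test1, test2, and test3.
   Ability is not a variable in the data set, so it is a latent factor. */
proc tcalis data=work.scores;
   path
      test1 <- Ability   load1,
      test2 <- Ability   load2,
      test3 <- Ability   load3;
   pvar
      Ability = 1.;    /* factor variance fixed at 1 to give the factor a scale */
run;
```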

As a general naming convention, you should not use Intercept as either a manifest or latent variable name.

Specification of the PATH Model

(1) Specification of Effects or Paths

You specify the "causal" paths or linear functional relationships among variables in the PATH statement. For example, if your model has a path from v2 to v1 and the effect parameter is named parm1 with a starting value of 0.5, you can use either of these specifications:

path     v1 <-  v2   parm1  (0.5);

path     v2 ->  v1   parm1  (0.5);

If you have more than one path in your model, path specifications should be separated by commas, as shown in the following PATH statement:

   path 
      v1 <-  v2   parm1  (0.5),
      v2 <-  v3   parm2  (0.3);

Because the PATH statement can be used only once in each model specification, all paths in the model must be specified together in a single PATH statement. See the PATH statement for more details about the syntax.
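
Because each entry carries its own arrow, the two arrow directions can be mixed within one PATH statement. The following sketch restates the preceding two-path example with the second path written in the other direction; it is intended to describe the same model. The data set name work.mydata is hypothetical.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables v1, v2, and v3 */
proc tcalis data=work.mydata;
   path
      v1 <-  v2   parm1  (0.5),    /* v2 is the predictor of v1 */
      v3 ->  v2   parm2  (0.3);    /* equivalent to v2 <- v3: v3 is the predictor of v2 */
run;
```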

(2) Specification of Variances and Partial (Error) Variances

If v2 is an exogenous variable in the PATH model and you want to specify its variance as a parameter named parm2 with a starting value of 10, you can use the following PVAR statement specification:

   pvar     v2  = parm2 (10.);

If v1 is an endogenous variable in the PATH model and you want to specify its partial variance or error variance as a parameter named parm3 with a starting value of 5.0, you can also use the following PVAR statement specification:

   pvar     v1 = parm3 (5.0);

Therefore, the PVAR statement can be used for both exogenous and endogenous variables. When a variable in the statement is exogenous (which can be automatically determined by PROC TCALIS), you are specifying the variance parameter of the variable. Otherwise, you are specifying the partial or error variance for an endogenous variable.

If you have more than one variance or partial variance parameter to specify in your model, you can put a variable list on the left-hand side of the equal sign and a parameter list on the right-hand side, as shown in the following PVAR statement specification:

   pvar 
      v1 v2 v3 = parm1 (0.5) parm2 parm3;

In the specification, the variance or partial variance parameters for variables v1–v3 are parm1, parm2, and parm3, respectively. Only parm1 is given an initial value, 0.5. The initial values for the other parameters are generated by PROC TCALIS.

You can also separate the specifications into several entries in the PVAR statement. Entries should be separated by commas. For example, the preceding specification is equivalent to the following specification:

   pvar 
      v1 = parm1 (0.5),
      v2 = parm2,
      v3 = parm3;

Because the PVAR statement can be used only once in each model specification, all variance and partial variance parameters in the model must be specified together in a single PVAR statement. See the PVAR statement for more details about the syntax.

(3) Specification of Covariances and Partial Covariances

If you want to specify the (partial) covariance between two variables v3 and v4 as a parameter named parm4 with a starting value of 5, you can use the following PCOV statement specification:

   pcov  v3  v4 = parm4 (5.);

Whether parm4 is a covariance or partial covariance parameter depends on the variable types of v3 and v4. If both v3 and v4 are exogenous variables (manifest or latent), parm4 is a covariance parameter between v3 and v4. If both v3 and v4 are endogenous variables (manifest or latent), parm4 is a parameter for the covariance between the errors for v3 and v4. In other words, it is a partial covariance or error covariance parameter for v3 and v4.

A less common case is when one of the variables is exogenous and the other is endogenous. In this case, parm4 is a parameter for the partial covariance between the endogenous variable and the exogenous variable, or the covariance between the error for the endogenous variable and the exogenous variable. Fortunately, such covariances are relatively uncommon in statistical modeling. Their use confuses the roles of systematic and unsystematic sources in the model and leads to difficulties in interpretation. Therefore, you should almost always avoid this kind of partial covariance.
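
The two common cases can be placed side by side in one sketch. The data set work.mydata, its variables v1 through v4, and the parameter names b31, b42, cov12, and ecov34 are hypothetical. In this model v1 and v2 are exogenous, whereas v3 and v4 are endogenous.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables v1, v2, v3, and v4 */
proc tcalis data=work.mydata;
   path
      v3 <- v1   b31,
      v4 <- v2   b42;
   pcov
      v1 v2 = cov12,     /* both exogenous: covariance between v1 and v2 */
      v3 v4 = ecov34;    /* both endogenous: covariance between the errors of v3 and v4 */
run;
```

The same PCOV syntax is used in both entries; only the roles of the variables in the PATH statement determine how each parameter is interpreted.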

As in the PVAR statement, you can specify a list of (partial) covariance parameters in the PCOV statement. For example, consider the following statement:

   pcov 
      v1 v2 = parm4,
      v1 v3 = parm5,
      v2 v3 = parm6;

In the specification, three (partial) covariance parameters parm4, parm5, and parm6 are specified, respectively, for the variable pairs (v1,v2), (v1,v3), and (v2,v3). Entries for (partial) covariance specification are separated by commas.

Because the PCOV statement can only be used once in each model specification, all covariance and partial covariance parameters in the model must be specified together in a single PCOV statement. See the PCOV statement for more details about the syntax.

(4) Specification of Means and Intercepts

Means and intercepts are specified when the mean structures of the model are of interest. You can specify mean and intercept parameters in the MEAN statement. For example, consider the following statement:

   mean     V5 = parm5  (11.);

If V5 is an exogenous variable (which is determined by PROC TCALIS automatically), you are specifying parm5 as the mean parameter of V5. If V5 is an endogenous variable, you are specifying parm5 as the intercept parameter for V5.

Because each named variable in the PATH model is either exogenous or endogenous (exclusively), each variable in the PATH model has either a mean or an intercept parameter (but not both) to specify in the MEAN statement. As in the PVAR statement, you can specify a list of mean or intercept parameters in the MEAN statement. For example, in the following statement you specify a list of mean or intercept parameters for variables v1-v4:

   mean
      v1-v4 = parm6-parm9;

This specification is equivalent to the following specification with four entries of parameter specifications:

   mean
      v1 = parm6,
      v2 = parm7,
      v3 = parm8,
      v4 = parm9;

Again, entries in the MEAN statement must be separated by commas, as shown in the preceding statement.

Because the MEAN statement can only be used once in each model specification, all mean and intercept parameters in the model must be specified together in a single MEAN statement. See the MEAN statement for more details about the syntax.
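
The following sketch shows how one MEAN statement can hold both a mean and an intercept. The data set work.mydata and the parameter names mean_v2 and int_v1 are hypothetical, and only the model statements are shown; any procedure options that a mean structure analysis might need are outside the scope of this sketch.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables v1 and v2 */
proc tcalis data=work.mydata;
   path
      v1 <- v2   parm1;
   mean
      v2 = mean_v2,    /* v2 is exogenous: mean parameter */
      v1 = int_v1;     /* v1 is endogenous: intercept parameter */
run;
```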

Specifying Parameters without Initial Values

If you do not have any knowledge about the initial value for a parameter, you can omit the initial value specification and let PROC TCALIS compute it. For example, you can just provide the parameter locations and parameter names as in the following specification:

   path    v1 <- v2   parm1;
      pvar v2 = parm2,
           v1 = parm3;

Specifying Fixed Parameter Values

If you want to specify a fixed parameter value, you do not need to provide a parameter name. Instead, you provide the fixed value (without parentheses) in the specification.

For example, in the following specification the coefficient for the path from F1 to v1 is fixed at 1.0, and the (partial) variance of F1 is also fixed at 1.0:

   path    v1 <- F1  1.;
      pvar 
           F1 = 1.;
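
The three kinds of parameter specification described so far (fixed value, free parameter with an initial value, and free parameter without an initial value) can appear together in one statement. In the following sketch, the data set work.mydata, the variables v2 and v3, the parameter names load2, load3, and varF1, and the initial value 0.8 are hypothetical choices for illustration.

```sas
/* Sketch only: work.mydata is a hypothetical data set with variables v1, v2, and v3;
   F1 is not in the data set, so it is a latent factor */
proc tcalis data=work.mydata;
   path
      v1 <- F1   1.,              /* fixed at 1: a value with no name and no parentheses */
      v2 <- F1   load2  (0.8),    /* free parameter with an initial value in parentheses */
      v3 <- F1   load3;           /* free parameter; the initial value is left to PROC TCALIS */
   pvar
      F1 = varF1;                 /* free factor variance */
run;
```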

A Complete PATH Model Specification Example

To show a more complete PATH model specification, the RAM model example in the section A Complete RAM Model Specification Example is translated into the PATH model specification in the following statements:

   path     v1 <-  v2   parm1  (.5), 
            v1 <-  v3   parm2      ; 
      pvar  v1 = errv   (1.), 
            v2 = parm3  (10.), 
            v3 = parm4  (10.); 
      pcov  v3 v2 =  parm5  (5.);

The PATH model specification is very much the same as the RAM model specification. All paths specified in the RAM model are translated into specifications in the PATH statement. All other specifications are translated into the PVAR, PCOV, and MEAN statements.

Default Parameters in the PATH Model

The treatment of restricted and default parameters in the PATH model is essentially the same as that of the RAM model (see the section Default Parameters in the RAM Model).

The PATH model does not allow the specification of the effect of any variable on itself. In other words, it is invalid to specify the following:

   path     v1  <-  v1  parm;

The coefficient for such a path is always zero, meaning that the path should not exist in any PATH model. Other than this restriction, any other parameters supported by the PATH model can be specified in the PATH, PVAR, PCOV, and MEAN statements. If a parameter location in the PATH model is not specified, a default parameter is applied to that location. There are two types of default parameters: automatic free parameters and fixed zeros.

Automatic Free Parameters

The set of automatic free parameters in the PATH model is derived essentially in the same way as in the RAM or LINEQS model. That is, automatic free parameters of the PATH model include:

the variances or partial (or error) variances of all variables, manifest or latent. Consequently, all possible variance or partial variance parameters that can be specified in the PVAR statement are automatic free parameters unless explicitly specified otherwise.
the means of exogenous manifest variables when the mean structures are modeled. That is, all mean parameters pertaining to exogenous manifest variables that can be specified in the MEAN statement are automatic free parameters unless explicitly specified otherwise.
the covariances among all exogenous manifest variables. That is, all covariance parameters pertaining to all possible pairs of exogenous manifest variables that can be specified in the PCOV statement are automatic free parameters unless explicitly specified otherwise.

The reason for automatic parameter generation is to safeguard a proper PATH model specification. See the section Rationale of the Default Parameters in the LINEQS Model for a more detailed explanation.

An automatic parameter name is generated by PROC TCALIS for each of these automatic mean, intercept, variance, partial variance, covariance, and partial covariance parameters. Each automatic parameter name is prefixed with _Add and appended with a unique integer.
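
To make the rules above concrete, the following sketch specifies only the paths and lists in a comment the automatic free parameters that those rules imply. The data set work.mydata is hypothetical, v1, v2, and v3 are assumed to be manifest variables, and the list in the comment is derived from the rules in this section rather than copied from procedure output.

```sas
/* Sketch only: work.mydata is a hypothetical data set with manifest variables v1, v2, and v3 */
proc tcalis data=work.mydata;
   path
      v1 <-  v2   parm1,
      v1 <-  v3   parm2;
   /* No PVAR or PCOV statement is given. Under the rules above, the automatic
      free parameters would be:
         - the variances of the exogenous variables v2 and v3
         - the error (partial) variance of the endogenous variable v1
         - the covariance between the exogenous manifest variables v2 and v3
      Each would receive a generated name that starts with _Add. */
run;
```

This is the same set of variance and covariance locations that the complete PATH model example names explicitly with errv, parm3, parm4, and parm5.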

Default Fixed Zeros

All unspecified parameter locations that are neither set by model-restricted values nor generated with automatic free parameters in the PATH model are fixed zeros by default.

Relating the PATH Model to the RAM Model

Mathematically, the PATH model is essentially the RAM model. You can consider the PATH model to share exactly the same set of model matrices as in the RAM model. See the section Model Matrices in the RAM Model and the section Summary of Matrices and Submatrices in the RAM Model for details about the RAM model matrices. In the RAM model, the $\mathbf{A}$ matrix contains effects or path coefficients for describing relationships among variables. In the PATH model, you specify these effect or coefficient parameters in the PATH statement. The $\mathbf{P}$ matrix in the RAM model contains (partial) variance and (partial) covariance parameters. In the PATH model, you use the PVAR and PCOV statements to specify these parameters. The $\mathbf{W}$ vector in the RAM model contains the mean and intercept parameters, while in the PATH model you use the MEAN statement to specify these parameters. By using these model matrices in the PATH model, the covariance and mean structures are derived in the same way as in the RAM model. See the section The RAM Model for derivations of the model structures.
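
For orientation, the covariance and mean structures of a RAM-type model are commonly written as shown below. This is a sketch of the standard RAM form, not a quotation from this section. Here $\mathbf{A}$ is the matrix of effects, $\mathbf{P}$ is the matrix of (partial) variances and covariances, and $\mathbf{W}$ is the vector of means and intercepts. The identity matrix $\mathbf{I}$ and the selection matrix $\mathbf{J}$, which picks out the manifest variables, are notation assumed for this sketch.

```latex
\boldsymbol{\Sigma} = \mathbf{J}\,(\mathbf{I}-\mathbf{A})^{-1}\,\mathbf{P}\,
                      \bigl((\mathbf{I}-\mathbf{A})^{-1}\bigr)^{\prime}\,\mathbf{J}^{\prime},
\qquad
\boldsymbol{\mu} = \mathbf{J}\,(\mathbf{I}-\mathbf{A})^{-1}\,\mathbf{W}

% Worked case: one path v1 <- v2 with effect parm1, Var(v2) = parm2,
% error variance of v1 = parm3, variables ordered (v1, v2), J = I.
\mathbf{A} = \begin{pmatrix} 0 & \mathit{parm1} \\ 0 & 0 \end{pmatrix},
\quad
\mathbf{P} = \begin{pmatrix} \mathit{parm3} & 0 \\ 0 & \mathit{parm2} \end{pmatrix},
\quad
\boldsymbol{\Sigma} = \begin{pmatrix}
   \mathit{parm1}^{2}\,\mathit{parm2} + \mathit{parm3} & \mathit{parm1}\,\mathit{parm2} \\
   \mathit{parm1}\,\mathit{parm2} & \mathit{parm2}
\end{pmatrix}
```

In the worked case, the PATH entry fills the single nonzero element of $\mathbf{A}$ and the two PVAR entries fill the diagonal of $\mathbf{P}$, which shows why the error variance of v1 is the part of its variance that is not explained by v2.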

Because the mathematical models behind the PATH and the RAM modeling languages are essentially the same, the PATH and the RAM syntax for model specification closely resemble each other. That is, all path specifications in the PATH statement translate into the PATH list-entries of the RAM statement. All PVAR, PCOV, and MEAN specifications of the PATH model translate into the PVAR, PCOV, and MEAN (or INTERCEPT) list-entries, respectively, in the RAM statement.

Note: This procedure is experimental.
