PATH Statement
single-headed path for defining functional relationship
double-headed path for specifying variances or covariances
1-path for specifying means or intercepts
For example, the following PATH statement contains only the single-headed paths:
PATH V1 <--- V2, V2 <--- V4 V5, /* same as: V2 <--- V4 and V2 <--- V5 */ V3 ---> V5, /* same as: V5 <--- V3 */ V4 V5 <--- V6 V7; /* same as: V4 <--- V6, V4 <--- V7, V5 <--- V6, and V5 <--- V7 */
Although the most common definition of paths refers to these single-headed paths, PROC CALIS extends the definition of paths to include the so-called "variance-paths," "covariance-paths," and "1-paths," which refer to the variance, covariance, and mean or intercept parameters, respectively. Corresponding to these extended path definitions, PROC CALIS provides the double-headed path and 1-path syntax. For example, the following PATH statement contains single-headed paths for specifying functional relationships and double-headed paths for specifying variances and covariances:
PATH V1 <--- V3-V5, /* same as: V1 <--- V3, V1 <--- V4, and V1 <--- V5 */ V2 <--- V4 V5, V3 <--- V5, V1 <--> V1, /* error variance of V1 */ <--> V2 V3, /* error variances of V2 and V3 */ V2 <--> V3, /* error covariance between V2 and V3 */ <--> [V4 V5]; /* variances and covariance for V4 and V5 */
The following PATH statement contains single-headed paths for specifying functional relationships and 1-paths for specifying means and intercepts:
PATH V1 <--- V3-V5, V2 <--- V4 V5, V3 <--- V5, 1 ---> V1, /* intercept of V1 */ 1 ---> V2-V3, /* intercepts of V2 and V3 */ 1 ---> V4 V5; /* means of V4 and V5 */
Details about the syntax of these three types of paths are described later. Instead of using double-headed paths and 1-paths, you can also specify these parameters by the subsidiary model specification statements, such as the PVAR, PCOV, and MEAN statements, as shown in the following syntactic structure of the PATH modeling language:
PATH paths ;
PVAR partial-variance-parameters ;
PCOV covariance-parameters ;
MEAN mean-parameters ;
Typically, in this syntactic structure the paths contain only single-headed paths for representing the functional relationships among variables, which could be observed or latent. The paths are separated by commas. You can specify at most one PATH statement in a model within the scope of either the PROC CALIS statement or a MODEL statement.
Next, the PVAR statement specifies the parameters for the variances or error (partial) variances. The PCOV statement specifies the parameters for the covariances or error (partial) covariances. The MEAN statement specifies the parameters for the means or intercepts. For details about these subsidiary model specification statements, see the syntax of the individual statements.
A natural question now arises. For the specification of variances, covariances, intercepts, and means, should you use the extended path syntax that includes double-headed paths and 1-paths or the subsidiary model specification statements such as the PVAR, PCOV, and MEAN statements? If you want to specify all parameters in a single statement and hence output and view all the parameter estimates in a single output table, then the extended path syntax would be your choice. If you want to use more common language for specifying and viewing the parameters or the estimates of variances, covariances, means, and intercepts, then the subsidiary model specification statements serve the purpose better.
You are not restricted to using extended path syntax or the subsidiary model statements exclusively in a PATH model specification. For example, you might specify the variance of V1 by using the double-headed path syntax and the variance of V2 by using the PVAR statement. The only restriction is that you cannot specify the same parameter twice. In addition, even if you specify your PATH model without using double-headed paths or 1-paths, you can include the estimation results associated with these extended paths in the same output table for the single-headed paths by using the EXTENDPATH or GENPATH option. This way all the estimates of the PATH model can be shown in a single output table.
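As a minimal sketch of mixing the two styles, the following step specifies the error variance of V1 by a double-headed path and the error variance of V2 by the PVAR statement. The data set name mydata, the two regression paths, and the parameter names evar1 and evar2 are hypothetical and serve only as illustration.

```sas
proc calis data=mydata;          /* mydata is a hypothetical data set */
   path
      V1 <--- V3,
      V2 <--- V3,
      V1 <--> V1 = evar1;        /* error variance of V1 as a double-headed path */
   pvar
      V2 = evar2;                /* error variance of V2 in the PVAR statement */
run;
```

Listing V1 in both the double-headed path and the PVAR statement would specify the same parameter twice, which is not allowed.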
var_list arrow var_list2 < = parameter-spec >
where var_list and var_list2 are lists of variables, parameter-spec is an optional specification of parameters, and arrow is either a left-arrow in one of the following forms:
<---, <--, <-, or <
or a right-arrow in one of the following forms:
--->, -->, ->, or >
In each single-headed path, you specify two lists of variables: var_list and var_list2. Depending on the direction of the arrow specification, one group of variables contains the outcome variables and the other group contains the predictor variables. Optionally, you can specify the parameter-spec at the end of each path entry. You can specify the following five types of the parameters for the path entries:
unnamed free parameters
initial values
fixed values
free parameters with names provided
free parameters with names and initial values provided
For example, in the following statement you specify a model with five paths:
PATH V1 <--- F1 , V2 <--- F1 = (0.5), V3 <--- F1 = 1., V4 <--- F1 = b1, V5 <--- F1 = b2 (.4);
The first path entry specifies a path from F1 to V1. The effect of F1 (or the path coefficient) on V1 is an unnamed free parameter. For this path effect parameter, PROC CALIS generates a parameter name that consists of the _Parm prefix followed by a unique integer (for example, _Parm1). The second path entry specifies a path from F1 to V2. The effect of F1 is also an unnamed free parameter, but with an initial estimate of 0.5. PROC CALIS also generates a parameter name for this effect parameter. The third path entry specifies a path from F1 to V3. The effect of F1 is a fixed value of 1.0. This value stays the same during model estimation. The fourth path entry specifies a path from F1 to V4. The effect of F1 is a free parameter named b1. The fifth path entry specifies a path from F1 to V5. The effect of F1 is a free parameter named b2, with an initial value of 0.4.
You can specify multiple variables in the var_list and var_list2 lists. For example, the following statement specifies five paths from F1 to V1–V5:
PATH F1 ---> V1-V5;
All the five effects of F1 on the five variables are unnamed free parameters. If both var_list and var_list2 lists contain multiple variables, you must be careful about the order of the variables when you also specify parameters at the end of the path entry. For example, the following statement specifies the paths from the predictor variables x1–x2 to the outcome variables y1–y3:
PATH y1-y3 <--- x1-x2 = a1-a6;
The PATH statement specifies six paths in the path entry. These six paths have effect parameters a1–a6. This specification is equivalent to the following specification:
PATH y1 <--- x1 = a1, y1 <--- x2 = a2, y2 <--- x1 = a3, y2 <--- x2 = a4, y3 <--- x1 = a5, y3 <--- x2 = a6;
The following statement shows another example of multiple-path specification:
PATH x1-x2 ---> y1-y3 = b1-b6;
This specification is equivalent to the following specification with separate path specifications:
PATH x1 ---> y1 = b1, x1 ---> y2 = b2, x1 ---> y3 = b3, x2 ---> y1 = b4, x2 ---> y2 = b5, x2 ---> y3 = b6;
You can also specify parameters of mixed types in any path entry, as shown in the following specification:
PATH F1 ---> y1-y3 = 1. b1(.5) (.3), F2 ---> y4-y6 = 1. b2 b3(.7);
This specification is equivalent to the following expanded version:
PATH F1 ---> y1 = 1., F1 ---> y2 = b1(.5), F1 ---> y3 = (.3), F2 ---> y4 = 1., F2 ---> y5 = b2, F2 ---> y6 = b3(.7);
Notice that in the original specification with multiple-path entries, 0.5 is interpreted as the initial value for the parameter b1, but not as the initial estimate for the path from F1 to y3. In general, an initial value that follows a parameter name is associated with the free parameter.
If you instead want b1 to be a free parameter without an initial estimate and 0.5 to be the initial estimate for the path from F1 to y3 (while keeping all other specifications the same), you can use a null initial value specification, as shown in the following statement:
PATH F1 ---> y1-y3 = 1. b1() (.5) , F2 ---> y4-y6 = 1. b2 b3(.7);
This way 0.5 becomes the initial value for the path from F1 to y3. Because a parameter list with mixed types might be confusing, you can break down the specifications into separate path entries to remove ambiguities. For example, you can use the following specification equivalently:
PATH F1 ---> y1 = 1., F1 ---> y2 = b1, F1 ---> y3 = (.5) , F2 ---> y4-y6 = 1. b2 b3(.7);
The equal signs in the path entries are optional when the parameter lists do not start with a parameter name. For example, the preceding specification is the same as the following specification:
PATH F1 ---> y1 1., F1 ---> y2 = b1, F1 ---> y3 (.5) , F2 ---> y4-y6 1. b2 b3(.7);
Notice that in the second path entry, you must retain the equal sign because b1 is a parameter name. Omitting the equal sign makes the specification erroneous because b1 is then treated as a variable. This might cause serious estimation problems. Omitting the equal signs might be cosmetically appealing when you specify fixed values or initial values (for example, in the first and the third path entries). However, the gain is small compared with the clarity of specification that results from using the equal signs consistently.
Note: You do not need to specify single-headed paths from the errors or disturbances (that is, error terms) in the PATH model specification, even though the functional relationships between variables are not assumed to be perfect. Essentially, the roles of error terms in the PATH model are in effect represented by the associated default error variances of the endogenous variables, making it unnecessary to specify any single-headed paths from error or disturbance variables.
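The following sketch illustrates this note with a hypothetical regression of y on x1 and x2. The two PATH statements are alternatives, not parts of one model, because a model can contain only one PATH statement. The variable names and the parameter name evar_y are assumptions made for illustration.

```sas
/* Alternative 1: no error term is written;                     */
/* the error variance of y is a free parameter by default       */
path y <--- x1 x2;

/* Alternative 2: same model, with the error variance of y      */
/* named explicitly by a double-headed path                     */
path y <--- x1 x2,
     <--> y = evar_y;
```

In neither form is there a path from an error variable to y; the error variance of y takes over that role.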
var_list two-headed-arrow var_list2 < = parameter-spec >
where a two-headed-arrow is one of the following forms:
<-->, <->, or <>
This syntax enables you to specify covariances between the variables in var_list and the variables in var_list2. Consider the following example:
PATH v1 <--> v2, v3 v4 <--> v5 v6 v7 = cv1-cv6;
The first double-headed path specifies the covariance between v1 and v2 as an unnamed free parameter. PROC CALIS generates a name for this unnamed free parameter. The second double-headed path specifies six covariances with parameters named cv1–cv6. This multiple-covariance specification is equivalent to the following elementwise covariance specification:
PATH v3 <--> v5 = cv1, v3 <--> v6 = cv2, v3 <--> v7 = cv3, v4 <--> v5 = cv4, v4 <--> v6 = cv5, v4 <--> v7 = cv6;
Note that the order of variables in the list is important for determining the assignment of the parameters in the parameter-spec list.
If the same variable appears in both the var_list and var_list2 lists, the "covariance" specification becomes a variance specification for that variable. For example, the following statement specifies two variances and one covariance:
PATH v1 <--> v1 = 1.0, v2 <--> v2 v3 = sigma2 cv23;
The first double-headed path entry specifies the variance of v1 as a fixed value of 1.0. The second double-headed path entry specifies the variance of v2 as a free parameter named sigma2, and then the covariance between v2 and v3 as a free parameter named cv23.
It results in an error if you attempt to use this syntax to specify the variance and covariances among a set of variables. For example, suppose you intend to specify the variances and covariances among v1–v3 as unnamed free parameters by the following statement:
PATH v1-v3 <--> v1-v3 ;
This specification expands to the following elementwise specification:
PATH v1 <--> v1 , v1 <--> v2 , v1 <--> v3 , v2 <--> v1 , v2 <--> v2 , v2 <--> v3 , v3 <--> v1 , v3 <--> v2 , v3 <--> v3 ;
There are nine variance or covariance specifications, but all of the covariances are specified twice. This is treated as a duplication error. The correct way is to specify only the nonredundant covariances, as shown in the following elementwise specification:
PATH v1 <--> v1 , v2 <--> v1 , v2 <--> v2 , v3 <--> v1 , v3 <--> v2 , v3 <--> v3 ;
However, the elementwise specification is quite tedious when the number of variables is large. Fortunately, there is another syntax for double-headed paths to deal with this situation. This syntax is discussed next.
two-headed-arrow var_list < = parameter-spec>
This syntax enables you to specify the variances of the variables in var_list. Consider the following example:
PATH <--> v1 = (0.8), <--> v2-v4 ;
The first double-headed path entry specifies the variance of v1 as an unnamed free parameter with an initial estimate of 0.8. The second double-headed path entry specifies the variances of v2–v4 as unnamed free parameters. No initial values are given for these three variances. PROC CALIS generates names for all these variance parameters. You can specify these variances equivalently by the elementwise covariance specification syntax, as shown in the following statement, but the former syntax is much more efficient.
PATH v1 <--> v1 = (0.8), v2 <--> v2 , v3 <--> v3 , v4 <--> v4 ;
two-headed-arrow [var_list] < = parameter-spec>
This syntax enables you to specify all the variances and covariances among the variables in var_list. For example, the following statement specifies all the variances and covariances among v2–v4:
PATH <--> [v2-v4] = 1.0 cv32 cv33(0.5) cv42 .7 cv44;
This specification is more efficient as compared with the following equivalent specification with elementwise variance or covariance definitions:
PATH v2 <--> v2 = 1.0, v3 <--> v2 = cv32 , v3 <--> v3 = cv33(0.5), v4 <--> v2 = cv42, v4 <--> v3 = .7, v4 <--> v4 = cv44;
two-headed-arrow (var_list) < = parameter-spec>
This syntax enables you to specify all the nonredundant covariances among the variables in var_list. For example, the following statement specifies all six nonredundant covariances among v2–v5:
PATH <--> (v2-v5) = cv1-cv6;
This specification is equivalent to the following elementwise specification:
PATH v3 <--> v2 = cv1 , v4 <--> v2 = cv2 , v4 <--> v3 = cv3 , v5 <--> v2 = cv4 , v5 <--> v3 = cv5 , v5 <--> v4 = cv6 ;
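Taken together, the three list forms suggest that a bracketed list covers what an unbracketed list and a parenthesized list cover separately. The following sketch, which uses unnamed free parameters so that no parameter ordering is involved, shows two alternative PATH statements that should define the same set of variances and covariances among v2–v5. This equivalence is inferred from the syntax descriptions above rather than stated by them.

```sas
/* Alternative 1: all variances and covariances in one entry */
path <--> [v2-v5];

/* Alternative 2: variances and nonredundant covariances in separate entries */
path <--> v2-v5,
     <--> (v2-v5);
```

Use only one of the two forms in a given model, because specifying the same parameter twice is an error.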
1 right-arrow var_list < = parameter-spec>
where a right-arrow is one of the following forms:
--->, -->, ->, or >
This syntax enables you to specify the means or intercepts of the variables in var_list as paths from the constant 1. Consider the following example:
PATH v1 <--- v2-v4, 1 ---> v1 = alpha, 1 ---> v2-v4 = 3*kappa;
The first single-headed path specifies that v1 is predicted by variables v2, v3, and v4. Next, the first 1-path entry specifies the intercept of v1 as a free parameter named alpha. It is the intercept, rather than the mean, of v1 because v1 is endogenous in the PATH model. The second 1-path entry specifies the means of v2–v4 as constrained parameters. All three means are named kappa so that they have the same estimate.
Therefore, whether the parameter is a mean or an intercept specified with the 1-path syntax depends on whether the associated variable is endogenous or exogenous in the model. The intercept is specified if the variable is endogenous. Otherwise, the mean of the variable is specified. Fortunately, any variable in the model can have either a mean or intercept (but not both) to specify. Therefore, the 1-path syntax is applicable to either the mean or intercept specification without causing conflicts.
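For comparison, the 1-paths in the preceding example can be moved to the MEAN statement. The following is a sketch of that alternative; it spells out one entry per variable instead of relying on a repetition factor.

```sas
path
   v1 <--- v2-v4;
mean
   v1 = alpha,     /* intercept, because v1 is endogenous */
   v2 = kappa,     /* means, because v2-v4 are exogenous  */
   v3 = kappa,
   v4 = kappa;
```

As with the 1-path syntax, the same MEAN entry form serves both means and intercepts; the role of each variable in the model determines which one is meant.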
If you provide fewer parameters in parameter-spec than the number of paths in a path entry, all the remaining parameters are treated as unnamed free parameters. For example, the following specification assigns the free parameter beta to the first path and assigns unnamed free parameters to the remaining four paths:
PATH F1 ---> y1 z1 z2 z3 z4 = beta;
This specification is equivalent to the following specification:
PATH F1 ---> y1 = beta, F1 ---> z1 z2 z3 z4;
If you intend to fill up all values with the last parameter specification in the list, you can use the continuation syntax [...], [..], or [.], as shown in the following example:
PATH F1 ---> y1 z1 z2 z3 z4 = beta gamma [...];
This specification is equivalent to the following specification:
PATH F1 ---> y1 z1 z2 z3 z4 = beta 4*gamma;
The repetition factor 4* means that gamma repeats 4 times.
However, you must be careful not to provide too many parameters. For example, the following specification results in an error:
PATH SES_Factor ---> y1 z1 z2 z3 z4 = beta gamma1-gamma6;
Because there are only five paths in the specification, the seven parameters supplied are two too many: parameters gamma5 and gamma6 are in excess.
It is important to understand the default parameters in the PATH model. First, knowing which parameters are default free parameters makes your specification more efficient by omitting the specifications of those parameters that can be set by default. For example, because all variances and covariances among exogenous variables (excluding error terms) are free parameters by default, you do not need to specify them in the PATH model if these variances and covariances are not constrained. Second, knowing which parameters are default fixed zero parameters enables you to specify your model accurately. For example, because all error covariances in the PATH model are fixed zeros by default, you must use the PCOV statement or the double-headed path syntax to specify the partial (error) covariances among the endogenous variables if you want to fit a model with correlated errors. See the section Default Parameters in the PATH Model for details about the default parameters of the PATH model.
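As a sketch of the second point, the following two alternatives free the error covariance between two endogenous variables, which would otherwise be fixed at zero. The variables y1, y2, and x1 and the parameter name cve12 are hypothetical.

```sas
/* Alternative 1: double-headed path inside the PATH statement */
path
   y1 <--- x1,
   y2 <--- x1,
   y1 <--> y2 = cve12;    /* error covariance between y1 and y2 */

/* Alternative 2: same model, with the PCOV statement */
path
   y1 <--- x1,
   y2 <--- x1;
pcov
   y1 y2 = cve12;
```

The variance of x1 is not specified in either alternative because it is a free parameter by default.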
If you define a new model by using a reference (old) model in the REFMODEL statement, you might want to modify some path specifications from the PATH statement of the reference model before transferring the specifications to the new model. To change a particular path specification from the reference model, you can simply respecify the same path with the desired parameter specification in the PATH statement of the new model. To delete a particular path and its associated parameter from the reference model, you can specify the desired path with a missing value specification in the PATH statement of the new model.
The new model is formed by integrating with the old model in the following ways:
If you do not specify in the new model a parameter location that exists in the old model, the old parameter specification is duplicated in the new model.
If you specify in the new model a parameter location that does not exist in the old model, the new parameter specification is used in the new model.
If you specify in the new model a parameter location that also exists in the old model and the new parameter is denoted by the missing value '.', the old parameter specification is not copied into the new model.
If you specify in the new model a parameter location that also exists in the old model and the new parameter is not denoted by the missing value '.', the new parameter specification replaces the old one in the new model.
For example, consider the following specification of a two-group analysis:
proc calis;
   group 1 / data=d1;
   group 2 / data=d2;
   model 1 / group=1;
      path
         V1 <--- F1 = 1.,
         V2 <--- F1 = load1,
         V3 <--- F1 = load2,
         F1 <--- V4 = b1,
         F1 <--- V5 = b2,
         F1 <--- V6 = b3;
      pvar
         V1-V3 = ve1-ve3,
         F1 = vd1,
         V4-V6 = phi4-phi6;
      pcov
         V1 V2 = cve12;
   model 2 / group=2;
      refmodel 1;
      path
         V3 <--- F1 = load1;
      pcov
         V1 V2 = .,
         V2 V3 = cve23;
run;
You specify Model 2 by referring to Model 1 in the REFMODEL statement. Model 2 is the new model that refers to the old model, Model 1. This example illustrates the four types of model integration rules for the new model:
Duplication: All parameter specifications, except for the partial covariance between V1 and V2 and the V3 <--- F1 path in the old model, are duplicated in the new model.
Addition: The parameter cve23 for the partial covariance between V2 and V3 is added in the new model because there is no corresponding specification in the old model.
Deletion: The specification of partial covariance between V1 and V2 in the old model is not copied into the new model, as indicated by the missing value '.' specified in the new model.
Replacement: The new path V3 <--- F1 replaces the same path in the old model, with parameter load1 for the path coefficient. Thus, in the new model, paths V3 <--- F1 and V2 <--- F1 are now constrained to have the same path coefficient parameter load1.
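Applying the four rules by hand gives a sketch of the PATH and PCOV specifications that Model 2 effectively ends up with. This listing is an illustration of the rules, not output produced by PROC CALIS; the PVAR specifications of Model 1 carry over unchanged and are omitted here.

```sas
/* Effective PATH and PCOV specifications of Model 2 after integration */
path
   V1 <--- F1 = 1.,       /* duplicated from Model 1 */
   V2 <--- F1 = load1,    /* duplicated from Model 1 */
   V3 <--- F1 = load1,    /* replaced: was load2 in Model 1 */
   F1 <--- V4 = b1,       /* duplicated from Model 1 */
   F1 <--- V5 = b2,       /* duplicated from Model 1 */
   F1 <--- V6 = b3;       /* duplicated from Model 1 */
pcov
   V2 V3 = cve23;         /* added; V1 V2 = cve12 was deleted */
```

With the V1 V2 entry deleted, the error covariance between V1 and V2 falls back to its default in Model 2, which for error covariances is a fixed zero.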