The CATMOD Procedure

Computational Formulas

The following formulas are shown for each population and for all populations combined.

	Source	Formula	Dimension
Probability Estimates
	$\text{[math]}$ th response	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Variance of Probability Estimates
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Response Functions
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Derivative of Function with Respect to Probability Estimates
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Variance of Functions
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Inverse Variance of Functions
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$

Derivative Table for Compound Functions: Y=F(G(p))

In the following table, let $\text{[math]}$ be a vector of functions of $\text{[math]}$, and let $\text{[math]}$ denote $\text{[math]}$, which is the first derivative matrix of $\text{[math]}$ with respect to $\text{[math]}$:

Function	$\text{[math]}$	Derivative $\text{[math]}$
Multiply matrix	$\text{[math]}$	$\text{[math]}$
Logarithm	$\text{[math]}$	$\text{[math]}$
Exponential	$\text{[math]}$	$\text{[math]}$
Add constant	$\text{[math]}$	$\text{[math]}$
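
As an illustration of how the four rows of this table compose, the following Python sketch propagates a value and its first derivative matrix through a chain of operations. It assumes the standard chain-rule forms for each operation (matrix multiplication, element-wise logarithm, element-wise exponential, and adding a constant vector). The function names and the example contrast matrix are hypothetical and are not part of PROC CATMOD.

```python
import math

def matvec(a, v):
    """Multiply matrix a (a list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def matmul(a, b):
    """Multiply matrices a and b, both stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Each operation maps (G, D) to (Y, dY/dp), where D = dG/dp.
def multiply_matrix(a, g, d):
    return matvec(a, g), matmul(a, d)            # derivative: A D

def logarithm(g, d):
    y = [math.log(x) for x in g]
    return y, [[dij / x for dij in row] for x, row in zip(g, d)]   # diag(G)^-1 D

def exponential(g, d):
    y = [math.exp(x) for x in g]
    return y, [[yi * dij for dij in row] for yi, row in zip(y, d)]  # diag(Y) D

def add_constant(a, g, d):
    return [x + c for x, c in zip(g, a)], [row[:] for row in d]    # derivative: D

def log_contrasts(p):
    """Hypothetical compound function: contrasts of log probabilities."""
    a = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
    g, d = list(p), identity(len(p))   # start with G = p, D = I
    g, d = logarithm(g, d)
    return multiply_matrix(a, g, d)

p = [0.2, 0.3, 0.5]
value, jacobian = log_contrasts(p)
```

Carrying the pair (value, derivative) through each step means the final derivative matrix is available without differentiating the whole compound function by hand.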

Default Response Functions: Generalized Logits

In the following table, subscripts $\text{[math]}$ for the population are suppressed. Also denote $\text{[math]}$ for $\text{[math]}$ for each population $\text{[math]}$.

	Formula
Inverse of Response Functions for a Population
	$\text{[math]}$
Form of F and Derivative for a Population
	$\text{[math]}$
Covariance Results for a Population
	$\text{[math]}$
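
For a single population, generalized logits and their inverse can be sketched in Python as follows. The sketch assumes that the last response level is the reference category, so that r probabilities yield r-1 logits. The function names are hypothetical.

```python
import math

def generalized_logits(p):
    """Log of each probability relative to the last (reference) probability."""
    ref = p[-1]
    return [math.log(pj / ref) for pj in p[:-1]]

def inverse_generalized_logits(f):
    """Recover all r probabilities from the r-1 generalized logits."""
    expf = [math.exp(fj) for fj in f]
    denom = 1.0 + sum(expf)
    # The reference category contributes exp(0) = 1 to the denominator.
    return [e / denom for e in expf] + [1.0 / denom]

p = [0.2, 0.3, 0.5]
f = generalized_logits(p)            # two logits for three responses
p_back = inverse_generalized_logits(f)
```

Applying the inverse to the logits returns the original probabilities, which is a quick check that the two functions are consistent with each other.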

The following calculations are shown for each population and then for all populations combined:

	Source	Formula	Dimension
Design Matrix
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$
Crossproduct of Design Matrix
	$\text{[math]}$ th population	$\text{[math]}$	$\text{[math]}$
	all populations	$\text{[math]}$	$\text{[math]}$

In the following table, $\text{[math]}$ is the 100$\text{[math]}$th percentile of the standard normal distribution:

	Formula	Dimension
Crossproduct of Design Matrix with Function
	$\text{[math]}$	$\text{[math]}$
Weighted Least Squares Estimates
	$\text{[math]}$	$\text{[math]}$
Covariance of Weighted Least Squares Estimates
	$\text{[math]}$	$\text{[math]}$
Wald Confidence Limits for Parameter Estimates
	$\text{[math]}$	$\text{[math]}$
Predicted Response Functions
	$\text{[math]}$	$\text{[math]}$
Covariance of Predicted Response Functions
	$\text{[math]}$	$\text{[math]}$
Residual Chi-Square
	RSS $\text{[math]}$	$\text{[math]}$
Chi-Square for $\text{[math]}$
	Q $\text{[math]}$	$\text{[math]}$
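
The first few rows of this table can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not taken from the table: the design matrix has two columns, the response functions are independent so that their covariance matrix is diagonal, and the Wald limits use the 100(1 - alpha/2)th percentile of the standard normal distribution. All names and data values are hypothetical.

```python
import math
from statistics import NormalDist

def wls_two_param(x, f, var):
    """Weighted least squares for a two-column design and diagonal covariance."""
    w = [1.0 / v for v in var]                      # inverse variances
    # Crossproduct of the design matrix: X' S^-1 X
    a11 = sum(wi * r[0] * r[0] for wi, r in zip(w, x))
    a12 = sum(wi * r[0] * r[1] for wi, r in zip(w, x))
    a22 = sum(wi * r[1] * r[1] for wi, r in zip(w, x))
    # Crossproduct of the design matrix with the functions: X' S^-1 F
    c1 = sum(wi * r[0] * fi for wi, r, fi in zip(w, x, f))
    c2 = sum(wi * r[1] * fi for wi, r, fi in zip(w, x, f))
    det = a11 * a22 - a12 * a12
    cov = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]   # (X' S^-1 X)^-1
    b = [cov[0][0] * c1 + cov[0][1] * c2,
         cov[1][0] * c1 + cov[1][1] * c2]
    return b, cov

def wald_limits(estimate, std_err, alpha=0.05):
    """Two-sided Wald limits: estimate -/+ z * standard error."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * std_err, estimate + z * std_err

x = [[1.0, 1.0], [1.0, -1.0]]     # hypothetical design: intercept and one effect
f = [0.5, -0.3]                   # hypothetical response functions
var = [0.04, 0.09]                # hypothetical variances of the functions
b, cov = wls_two_param(x, f, var)
lower, upper = wald_limits(b[1], math.sqrt(cov[1][1]))
```

Because this example design is saturated, the estimates reproduce the functions exactly and the residual chi-square would be zero. A design with more populations than parameters is needed for a nontrivial goodness-of-fit test.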

Maximum Likelihood Method

Let $\text{[math]}$ be the Hessian matrix and $\text{[math]}$ be the gradient of the log-likelihood function (both are functions of $\text{[math]}$ and the parameters $\text{[math]}$). Let $\text{[math]}$ denote the vector containing the first $\text{[math]}$ sample proportions from population $\text{[math]}$, and let $\text{[math]}$ denote the corresponding vector of probability estimates from the current iteration. Starting with the least squares estimates $\text{[math]}$ of $\text{[math]}$ (if you use the ML and WLS options; with the ML option alone, the procedure starts with $\text{[math]}$), the probabilities $\text{[math]}$ are computed, and $\text{[math]}$ is calculated iteratively by the Newton-Raphson method until it converges (see the EPSILON= option).

The factor $\text{[math]}$ is a step-halving factor that equals one at the start of each iteration. For any iteration in which the likelihood decreases, PROC CATMOD uses a series of subiterations in which $\text{[math]}$ is iteratively divided by two. The subiterations continue until the likelihood is greater than that of the previous iteration. If the likelihood has not reached that point after 10 subiterations, then convergence is assumed, and a warning message is displayed.
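
The step-halving logic can be sketched in Python for the simplest possible case: a single logit parameter for one population with a binary response. This is an illustration of the idea rather than the procedure's implementation. The convergence test (relative change in the log likelihood below a tolerance) and every name in the sketch are assumptions.

```python
import math

def loglik(beta, y, n):
    """Binomial log likelihood (up to a constant) for logit parameter beta."""
    return y * beta - n * math.log1p(math.exp(beta))

def newton_step_halving(y, n, beta=0.0, epsilon=1e-8, max_iter=50):
    """Return (estimate, warning_flag) for y successes in n trials."""
    ll = loglik(beta, y, n)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-beta))
        gradient = y - n * p
        hessian = -n * p * (1.0 - p)      # a scalar in this one-parameter case
        lam = 1.0                         # step-halving factor starts at one
        candidate = beta - lam * gradient / hessian
        cand_ll = loglik(candidate, y, n)
        halvings = 0
        # Subiterations: halve the step while the likelihood has decreased.
        while cand_ll < ll and halvings < 10:
            lam /= 2.0
            candidate = beta - lam * gradient / hessian
            cand_ll = loglik(candidate, y, n)
            halvings += 1
        if cand_ll < ll:
            return beta, True             # assume convergence, with a warning
        done = abs(cand_ll - ll) < epsilon * abs(ll)
        beta, ll = candidate, cand_ll
        if done:
            return beta, False
    return beta, True                     # iteration limit reached

beta_hat, warned = newton_step_halving(30, 100)
far_start, _ = newton_step_halving(30, 100, beta=4.0)
```

Starting from `beta=4.0`, the full Newton step overshoots and lowers the likelihood, so the subiterations shrink the step before it is accepted. Both calls should arrive at the same estimate, the logit of the sample proportion.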

Sometimes infinite parameters are present in the model, either because one or more frequencies are zero or because the model is poorly specified, with collinearity among the estimates. If an estimate is tending toward infinity, then PROC CATMOD flags the parameter as infinite and holds the estimate fixed in subsequent iterations. PROC CATMOD regards a parameter as infinite when both of the following conditions apply:

The absolute value of its estimate exceeds five divided by the range of the corresponding variable.
The standard error of its estimate is at least three times greater than the estimate itself.
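
The two conditions can be expressed as a small Python predicate. This is only a reading of the rule as stated: comparing the standard error with the absolute value of the estimate is an assumption, and the function and argument names are hypothetical.

```python
def looks_infinite(estimate, std_err, var_range):
    """Return True when both stated conditions hold for a parameter estimate."""
    # Condition 1: |estimate| exceeds five divided by the variable's range.
    large = abs(estimate) > 5.0 / var_range
    # Condition 2: the standard error is at least three times the estimate
    # (absolute value used here as an assumption).
    imprecise = std_err >= 3.0 * abs(estimate)
    return large and imprecise

flag_a = looks_infinite(12.0, 40.0, 1.0)   # large and imprecise
flag_b = looks_infinite(12.0, 2.0, 1.0)    # large but precisely estimated
flag_c = looks_infinite(2.0, 40.0, 1.0)    # imprecise but not large
```

Only the first call satisfies both conditions, so only that estimate would be flagged under this reading.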

The estimator of the asymptotic covariance matrix of the maximum likelihood predicted probabilities is given by Imrey, Koch, and Stokes (1981, eq. 2.18).

The following equations summarize the method:

$\text{[math]}$

where

$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
$\text{[math]}$	$\text{[math]}$	$\text{[math]}$

Iterative Proportional Fitting

The algorithm used by PROC CATMOD for iterative proportional fitting is described in Bishop, Fienberg, and Holland (1975), Haberman (1972), and Agresti (2002). To illustrate the method, consider the observed three-dimensional table $\text{[math]}$ for the variables X, Y, and Z, and the following hierarchical model:

$\text{[math]}$

The following statements request that PROC CATMOD use IPF to fit the preceding model:

model X*Y*Z = _response_ / ml=ipf;
loglin X|Y|Z@2;

Begin with a table of initial cell estimates $\text{[math]}$. PROC CATMOD produces the initial estimates by setting the $\text{[math]}$ structural zero cells to 0 and all other cells to $\text{[math]}$, where $\text{[math]}$ is the total weight of the table and $\text{[math]}$ is the total number of cells in the table. Iteratively adjust the estimates at step $\text{[math]}$ to the observed marginal tables specified in the model by cycling through the following three-stage process to produce the estimates at step $\text{[math]}$:

$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
$\text{[math]}$	$\text{[math]}$	$\text{[math]}$

The subscript "$\text{[math]}$" indicates summation over the missing subscript. The log-likelihood $\text{[math]}$ is estimated at each step $\text{[math]}$ by

$\text{[math]}$
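
The three-stage cycle can be sketched in Python for a model that contains all two-way interactions of X, Y, and Z. The sketch rescales the current estimates to match the observed XY, XZ, and YZ margins in turn. It assumes that the starting value for each cell that is not a structural zero is the table total divided by the total number of cells. It runs a fixed number of cycles instead of testing for convergence. The function name, the `zeros` argument, and the example counts are hypothetical.

```python
def ipf_fit(n, cycles=50, zeros=frozenset()):
    """Fit the no-three-factor-interaction model to a 3-way table n[i][j][k]."""
    nx, ny, nz = len(n), len(n[0]), len(n[0][0])
    total = sum(n[i][j][k] for i in range(nx) for j in range(ny) for k in range(nz))
    start = total / (nx * ny * nz)        # assumed initial estimate per cell
    m = [[[0.0 if (i, j, k) in zeros else start for k in range(nz)]
          for j in range(ny)] for i in range(nx)]

    def ratio(obs, fit):
        return obs / fit if fit > 0 else 0.0

    for _ in range(cycles):
        # Stage 1: match the XY margin (sum over Z).
        for i in range(nx):
            for j in range(ny):
                r = ratio(sum(n[i][j]), sum(m[i][j]))
                for k in range(nz):
                    m[i][j][k] *= r
        # Stage 2: match the XZ margin (sum over Y).
        for i in range(nx):
            for k in range(nz):
                r = ratio(sum(n[i][j][k] for j in range(ny)),
                          sum(m[i][j][k] for j in range(ny)))
                for j in range(ny):
                    m[i][j][k] *= r
        # Stage 3: match the YZ margin (sum over X).
        for j in range(ny):
            for k in range(nz):
                r = ratio(sum(n[i][j][k] for i in range(nx)),
                          sum(m[i][j][k] for i in range(nx)))
                for i in range(nx):
                    m[i][j][k] *= r
    return m

observed = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
fitted = ipf_fit(observed)
```

Because every stage only rescales cells within one two-way margin, the fitted table keeps the same XY odds ratio at each level of Z, which is the defining property of a model without the three-factor interaction.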

When the function $\text{[math]}$ is less than $\text{[math]}$, the iterations terminate. You can change the comparison value with the EPSILON= option, and you can change the convergence criterion with the CONVCRIT= option. The option CONVCRIT=CELL uses the maximum cell difference

$\text{[math]}$

as the criterion, while the option CONVCRIT=MARGIN computes the maximum difference of the margins

$\text{[math]}$
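
The two alternative criteria can be sketched in Python as follows. The sketch assumes that the cell criterion compares the estimates from two successive steps and that the margin criterion compares the fitted XY, XZ, and YZ margins with the observed ones. The function names and the small example tables are hypothetical.

```python
def max_cell_difference(m_prev, m_new):
    """Largest absolute change in any cell between two successive steps."""
    return max(abs(a - b)
               for plane_p, plane_n in zip(m_prev, m_new)
               for row_p, row_n in zip(plane_p, plane_n)
               for a, b in zip(row_p, row_n))

def max_margin_difference(m, n):
    """Largest absolute gap between fitted and observed two-way margins."""
    nx, ny, nz = len(n), len(n[0]), len(n[0][0])
    gaps = []
    for i in range(nx):                       # XY margins (sum over Z)
        for j in range(ny):
            gaps.append(abs(sum(m[i][j]) - sum(n[i][j])))
    for i in range(nx):                       # XZ margins (sum over Y)
        for k in range(nz):
            gaps.append(abs(sum(m[i][j][k] for j in range(ny))
                            - sum(n[i][j][k] for j in range(ny))))
    for j in range(ny):                       # YZ margins (sum over X)
        for k in range(nz):
            gaps.append(abs(sum(m[i][j][k] for i in range(nx))
                            - sum(n[i][j][k] for i in range(nx))))
    return max(gaps)

table_a = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
table_b = [[[1.25, 2.5], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
cell_gap = max_cell_difference(table_a, table_b)
margin_gap = max_margin_difference(table_b, table_a)
```

In this example the two changed cells fall in the same XY margin, so the margin gap is larger than the largest single-cell gap. Either value would then be compared with the tolerance that controls termination.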