The QLIM Procedure

Selection Models

In sample selection models, one or several dependent variables are observed only when another variable takes certain values. For example, the standard Heckman selection model can be defined as

$$
\begin{aligned}
z_i^* &= w_i'\gamma + u_i \\
z_i &= \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{if } z_i^* \le 0 \end{cases} \\
y_i &= x_i'\beta + \epsilon_i \quad \text{if } z_i = 1
\end{aligned}
$$

where $u_i$ and $\epsilon_i$ are jointly normal with zero mean, standard deviations of 1 and $\sigma$, and correlation of $\rho$. $z$ is the variable that the selection is based on, and $y$ is observed when $z$ has a value of 1. Least squares regression using the observed data of $y$ produces inconsistent estimates of $\beta$. The maximum likelihood method is used to estimate selection models. It is also possible to estimate these models by using Heckman’s method, which is more computationally efficient. However, it can be shown that the resulting estimates, although consistent, are not asymptotically efficient under the normality assumption. Moreover, this method often violates the constraint on the correlation coefficient, $|\rho| \le 1$.
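The inconsistency of least squares on the selected sample can be illustrated with a standard textbook result: under joint normality, the mean of the outcome error among selected observations equals the error correlation times the outcome standard deviation times the inverse Mills ratio of the selection index. The following Python sketch computes that term. It is an illustration only; the function names `inverse_mills` and `selection_bias` and the argument `index` (the linear predictor of the selection equation) are hypothetical names chosen for this example.

```python
import math


def norm_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def inverse_mills(a):
    """Inverse Mills ratio: density divided by cumulative probability at a."""
    return norm_pdf(a) / norm_cdf(a)


def selection_bias(index, rho, sigma):
    """Mean of the outcome error among selected observations.

    index : linear predictor of the selection equation (assumed name)
    rho   : correlation between the two error terms
    sigma : standard deviation of the outcome error
    """
    return rho * sigma * inverse_mills(index)
```

When the correlation is zero the term vanishes and least squares on the observed data is not affected by selection; otherwise the omitted term varies with the selection regressors, which is the source of the inconsistency.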

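The log-likelihood function of the Heckman selection model is written as

$$
\begin{aligned}
\ell = {} & \sum_{i \in \{z_i = 0\}} \ln\left[1 - \Phi(w_i'\gamma)\right] \\
& + \sum_{i \in \{z_i = 1\}} \left\{ \ln\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln\sigma + \ln\Phi\left(\frac{w_i'\gamma + \rho\,\frac{y_i - x_i'\beta}{\sigma}}{\sqrt{1-\rho^2}}\right) \right\}
\end{aligned}
$$

where $\phi$ and $\Phi$ denote the standard normal density and cumulative distribution functions.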
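As a rough numerical companion to this log-likelihood, the following Python sketch evaluates it for given parameter values, assuming the standard form of the Heckman log-likelihood (a probit term for unselected observations, and a density term plus a conditional selection probability for selected ones). The function name `heckman_loglik` and all argument names are hypothetical; the sketch uses plain lists and does no optimization, so it is not a description of how PROC QLIM computes or maximizes the likelihood.

```python
import math


def norm_logpdf(t):
    """Log of the standard normal density."""
    return -0.5 * t * t - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def heckman_loglik(z, y, W, X, gamma, beta, sigma, rho):
    """Log-likelihood of the Heckman selection model (illustrative sketch).

    z : selection indicators (0 or 1)
    y : outcomes; entries with z == 0 are ignored
    W, X : rows of regressors for the selection and outcome equations
    """
    total = 0.0
    for zi, yi, wi, xi in zip(z, y, W, X):
        sel = dot(wi, gamma)
        if zi == 0:
            # Probability of not being selected: 1 - Phi(sel) = Phi(-sel).
            total += math.log(norm_cdf(-sel))
        else:
            resid = (yi - dot(xi, beta)) / sigma
            arg = (sel + rho * resid) / math.sqrt(1.0 - rho * rho)
            # Outcome density times selection probability given the outcome.
            total += norm_logpdf(resid) - math.log(sigma) + math.log(norm_cdf(arg))
    return total
```

With `rho` set to 0 the function reduces to the sum of a probit log-likelihood and a normal regression log-likelihood on the selected observations, which is a convenient sanity check.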

Selection can be based on only one variable, but that variable can determine which of several dependent variables is observed. For example, in the following switching regression model,

$$
\begin{aligned}
z_i^* &= w_i'\gamma + u_i \\
z_i &= \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{if } z_i^* \le 0 \end{cases} \\
y_{1i} &= x_{1i}'\beta_1 + \epsilon_{1i} \quad \text{if } z_i = 0 \\
y_{2i} &= x_{2i}'\beta_2 + \epsilon_{2i} \quad \text{if } z_i = 1
\end{aligned}
$$

$z$ is the variable that the selection is based on. If $z = 0$, then $y_1$ is observed. If $z = 1$, then $y_2$ is observed. Because $y_1$ and $y_2$ are never observed at the same time, the correlation between $y_1$ and $y_2$ cannot be estimated. Only the correlation between $z$ and $y_1$ and the correlation between $z$ and $y_2$ can be estimated. This estimation uses the maximum likelihood method.
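The text does not spell out the likelihood for the switching regression model. The following Python sketch shows one standard way to write it, under the assumption that each regime's error is jointly normal with the selection error: every observation contributes the density of whichever outcome is observed, times the probability of being in that regime given that outcome. The function name `switching_loglik` and its argument names are hypothetical, and `rho1` and `rho2` stand for the correlations of the selection error with the errors of the two regimes.

```python
import math


def norm_logpdf(t):
    """Log of the standard normal density."""
    return -0.5 * t * t - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def switching_loglik(z, y, W, X1, X2, gamma,
                     beta1, beta2, sigma1, sigma2, rho1, rho2):
    """Log-likelihood of a two-regime switching regression (sketch).

    y holds the observed outcome: the regime-1 outcome when z == 0
    and the regime-2 outcome when z == 1.
    """
    total = 0.0
    for zi, yi, wi, x1, x2 in zip(z, y, W, X1, X2):
        sel = dot(wi, gamma)
        if zi == 0:
            resid = (yi - dot(x1, beta1)) / sigma1
            # Probability that the latent selection variable is <= 0,
            # conditional on the regime-1 residual.
            arg = -(sel + rho1 * resid) / math.sqrt(1.0 - rho1 * rho1)
            total += norm_logpdf(resid) - math.log(sigma1) + math.log(norm_cdf(arg))
        else:
            resid = (yi - dot(x2, beta2)) / sigma2
            # Probability that the latent selection variable is > 0,
            # conditional on the regime-2 residual.
            arg = (sel + rho2 * resid) / math.sqrt(1.0 - rho2 * rho2)
            total += norm_logpdf(resid) - math.log(sigma2) + math.log(norm_cdf(arg))
    return total
```

No term in this function involves both regime errors at once, which reflects the point that the correlation between the two outcomes is not identified.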

A brief example of the code for this model can be found in Sample Selection Model.

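The Heckman selection model can include censoring or truncation. For a brief example of the code for these models, see Sample Selection Model with Truncation and Censoring. The following example shows a variable $y_i$ that is censored from below at zero.

$$
\begin{aligned}
z_i^* &= w_i'\gamma + u_i \\
z_i &= \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{if } z_i^* \le 0 \end{cases} \\
y_i^* &= x_i'\beta + \epsilon_i \quad \text{if } z_i = 1 \\
y_i &= \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}
\end{aligned}
$$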

In this case, the log-likelihood function of the Heckman selection model needs to be modified to include the censored region.

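$$
\begin{aligned}
\ell = {} & \sum_{i \in \{z_i = 0\}} \ln\left[1 - \Phi(w_i'\gamma)\right] \\
& + \sum_{i \in \{z_i = 1,\ y_i = y_i^*\}} \left\{ \ln\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln\sigma + \ln\Phi\left(\frac{w_i'\gamma + \rho\,\frac{y_i - x_i'\beta}{\sigma}}{\sqrt{1-\rho^2}}\right) \right\} \\
& + \sum_{i \in \{z_i = 1,\ y_i = 0\}} \ln \int_{-\infty}^{-x_i'\beta/\sigma} \int_{-w_i'\gamma}^{\infty} \phi_2(u, v, \rho)\, du\, dv
\end{aligned}
$$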
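The contribution of a selected but censored observation is the log of a bivariate normal probability: the selection variable is above its threshold while the latent outcome is at or below zero. As a hedged illustration, the Python sketch below evaluates that probability by first integrating out the selection error analytically, which leaves a one-dimensional integral, and then applying Simpson's rule. Both the reduction and the quadrature rule are choices made for this example, not a description of PROC QLIM's internals. The name `censored_log_prob` and its arguments (`sel` for the selection index, `xb` for the outcome index) are hypothetical.

```python
import math


def norm_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def censored_log_prob(sel, xb, sigma, rho, steps=400):
    """Log probability of being selected with a latent outcome <= 0.

    Integrates pdf(v) * P(selected | v) over the standardized outcome
    error v from (effectively) minus infinity up to -xb / sigma.
    steps must be even for Simpson's rule.
    """
    upper = -xb / sigma
    lower = min(upper, 0.0) - 8.0  # mass below this point is negligible
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(v):
        return norm_pdf(v) * norm_cdf((sel + rho * v) / scale)

    h = (upper - lower) / steps
    total = integrand(lower) + integrand(upper)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(lower + k * h)
    return math.log(total * h / 3.0)
```

For zero correlation the probability factors into the product of two univariate normal probabilities, and for both indexes equal to zero it has a closed form in terms of the arcsine of the correlation; both cases are useful for checking the quadrature.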

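If $y_i$ is truncated from below at zero instead of censored, the log-likelihood function can be written as

$$
\begin{aligned}
\ell = {} & \sum_{i \in \{z_i = 0\}} \ln\left[1 - \Phi(w_i'\gamma)\right] \\
& + \sum_{i \in \{z_i = 1\}} \left\{ \ln\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln\sigma + \ln\Phi\left(\frac{w_i'\gamma + \rho\,\frac{y_i - x_i'\beta}{\sigma}}{\sqrt{1-\rho^2}}\right) - \ln\Phi\left(\frac{x_i'\beta}{\sigma}\right) \right\}
\end{aligned}
$$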
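Under truncation from below at zero, each selected observation's contribution is adjusted by the log probability that the outcome exceeds the truncation point. The Python sketch below assumes that adjustment takes the form of subtracting the log of the standard normal cumulative probability at the outcome index divided by the outcome standard deviation. The function name `truncated_heckman_loglik` and its argument names are hypothetical, and the code is an illustration rather than PROC QLIM's implementation.

```python
import math


def norm_logpdf(t):
    """Log of the standard normal density."""
    return -0.5 * t * t - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def truncated_heckman_loglik(z, y, W, X, gamma, beta, sigma, rho):
    """Selection-model log-likelihood with the outcome truncated below at 0."""
    total = 0.0
    for zi, yi, wi, xi in zip(z, y, W, X):
        sel = dot(wi, gamma)
        if zi == 0:
            # Probability of not being selected.
            total += math.log(norm_cdf(-sel))
        else:
            xb = dot(xi, beta)
            resid = (yi - xb) / sigma
            arg = (sel + rho * resid) / math.sqrt(1.0 - rho * rho)
            total += (norm_logpdf(resid) - math.log(sigma)
                      + math.log(norm_cdf(arg))
                      # Truncation adjustment: divide by P(outcome > 0).
                      - math.log(norm_cdf(xb / sigma)))
    return total
```

Because the adjustment subtracts the log of a probability, each selected observation's contribution is at least as large as in the model without truncation.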