Heckman Selection Model Task :: SAS(R) Studio 3.4: User's Guide

About the Heckman Selection Model Task

The Heckman two-step selection method provides a means of correcting for non-randomly selected samples. It is a two-stage estimation method. The first stage performs a probit analysis on a selection equation. The second stage analyzes an outcome equation based on the first-stage binary probit model.

Note: This task is available only if you are running SAS 9.4 (or later), and your site licenses SAS/ETS 12.3 (or later).

Example: Heckman Selection Model Task

To create this example:

Create the Work.Mroz data set. For more information, see MROZ Data Set.
In the Tasks section, expand the Econometrics folder and double-click Heckman Selection Model. The user interface for the Heckman Selection Model task opens.
On the Data tab, select the WORK.MROZ data set.

Assign columns to these roles:

Role	Column Name
Selection Equation
Dependent variable	inlf
Continuous variables	nwifeinc exper expersq age kidslt6 kidsge6
Outcome Equation
Dependent variable	lwage
Continuous variables	exper expersq
Categorical variables	educ

To run the task, click .

Here is a subset of the results:

Example of Results from the Heckman Selection Model

Assigning Data to Roles

To run the Heckman Selection Model task, you must assign columns to the Dependent variable roles for the selection and outcome equations.

Role	Column Name
Selection Equation
Dependent variable	specifies a single numeric column that takes binary values. By default, the task uses samples where the dependent variable is equal to 1.
Continuous variables	specifies the independent columns (or regressors) to use in the model for the selection equation dependent variable.
Categorical variables	specifies how to group the values into levels.
Include the intercept	specifies whether to include the intercept in the selection equation.
Outcome Equation
Dependent variable	specifies a single numeric column to use.
Continuous variables	specifies the independent columns (or regressors) to use in the model for the outcome equation dependent variable.
Categorical values	specifies how to group the values into levels.
Include the intercept	specifies whether to include the intercept in the selection equation.

Setting Options

Option	Description
Methods
Variance estimation method	specifies whether to calculate the standard errors by using the corrected standard errors or the OLS standard errors.
Type of covariances of the parameter estimates	specifies the method to calculate the covariance matrix of parameter estimates. You can select the covariance from the outer product matrix, from the inverse Hessian matrix, or from the output product and Hessian matrices (the quasi-maximum likelihood estimates).
Optimization
Method	specifies the iterative minimization method to use. By default, the Quasi-Newton method is used.
Maximum number of iterations	specifies the maximum number of iterations for the selected method.
Statistics
You can specify whether the results include the statistics that the task creates by default, the default statistics and any additional statistics that you select, or no statistics. Here is the information that you can include in the results: correlation matrix of the parameter estimates covariance matrix of the parameter estimates iteration history of the objective function and parameter estimates