This example illustrates the use of the HPQLIM procedure. The data were originally published by Mroz (1987); the following statements create a subset of that data set:
```sas
title1 'Estimating a Tobit model';
data subset;
   input Hours Yrs_Ed Yrs_Exp @@;
   if Hours eq 0 then Lower=.;
   else Lower=Hours;
datalines;
0 8 9 0 8 12 0 9 10 0 10 15 0 11 4 0 11 6
1000 12 1 1960 12 29 0 13 3 2100 13 36
3686 14 11 1920 14 38 0 15 14 1728 16 3
1568 16 19 1316 17 7 0 17 15
;
```
In these data, Hours is the number of hours that the wife worked outside the household in a given year, Yrs_Ed is the years of education, and Yrs_Exp is the years of work experience.
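Before fitting a censored model, it can be useful to check how many observations sit at the censoring limit. The following statements are a minimal sketch of such a check; the data set name `check` and the indicator variable `Censored` are hypothetical names introduced only for this illustration.

```sas
/*-- Flag observations at the lower limit (Hours = 0) --*/
data check;               /* hypothetical data set name */
   set subset;
   Censored = (Hours eq 0);   /* 1 if no outside work, 0 otherwise */
run;

/*-- Tabulate censored versus uncensored observations --*/
proc freq data=check;
   tables Censored;
run;
```

For the 17 observations listed above, the frequency table should show 9 observations with `Censored=1` and 8 with `Censored=0`.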
The data show that some women committed a positive number of hours to outside work ($y_{i} > 0$ is observed), whereas others did not work outside the home at all ($y_{i} = 0$ is observed). This yields the following model:
\[ y^{*}_{i} = \mathbf{x}_{i}'\boldsymbol{\beta} + \epsilon_{i} \]

\[ y_{i} = \begin{cases} y^{*}_{i} & \text{if } y^{*}_{i} > 0 \\ 0 & \text{if } y^{*}_{i} \leq 0 \end{cases} \]
where $\epsilon_{i} \sim \mathrm{iid}\ N(0, \sigma^{2})$ and the set of explanatory variables is denoted by $\mathbf{x}_{i}$. The following statements fit a Tobit model to the hours worked with years of education and years of work experience as covariates:
```sas
/*-- Tobit Model --*/
proc hpqlim data=subset;
   model hours = yrs_ed yrs_exp;
   endogenous hours ~ censored(lb=0);
   performance nthreads=2 nodes=4 details;
run;
```
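For reference, a Tobit model with a lower bound of 0 and normally distributed errors is typically estimated by maximizing a log-likelihood of the following form. This is the standard textbook expression, given here as a sketch rather than quoted from the procedure's documentation. It assumes that $\sigma$ is the standard deviation of the error term, and $\Phi$ and $\phi$ denote the standard normal distribution and density functions.

\[ \ln L(\boldsymbol{\beta}, \sigma) = \sum_{i:\, y_{i} = 0} \ln \Phi\left( -\frac{\mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma} \right) + \sum_{i:\, y_{i} > 0} \ln \left[ \frac{1}{\sigma}\, \phi\left( \frac{y_{i} - \mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma} \right) \right] \]

The first sum collects the censored observations, which contribute only the probability that the latent variable is at or below zero. The second sum collects the uncensored observations, which contribute the usual normal density.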
The output of the HPQLIM procedure is shown in Figure 22.1.
Figure 22.1: Tobit Analysis Results
Estimating a Tobit model

| Model Fit Summary | |
|---|---|
| Number of Endogenous Variables | 1 |
| Endogenous Variable | Hours |
| Number of Observations | 17 |
| Log Likelihood | -74.93700 |
| Maximum Absolute Gradient | 1.18953E-6 |
| Number of Iterations | 23 |
| Optimization Method | Quasi-Newton |
| AIC | 157.87400 |
| Schwarz Criterion | 161.20685 |
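The two information criteria in the table can be checked by hand from the reported log likelihood. The calculation below assumes the usual definitions of AIC and the Schwarz criterion, with $k = 4$ estimated parameters (one for each row of the "Parameter Estimates" table) and $n = 17$ observations.

\[ \mathrm{AIC} = -2 \ln L + 2k = -2(-74.93700) + 2(4) = 157.87400 \]

\[ \mathrm{SBC} = -2 \ln L + k \ln n = 149.87400 + 4 \ln(17) \approx 149.87400 + 11.33285 = 161.20685 \]

Both values agree with the figures reported in the "Model Fit Summary" table.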
The “Parameter Estimates” table contains four rows. The first three rows correspond to the estimate of the vector of regression coefficients $\boldsymbol{\beta}$ (the intercept and the coefficients on Yrs_Ed and Yrs_Exp). The last row is called _Sigma, which corresponds to the estimate of $\sigma$, the standard deviation of the error term.