The HPQLIM Procedure

Getting Started: HPQLIM Procedure

This example illustrates the use of the HPQLIM procedure to fit a Tobit model. The data were originally published by Mroz (1987); the following statements create a subset of that data set:

title1 'Estimating a Tobit model';

data subset;
   input Hours Yrs_Ed Yrs_Exp @@;
   if Hours eq 0 then Lower=.;
      else               Lower=Hours;
datalines;
0 8 9 0 8 12 0 9 10 0 10 15 0 11 4 0 11 6
1000 12 1 1960 12 29 0 13 3 2100 13 36
3686 14 11 1920 14 38 0 15 14 1728 16 3
1568 16 19 1316 17 7 0 17 15
;

In these data, Hours is the number of hours that the wife worked outside the household in a given year, Yrs_Ed is the years of education, and Yrs_Exp is the years of work experience.
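Before fitting a censored model, it can help to see how many observations sit at the censoring point. The following is a minimal sketch; the data set name subset_flag and the variable Censored are hypothetical names introduced only for illustration:

```sas
/*-- Count censored (Hours=0) versus uncensored observations --*/
data subset_flag;            /* hypothetical data set name */
   set subset;
   Censored = (Hours = 0);   /* 1 if no outside work was reported, 0 otherwise */
run;

proc freq data=subset_flag;
   tables Censored;
run;
```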

In this sample, 8 of the 17 women worked a positive number of hours outside the home ( $y_{i}>0$ is observed), and the remaining 9 did not work outside the home at all ( $y_{i}=0$ is observed). This yields the following model:

$y^{*}_{i} = \mathbf{x}_{i}'\boldsymbol{\beta} + \epsilon_{i}$

$y_{i} = \left\{ \begin{array}{ll} y^{*}_{i} & \mathrm{if}\ y^{*}_{i}>0 \\ 0 & \mathrm{if}\ y^{*}_{i}\leq 0 \end{array} \right.$

where $y^{*}_{i}$ is the latent (unobserved) number of hours, $\epsilon_{i} \sim \mathrm{iid}\ N(0,\sigma^{2})$, and $\mathbf{x}_{i}$ denotes the vector of explanatory variables. The following statements fit a Tobit model to the hours worked with years of education and years of work experience as covariates:

/*-- Tobit Model --*/
proc hpqlim data=subset;
   model hours = yrs_ed yrs_exp;
   endogenous hours ~ censored(lb=0);
   performance nthreads=2 nodes=4 details;
run;
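The DATA step also creates the variable Lower, which the HPQLIM statements do not reference. One plausible use, offered here as an assumption rather than something this section states, is a cross-check with PROC LIFEREG (SAS/STAT). Its MODEL statement accepts a (lower, upper) pair and treats an observation whose lower value is missing as left-censored at the upper value. A minimal sketch:

```sas
/*-- Cross-check sketch: same Tobit model via PROC LIFEREG --*/
proc lifereg data=subset;
   /* Lower=. with Hours=0 marks left-censoring at 0;   */
   /* Lower=Hours marks an uncensored observation       */
   model (lower, hours) = yrs_ed yrs_exp / d=normal;
run;
```

If both procedures converge, the LIFEREG regression coefficients and scale estimate should be comparable to the HPQLIM coefficients and _Sigma.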

The output of the HPQLIM procedure is shown in Figure 8.1.

Figure 8.1: Tobit Analysis Results

Estimating a Tobit model

The HPQLIM Procedure

Model Fit Summary
Number of Endogenous Variables	1
Endogenous Variable	Hours
Number of Observations	17
Log Likelihood	-74.93700
Maximum Absolute Gradient	1.18953E-6
Number of Iterations	23
Optimization Method	Quasi-Newton
AIC	157.87400
Schwarz Criterion	161.20685

Parameter Estimates
Parameter	DF	Estimate	Standard Error	t Value	Approx Pr > |t|
Intercept	1	-5598.295129	27.692220	-202.16	<.0001
Yrs_Ed	1	373.123254	53.988877	6.91	<.0001
Yrs_Exp	1	63.336247	36.551299	1.73	0.0831
_Sigma	1	1582.859635	390.076480	4.06	<.0001

The “Parameter Estimates” table contains four rows. The first three rows correspond to the estimate of the vector of regression coefficients $\boldsymbol{\beta}$. The last row, _Sigma, corresponds to the estimate of the error standard deviation $\sigma$ (the square root of the error variance $\sigma^{2}$).
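
The fit statistics in the “Model Fit Summary” table can be tied back to the model. As a sketch, assuming the standard Tobit likelihood with left-censoring at zero (this section does not print it), and with $\Phi$ and $\phi$ denoting the standard normal distribution and density functions, the log likelihood being maximized is

$\ln L = \sum_{i:\, y_{i}=0} \ln \Phi\left(-\frac{\mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma}\right) + \sum_{i:\, y_{i}>0} \ln \left[\frac{1}{\sigma}\,\phi\left(\frac{y_{i}-\mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma}\right)\right]$

With $k=4$ estimated parameters, $n=17$ observations, and the reported $\ln L = -74.937$, the usual definitions of the information criteria reproduce the values in the table:

$\mathrm{AIC} = -2\ln L + 2k = 149.874 + 8 = 157.874$

$\mathrm{SBC} = -2\ln L + k\ln n = 149.874 + 4\ln 17 \approx 161.207$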