The HPQLIM Procedure

Getting Started: HPQLIM Procedure

This example illustrates the use of the HPQLIM procedure by fitting a Tobit model. The data were originally published by Mroz (1987); the following statements create a subset of that data set:

title1 'Estimating a Tobit model';

data subset;
   input Hours Yrs_Ed Yrs_Exp @@;
   if Hours eq 0 then Lower=.;
      else               Lower=Hours;
datalines;
0 8 9 0 8 12 0 9 10 0 10 15 0 11 4 0 11 6
1000 12 1 1960 12 29 0 13 3 2100 13 36
3686 14 11 1920 14 38 0 15 14 1728 16 3
1568 16 19 1316 17 7 0 17 15
;

In these data, Hours is the number of hours that the wife worked outside the household in a given year, Yrs_Ed is the years of education, and Yrs_Exp is the years of work experience.

The data contain two kinds of observations: women who worked a positive number of hours outside the home ($y_{i}>0$ is observed) and women who did not work outside the home at all ($y_{i}=0$ is observed). Because observed hours are censored at zero, this yields the following model:

\[  y^{*}_{i} = \mathbf{x}_{i}'\boldsymbol{\beta} + \epsilon_{i}  \]
\[  y_{i} = \left\{  \begin{array}{ll} y^{*}_{i} &  \mathrm{if}\ y^{*}_{i}>0 \\ 0 &  \mathrm{if}\ y^{*}_{i}\leq 0 \end{array} \right.  \]

where $\epsilon_{i} \sim \mathrm{iid}\ N(0,\sigma^{2})$ and $\mathbf{x}_{i}$ denotes the vector of explanatory variables. The following statements fit a Tobit model of hours worked, with years of education and years of work experience as covariates:

/*-- Tobit Model --*/
proc hpqlim data=subset;
   model hours = yrs_ed yrs_exp;
   endogenous hours ~ censored(lb=0);
   performance nthreads=2 nodes=4 details;
run;
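
For reference, the log-likelihood of a Tobit model that is censored from below at zero can be written in its standard textbook form. The symbols $\Phi$ and $\phi$ (the standard normal cumulative distribution function and density function) are introduced here for illustration and do not appear elsewhere in this example. This is a reading aid under the model stated above, not a description of how the procedure is implemented internally:

\[  \ln L = \sum_{i:\, y_{i}=0} \ln \left[ 1 - \Phi \left( \frac{\mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma} \right) \right] + \sum_{i:\, y_{i}>0} \ln \left[ \frac{1}{\sigma}\, \phi \left( \frac{y_{i} - \mathbf{x}_{i}'\boldsymbol{\beta}}{\sigma} \right) \right]  \]

The first sum collects the censored observations (zero hours), and the second sum collects the observations with a positive number of hours.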

The output of the HPQLIM procedure is shown in Figure 4.1.

Figure 4.1: Tobit Analysis Results

Estimating a Tobit model

The HPQLIM Procedure

Model Fit Summary
Number of Endogenous Variables   1
Endogenous Variable              Hours
Number of Observations           17
Log Likelihood                   -74.93700
Maximum Absolute Gradient        1.18953E-6
Number of Iterations             23
Optimization Method              Quasi-Newton
AIC                              157.87400
Schwarz Criterion                161.20685

Parameter Estimates
Parameter   DF       Estimate   Standard Error   t Value   Approx Pr > |t|
Intercept    1   -5598.295130        27.692220   -202.16   <.0001
Yrs_Ed       1     373.123254        53.988877      6.91   <.0001
Yrs_Exp      1      63.336247        36.551299      1.73   0.0831
_Sigma       1    1582.859635       390.076480      4.06   <.0001


The “Parameter Estimates” table contains four rows. The first three rows contain the estimates of the regression coefficients $\boldsymbol{\beta}$. The last row, _Sigma, contains the estimate of the error standard deviation $\sigma$ (the error variance is $\sigma^{2}$).
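
As a rough cross-check of the reported fit statistics, the following Python sketch evaluates the standard Tobit log-likelihood (censoring from below at zero) at the point estimates from the table. It treats _Sigma as the standard deviation of the error term. It is not part of the HPQLIM procedure: the function names are hypothetical, and the sketch assumes the usual definitions AIC = -2 ln L + 2k and Schwarz criterion = -2 ln L + k ln(n), with k = 4 estimated parameters and n = 17 observations.

```python
import math

# (Hours, Yrs_Ed, Yrs_Exp) rows copied from the DATA step in this example.
DATA = [
    (0, 8, 9), (0, 8, 12), (0, 9, 10), (0, 10, 15), (0, 11, 4), (0, 11, 6),
    (1000, 12, 1), (1960, 12, 29), (0, 13, 3), (2100, 13, 36),
    (3686, 14, 11), (1920, 14, 38), (0, 15, 14), (1728, 16, 3),
    (1568, 16, 19), (1316, 17, 7), (0, 17, 15),
]

# Point estimates copied from the "Parameter Estimates" table.
INTERCEPT, B_ED, B_EXP, SIGMA = -5598.295130, 373.123254, 63.336247, 1582.859635


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_logcdf(z):
    """Log of the standard normal CDF at z, computed through erfc."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def tobit_loglik(rows, b0, b_ed, b_exp, sigma):
    """Tobit log-likelihood with a lower censoring bound of zero (hypothetical helper)."""
    total = 0.0
    for hours, yrs_ed, yrs_exp in rows:
        xb = b0 + b_ed * yrs_ed + b_exp * yrs_exp
        if hours > 0:
            # Uncensored observation: normal density of the residual.
            total += norm_logpdf((hours - xb) / sigma) - math.log(sigma)
        else:
            # Censored observation: probability that the latent value is <= 0.
            total += norm_logcdf(-xb / sigma)
    return total


n = len(DATA)   # number of observations
k = 4           # Intercept, Yrs_Ed, Yrs_Exp, _Sigma

loglik = tobit_loglik(DATA, INTERCEPT, B_ED, B_EXP, SIGMA)
aic = -2.0 * loglik + 2.0 * k
sbc = -2.0 * loglik + k * math.log(n)

print(f"log likelihood    = {loglik:.5f}")
print(f"AIC               = {aic:.5f}")
print(f"Schwarz criterion = {sbc:.5f}")
```

If the assumptions above hold, the three printed values should land close to the Log Likelihood, AIC, and Schwarz Criterion rows of the “Model Fit Summary” table; small differences in the last digits can arise because the estimates are rounded in the printed output.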