Example 20.6 The Cigarette Sales Data: Dynamic Panel Estimation with GMM

In this example, a dynamic panel demand model for cigarette sales is estimated. It illustrates the application of the method described in the section Dynamic Panel Estimator. The data are a panel of 46 American states over the period 1963–92. See Baltagi and Levin (1992) and Baltagi (1995) for a description of the data. All variables enter the model in natural logarithms. The following statements create the CIGAR data set.

data cigar;
    input state year price pop pop_16 cpi ndi sales pimin;
    label
    state  = 'State abbreviation'
    year   = 'YEAR'
    price  = 'Price per pack of cigarettes'
    pop    = 'Population'
    pop_16 = 'Population above the age of 16'
    cpi    = 'Consumer price index (1983=100)'
    ndi    = 'Per capita disposable income'
    sales  = 'Cigarette sales in packs per capita'
    pimin  = 'Minimum price in adjoining states per pack of cigarettes';
    datalines;
1 63 28.6 3383 2236.5 30.6 1558.3045298 93.9 26.1
1 64 29.8 3431 2276.7 31.0 1684.0732025 95.4 27.5
1 65 29.8 3486 2327.5 31.5 1809.8418752 98.5 28.9
1 66 31.5 3524 2369.7 32.4 1915.1603572 96.4 29.5
1 67 31.6 3533 2393.7 33.4 2023.5463678 95.5 29.6
1 68 35.6 3522 2405.2 34.8 2202.4855362 88.4 32
1 69 36.6 3531 2411.9 36.7 2377.3346665 90.1 32.8
1 70 39.6 3444 2394.6 38.8 2591.0391591 89.8 34.3
1 71 42.7 3481 2443.5 40.5 2785.3159706 95.4 35.8

   ... more lines ...   

The following statements sort the data by the STATE and YEAR variables.

 proc sort data=cigar;
    by state year;
 run;

Next, logarithms of the variables required for regression estimation are calculated, as shown in the following statements:

 data cigar;
    set cigar;
    lsales = log(sales);
    lprice = log(price);
    lndi = log(ndi);
    lpimin = log(pimin);
    label lprice = 'Log price per pack of cigarettes';
    label lndi = 'Log per capita disposable income';
    label lsales = 'Log cigarette sales in packs per capita';
    label lpimin = 'Log minimum price in adjoining states per pack of cigarettes';
 run;

The following statements create the CIGAR_LAG data set, which adds the one-period lag of LSALES (the variable LSALES_1) within each cross section.

 proc panel data=cigar;
     id state year;
     clag lsales(1) / out=cigar_lag;
 run;
 data cigar_lag;
     set cigar_lag;
     label lsales_1 = 'Lagged log cigarette sales in packs per capita';
 run;
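
For readers unfamiliar with the CLAG statement, the idea of a lag taken within each state can be sketched with an ordinary DATA step. This is an illustrative sketch under stated assumptions, not a replacement for the statements above: the data set name CIGAR_LAG_CHECK is hypothetical, the CIGAR data set is assumed to be sorted by STATE and YEAR with consecutive years, and the sketch leaves the first observation of each state missing, which may differ from how CLAG treats that observation.

 /* Hypothetical check data set; assumes CIGAR is sorted by STATE and YEAR */
 data cigar_lag_check;
     set cigar;
     by state;
     /* LAG must run on every observation to keep its queue in step */
     lsales_1 = lag(lsales);
     /* Do not carry the previous state's last value into a new state */
     if first.state then lsales_1 = .;
 run;

Comparing LSALES_1 from this data set with LSALES_1 in CIGAR_LAG is one way to see how the two approaches treat the first year of each state.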

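Finally, the model is estimated by the two-step GMM method (TWOSTEP). The INST DEPVAR statement requests lags of the dependent variable as instruments, and MAXBAND=5 limits them to five lags. The NOLEVELS option is specified to exclude the level equations, so that only the first-differenced equations are used, as shown in the following statements: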

 proc panel data=cigar_lag;
     inst depvar;
     model lsales = lsales_1 lprice lndi lpimin
         / gmm nolevels twostep maxband=5 noint;
     id state year;
 run;
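
Written out, the MODEL statement above corresponds to a dynamic panel equation of roughly the following form. The symbols used here (\phi, \beta_k, \nu_i, \epsilon_{it}) are notation assumed for illustration and may differ from the notation in the section Dynamic Panel Estimator.

\[
\mathrm{lsales}_{it} = \phi\,\mathrm{lsales}_{i,t-1} + \beta_1\,\mathrm{lprice}_{it} + \beta_2\,\mathrm{lndi}_{it} + \beta_3\,\mathrm{lpimin}_{it} + \nu_i + \epsilon_{it}
\]

Taking first differences removes the state effect \nu_i:

\[
\Delta\mathrm{lsales}_{it} = \phi\,\Delta\mathrm{lsales}_{i,t-1} + \beta_1\,\Delta\mathrm{lprice}_{it} + \beta_2\,\Delta\mathrm{lndi}_{it} + \beta_3\,\Delta\mathrm{lpimin}_{it} + \Delta\epsilon_{it}
\]

Because \Delta\mathrm{lsales}_{i,t-1} is correlated with \Delta\epsilon_{it}, lagged levels of lsales dated t-2 and earlier serve as instruments (assuming serially uncorrelated \epsilon_{it}), and MAXBAND=5 limits how many of them enter the moment conditions.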

Output 20.6.1 Estimation with GMM

The PANEL Procedure
GMM: First Differences Transformation
 
Dependent Variable: lsales Log cigarette sales in packs per capita

Model Description
Estimation Method                           GMMTWO
Number of Cross Sections                        46
Time Series Length                              30
Estimate Stage                                   2
Maximum Number of Time Periods (MAXBAND)         5

Fit Statistics
SSE         2187.5988    DFE            1284
MSE            1.7037    Root MSE     1.3053

Parameter Estimates
Variable    DF    Estimate    Standard Error    t Value    Pr > |t|
lsales_1     1    0.572219           0.00665      86.03      <.0001
lprice       1    -0.23464            0.0208     -11.29      <.0001
lndi         1    0.232673           0.00266      87.54      <.0001
lpimin       1    -0.08299            0.0223      -3.73      0.0002
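
Because the model is specified in logarithms, the coefficients can be read as short-run elasticities. One common way to summarize a model with a lagged dependent variable is the implied long-run elasticity, obtained by dividing a coefficient by one minus the coefficient on lsales_1. This partial-adjustment reading is an assumption added here for illustration; the output itself does not report it. For price, using the estimates in Output 20.6.1:

\[
\frac{\hat{\beta}_{\mathrm{lprice}}}{1 - \hat{\phi}} = \frac{-0.23464}{1 - 0.572219} = \frac{-0.23464}{0.427781} \approx -0.55
\]

Under that reading, the long-run response of sales to price is roughly twice the short-run response of -0.23.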

If theory suggests that other valid instruments are available, the PREDETERMINED=, EXOGENOUS=, and CORRELATED= options of the INSTRUMENTS statement can also be used.
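
As a rough sketch of how such options might look, the following statements extend the INSTRUMENTS statement of the model above. The assignment of LNDI to EXOGENOUS= and LPRICE to PREDETERMINED= is purely hypothetical and is chosen only to show the syntax; it is not a classification taken from the source, and the option spellings should be checked against the INSTRUMENTS statement documentation for your SAS/ETS release.

 proc panel data=cigar_lag;
     /* Hypothetical classification of regressors, for illustration only */
     inst depvar exogenous=(lndi) predetermined=(lprice);
     model lsales = lsales_1 lprice lndi lpimin
         / gmm nolevels twostep maxband=5 noint;
     id state year;
 run;

Whether a regressor is treated as exogenous, predetermined, or correlated with the individual effects changes the set of moment conditions, so the choice should follow from the economics of the problem rather than from this sketch.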