Example 87.2 The Medical Expenditure Panel Survey (MEPS)

The U.S. Department of Health and Human Services conducts the Medical Expenditure Panel Survey (MEPS) to produce national and regional estimates of various aspects of health care. The MEPS has a complex sample design that includes both stratification and clustering. The sampling weights are adjusted for nonresponse and raked with respect to population control totals from the Current Population Survey. See the MEPS Survey Background (2006) and Machlin, Yu, and Zodet (2005) for details.

In this example, the 1999 full-year consolidated data file HC-038 (MEPS HC-038, 2002) from the MEPS is used to investigate the relationship between medical insurance coverage and the demographic variables. The data can be downloaded directly from the Agency for Healthcare Research and Quality (AHRQ) Web site at http://www.meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-038 in either ASCII format or SAS transport format. The Web site includes a detailed description of the data as well as the SAS program used to access and format it.

For this example, the SAS transport format data file for HC-038 is downloaded to 'C:H38.ssp' on a Windows-based PC. The instructions on the Web site lead to the following SAS statements for creating a SAS data set MEPS, which contains only the sample design variables and other variables necessary for this analysis.

proc format; 
   value racex  
      -9 = 'NOT ASCERTAINED'
      -8 = 'DK'
      -7 = 'REFUSED'
      -1 = 'INAPPLICABLE'
      1 = 'AMERICAN INDIAN'
      2 = 'ALEUT, ESKIMO'
      3 = 'ASIAN OR PACIFIC ISLANDER'
      4 = 'BLACK'
      5 = 'WHITE'
      91 = 'OTHER'
      ; 
   value sex  
      -9 = 'NOT ASCERTAINED'
      -8 = 'DK'
      -7 = 'REFUSED'
      -1 = 'INAPPLICABLE'
      1 = 'MALE'
      2 = 'FEMALE'
      ;       
   value povcat9h  
      1 = 'NEGATIVE OR POOR' 
      2 = 'NEAR POOR'
      3 = 'LOW INCOME'
      4 = 'MIDDLE INCOME'
      5 = 'HIGH INCOME'
      ;
   value inscov9f  
      1 = 'ANY PRIVATE'
      2 = 'PUBLIC ONLY'
      3 = 'UNINSURED'
      ;
run;
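libname mylib '';
filename in1 'H38.SSP';
/* Import the SAS transport file into the MYLIB library */
proc xcopy in=in1 out=mylib import;
run;
/* Build the analysis data set: blank the variable labels, attach the value
   formats defined above, and keep only the design and analysis variables */
data meps;
   set mylib.H38;
   label racex= sex= inscov99= povcat99=
      varstr99= varpsu99= perwt99f= totexp99=;
   format racex racex. sex sex.
      povcat99 povcat9h. inscov99 inscov9f.;
   keep inscov99 sex racex povcat99 varstr99
      varpsu99 perwt99f totexp99;
run;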

There are a total of 24,618 observations in this SAS data set. Each observation corresponds to a person in the survey. The stratification variable is VARSTR99, which identifies the 143 strata in the sample. The variable VARPSU99 identifies the 460 PSUs in the sample. The sampling weights are stored in the variable PERWT99F. The response variable is the health insurance coverage indicator variable, INSCOV99, which has three values:

The person had any private insurance coverage any time during 1999

The person had only public insurance coverage during 1999

The person was uninsured during all of 1999

The demographic variables include gender (SEX), race (RACEX), and family income level as a percent of the poverty line (POVCAT99). The variable RACEX has five categories:

American Indian

Aleut, Eskimo

Asian or Pacific Islander

Black

White

The variable POVCAT99 is constructed by dividing family income by the applicable poverty line (based on family size and composition), with the resulting percentages grouped into five categories:

Negative or poor (less than 100%)

Near poor (100% to less than 125%)

Low income (125% to less than 200%)

Middle income (200% to less than 400%)

High income (greater than or equal to 400%)
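The grouping rule above can be sketched as a short DATA step. This is only an illustration: POVCAT99 already exists in the HC-038 file, and the input variable PCT_POVERTY, the data set POVCAT_SKETCH, and the test values are hypothetical names and numbers introduced here. The sketch reuses the POVCAT9H. format defined earlier in this example.

```sas
/* Hypothetical illustration of the POVCAT99 cut points.                 */
/* PCT_POVERTY (family income as a percent of the poverty line) is an    */
/* assumed variable; it is not part of the HC-038 file.                  */
data povcat_sketch;
   input pct_poverty @@;
   if      pct_poverty < 100 then povcat = 1;   /* negative or poor */
   else if pct_poverty < 125 then povcat = 2;   /* near poor        */
   else if pct_poverty < 200 then povcat = 3;   /* low income       */
   else if pct_poverty < 400 then povcat = 4;   /* middle income    */
   else                           povcat = 5;   /* high income      */
   /* Note: a missing PCT_POVERTY compares lower than any number in SAS, */
   /* so it would fall into category 1; this sketch does not handle it.  */
   format povcat povcat9h.;
   datalines;
-20 99.9 100 124.9 125 250 400
;
run;

proc print data=povcat_sketch;
run;
```

The boundary values 100, 125, and 400 are included in the test data to show that each lower bound belongs to the higher category, as the category definitions state.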

The data set also contains the total health care expenditure in 1999, TOTEXP99, which is used as a covariate in the analysis.

Output 87.2.1 displays the first 30 observations of this data set.

Output 87.2.1 1999 Full-Year MEPS (First 30 Observations)
Obs  SEX     RACEX  POVCAT99          INSCOV99     TOTEXP99  PERWT99F  VARSTR99  VARPSU99
  1  MALE    WHITE  MIDDLE INCOME     PUBLIC ONLY      2735  14137.86       131         2
  2  FEMALE  WHITE  MIDDLE INCOME     ANY PRIVATE      6687  17050.99       131         2
  3  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE        60  35737.55       131         2
  4  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE        60  35862.67       131         2
  5  FEMALE  WHITE  MIDDLE INCOME     ANY PRIVATE       786  19407.11       131         2
  6  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE       345  18499.83       131         2
  7  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE       680  18499.83       131         2
  8  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE      3226  22394.53       136         1
  9  FEMALE  WHITE  MIDDLE INCOME     ANY PRIVATE      2852  27008.96       136         1
 10  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE       112  25108.71       136         1
 11  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE      3179  17569.81       136         1
 12  MALE    WHITE  MIDDLE INCOME     ANY PRIVATE       168  21478.06       136         1
 13  FEMALE  WHITE  MIDDLE INCOME     ANY PRIVATE      1066  21415.68       136         1
 14  MALE    WHITE  NEGATIVE OR POOR  PUBLIC ONLY         0  12254.66       125         1
 15  MALE    WHITE  NEGATIVE OR POOR  ANY PRIVATE         0  17699.75       125         1
 16  FEMALE  WHITE  NEGATIVE OR POOR  UNINSURED           0  18083.15       125         1
 17  MALE    BLACK  NEGATIVE OR POOR  PUBLIC ONLY       230   6537.97        78        10
 18  MALE    WHITE  LOW INCOME        UNINSURED         408   8951.36        95         2
 19  FEMALE  WHITE  LOW INCOME        UNINSURED           0  11833.00        95         2
 20  MALE    WHITE  LOW INCOME        UNINSURED          40  12754.07        95         2
 21  FEMALE  WHITE  LOW INCOME        UNINSURED          51  14698.57        95         2
 22  MALE    WHITE  LOW INCOME        UNINSURED           0   3890.20        92        19
 23  FEMALE  WHITE  LOW INCOME        UNINSURED         610   5882.29        92        19
 24  MALE    WHITE  LOW INCOME        PUBLIC ONLY        24   8610.47        92        19
 25  FEMALE  BLACK  MIDDLE INCOME     UNINSURED        1758      0.00        64         1
 26  MALE    BLACK  MIDDLE INCOME     PUBLIC ONLY       551   7049.70        64         1
 27  MALE    BLACK  MIDDLE INCOME     ANY PRIVATE        65  34067.03        64         1
 28  FEMALE  BLACK  NEGATIVE OR POOR  PUBLIC ONLY         0   9313.84        73        12
 29  FEMALE  BLACK  NEGATIVE OR POOR  PUBLIC ONLY        10  14697.03        73        12
 30  MALE    BLACK  NEGATIVE OR POOR  PUBLIC ONLY         0   4574.73        73        12

The following SAS statements fit a generalized logit model for the 1999 full-year consolidated MEPS data:

proc surveylogistic data=meps; 
   stratum VARSTR99;
   cluster VARPSU99;
   weight PERWT99F;
   class SEX RACEX POVCAT99;
   model INSCOV99 = TOTEXP99 SEX RACEX POVCAT99 / link=glogit;
run;    

The STRATUM statement specifies the stratification variable VARSTR99. The CLUSTER statement specifies the PSU variable VARPSU99. The WEIGHT statement specifies the sample weight variable PERWT99F. The demographic variables SEX, RACEX, and POVCAT99 are listed in the CLASS statement to indicate that they are categorical independent variables in the MODEL statement. In the MODEL statement, the response variable is INSCOV99, and the independent variables are TOTEXP99 along with the selected demographic variables. The LINK= option requests that the procedure fit the generalized logit model because the response variable INSCOV99 has nominal responses.

The results of this analysis are shown in the following outputs.

PROC SURVEYLOGISTIC lists the model fitting information and sample design information in Output 87.2.2.

Output 87.2.2 MEPS, Model Information
The SURVEYLOGISTIC Procedure

Model Information
Data Set WORK.MEPS
Response Variable INSCOV99
Number of Response Levels 3
Stratum Variable VARSTR99
Number of Strata 143
Cluster Variable VARPSU99
Number of Clusters 460
Weight Variable PERWT99F
Model Generalized Logit
Optimization Technique Newton-Raphson
Variance Adjustment Degrees of Freedom (DF)

Output 87.2.3 displays the number of observations and the total of sampling weights both in the data set and used in the analysis. Only the observations with positive person-level weight are used in the analysis. Therefore, 1,053 observations with zero person-level weights were deleted.

Output 87.2.3 MEPS, Number of Observations
Number of Observations Read 24618
Number of Observations Used 23565
Sum of Weights Read 2.7641E8
Sum of Weights Used 2.7641E8
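The difference between the observations read and used is 24,618 - 23,565 = 1,053, which matches the number of deleted observations reported above. A minimal sketch of how that count might be checked directly against the MEPS data set created earlier is shown below; it assumes the dropped observations are exactly those whose PERWT99F is not positive.

```sas
/* Count observations whose person-level weight is not positive.        */
/* In a WHERE expression a missing value compares lower than any        */
/* number, so missing weights (if any) are counted here as well.        */
proc means data=meps n sum;
   where perwt99f <= 0;
   var perwt99f;
run;
```

If the assumption holds, the N reported by PROC MEANS should agree with the 1,053 observations excluded from the analysis, and the sum should be zero, which is consistent with the sum of weights read and used being equal in Output 87.2.3.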

Output 87.2.4 lists the three insurance coverage levels for the response variable INSCOV99. The "UNINSURED" category is used as the reference category in the model.

Output 87.2.4 MEPS, Response Profile
Response Profile
Ordered Value  INSCOV99     Total Frequency  Total Weight
            1  ANY PRIVATE            16130     204403997
            2  PUBLIC ONLY             4241      41809572
            3  UNINSURED               3194      30197198

Logits modeled use INSCOV99='UNINSURED' as the reference category.
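With "UNINSURED" (ordered value 3) as the reference category, the fitted generalized logit model can be written as follows. The symbols are notation assumed here for illustration, not taken from the output: the probability that person i falls in response category j is written as pi_ij, x_i is the vector of explanatory variables, and each nonreference category j has its own intercept alpha_j and slope vector beta_j.

```latex
% Generalized logit model with UNINSURED (j = 3) as the reference category.
% \pi_{ij}, \alpha_j, \boldsymbol{\beta}_j, \mathbf{x}_i are assumed notation.
\log\frac{\pi_{ij}}{\pi_{i3}}
   = \alpha_j + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j ,
   \qquad j = 1, 2

% Implied category probabilities
\pi_{ij}
   = \frac{\exp\left(\alpha_j + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j\right)}
          {1 + \sum_{k=1}^{2}\exp\left(\alpha_k + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)},
   \quad j = 1, 2,
\qquad
\pi_{i3}
   = \frac{1}{1 + \sum_{k=1}^{2}\exp\left(\alpha_k + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)}
```

This is why every effect in the parameter estimates appears twice, once for the "ANY PRIVATE" logit (j = 1) and once for the "PUBLIC ONLY" logit (j = 2).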


Output 87.2.5 shows the parameterization in the regression model for each categorical independent variable.

Output 87.2.5 MEPS, Classification Levels
Class Level Information
Class Value Design Variables
SEX FEMALE 1      
  MALE -1      
RACEX ALEUT, ESKIMO 1 0 0 0
  AMERICAN INDIAN 0 1 0 0
  ASIAN OR PACIFIC ISLANDER 0 0 1 0
  BLACK 0 0 0 1
  WHITE -1 -1 -1 -1
POVCAT99 HIGH INCOME 1 0 0 0
  LOW INCOME 0 1 0 0
  MIDDLE INCOME 0 0 1 0
  NEAR POOR 0 0 0 1
  NEGATIVE OR POOR -1 -1 -1 -1
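The rows of -1 values indicate effect coding: the last level of each classification variable receives -1 in every design variable. A short sketch of what this implies for POVCAT99 within one logit j follows; the beta symbols are notation assumed here for illustration.

```latex
% Effect coding for POVCAT99 within one logit j
% (j = ANY PRIVATE or PUBLIC ONLY); the \beta symbols are assumed notation.
\begin{aligned}
\text{effect of level } k
   &= \beta_{j,k},
   \qquad k \in \{\text{HIGH}, \text{LOW}, \text{MIDDLE}, \text{NEAR POOR}\}, \\
\text{effect of NEGATIVE OR POOR}
   &= -\left(\beta_{j,\text{HIGH}} + \beta_{j,\text{LOW}}
            + \beta_{j,\text{MIDDLE}} + \beta_{j,\text{NEAR POOR}}\right)
\end{aligned}
```

Under this coding each estimated parameter measures a level's deviation from the average over all levels of the variable, not a direct contrast with the last level. No separate parameter is reported for the last level (WHITE, NEGATIVE OR POOR, or MALE) because its effect is determined by the others.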

Output 87.2.6 displays the parameter estimates and their standard errors.

Output 87.2.7 displays the odds ratio estimates and their 95% Wald confidence limits.

For example, after adjusting for the effects of sex, race, and total health care expenditures, the estimated odds that a person with high income has private health insurance rather than no insurance are 11.595 times the corresponding odds for a poor person. In contrast, the estimated odds that a person with high income has only public health insurance rather than no insurance are 0.274 times the corresponding odds for a poor person.

Output 87.2.6 MEPS, Parameter Estimates
Analysis of Maximum Likelihood Estimates
Parameter   INSCOV99 DF Estimate Standard Error Wald Chi-Square Pr > ChiSq
Intercept   ANY PRIVATE 1 2.7703 0.1906 211.3648 <.0001
Intercept   PUBLIC ONLY 1 1.9216 0.1561 151.4590 <.0001
TOTEXP99   ANY PRIVATE 1 0.000215 0.000071 9.1895 0.0024
TOTEXP99   PUBLIC ONLY 1 0.000241 0.000072 11.1509 0.0008
SEX FEMALE ANY PRIVATE 1 0.1208 0.0248 23.7173 <.0001
SEX FEMALE PUBLIC ONLY 1 0.1741 0.0308 31.9571 <.0001
RACEX ALEUT, ESKIMO ANY PRIVATE 1 7.1457 0.6976 104.9258 <.0001
RACEX ALEUT, ESKIMO PUBLIC ONLY 1 7.6303 0.5022 230.8760 <.0001
RACEX AMERICAN INDIAN ANY PRIVATE 1 -2.0904 0.2615 63.8878 <.0001
RACEX AMERICAN INDIAN PUBLIC ONLY 1 -1.8992 0.2909 42.6095 <.0001
RACEX ASIAN OR PACIFIC ISLANDER ANY PRIVATE 1 -1.8055 0.2299 61.6848 <.0001
RACEX ASIAN OR PACIFIC ISLANDER PUBLIC ONLY 1 -1.9914 0.2285 75.9479 <.0001
RACEX BLACK ANY PRIVATE 1 -1.7517 0.1983 78.0146 <.0001
RACEX BLACK PUBLIC ONLY 1 -1.7038 0.1691 101.4970 <.0001
POVCAT99 HIGH INCOME ANY PRIVATE 1 1.4560 0.0685 452.1829 <.0001
POVCAT99 HIGH INCOME PUBLIC ONLY 1 -0.6092 0.0903 45.5392 <.0001
POVCAT99 LOW INCOME ANY PRIVATE 1 -0.3066 0.0666 21.1762 <.0001
POVCAT99 LOW INCOME PUBLIC ONLY 1 -0.0239 0.0754 0.1007 0.7510
POVCAT99 MIDDLE INCOME ANY PRIVATE 1 0.6467 0.0587 121.1736 <.0001
POVCAT99 MIDDLE INCOME PUBLIC ONLY 1 -0.3496 0.0807 18.7732 <.0001
POVCAT99 NEAR POOR ANY PRIVATE 1 -0.8015 0.1076 55.4443 <.0001
POVCAT99 NEAR POOR PUBLIC ONLY 1 0.2985 0.0952 9.8308 0.0017

Output 87.2.7 MEPS, Odds Ratios
Odds Ratio Estimates
Effect INSCOV99 Point Estimate 95% Wald Confidence Limits
TOTEXP99 ANY PRIVATE 1.000 1.000 1.000
TOTEXP99 PUBLIC ONLY 1.000 1.000 1.000
SEX FEMALE vs MALE ANY PRIVATE 1.273 1.155 1.403
SEX FEMALE vs MALE PUBLIC ONLY 1.417 1.255 1.598
RACEX ALEUT, ESKIMO vs WHITE ANY PRIVATE >999.999 >999.999 >999.999
RACEX ALEUT, ESKIMO vs WHITE PUBLIC ONLY >999.999 >999.999 >999.999
RACEX AMERICAN INDIAN vs WHITE ANY PRIVATE 0.553 0.339 0.901
RACEX AMERICAN INDIAN vs WHITE PUBLIC ONLY 1.146 0.603 2.178
RACEX ASIAN OR PACIFIC ISLANDER vs WHITE ANY PRIVATE 0.735 0.499 1.083
RACEX ASIAN OR PACIFIC ISLANDER vs WHITE PUBLIC ONLY 1.045 0.656 1.665
RACEX BLACK vs WHITE ANY PRIVATE 0.776 0.638 0.944
RACEX BLACK vs WHITE PUBLIC ONLY 1.394 1.132 1.717
POVCAT99 HIGH INCOME vs NEGATIVE OR POOR ANY PRIVATE 11.595 9.301 14.455
POVCAT99 HIGH INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.274 0.213 0.353
POVCAT99 LOW INCOME vs NEGATIVE OR POOR ANY PRIVATE 1.990 1.607 2.464
POVCAT99 LOW INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.492 0.395 0.614
POVCAT99 MIDDLE INCOME vs NEGATIVE OR POOR ANY PRIVATE 5.162 4.200 6.343
POVCAT99 MIDDLE INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.356 0.280 0.451
POVCAT99 NEAR POOR vs NEGATIVE OR POOR ANY PRIVATE 1.213 0.903 1.630
POVCAT99 NEAR POOR vs NEGATIVE OR POOR PUBLIC ONLY 0.680 0.527 0.877
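The odds ratios in Output 87.2.7 can be recovered from the parameter estimates in Output 87.2.6. Because the classification variables use effect coding, the effect of the last level equals minus the sum of the displayed effects, and the log odds ratio of a level versus that last level is the difference of the two effects. The DATA step below is a minimal sketch of this arithmetic for three of the table entries. The variable names are hypothetical, and the inputs are the rounded estimates as displayed, so agreement is expected only up to rounding.

```sas
data _null_;
   /* POVCAT99 estimates for the ANY PRIVATE logit (effect coding) */
   b_high = 1.4560; b_low = -0.3066; b_mid = 0.6467; b_near = -0.8015;
   b_poor = -(b_high + b_low + b_mid + b_near);  /* implied NEGATIVE OR POOR effect */
   or_high_private = exp(b_high - b_poor);

   /* POVCAT99 estimates for the PUBLIC ONLY logit */
   b_high = -0.6092; b_low = -0.0239; b_mid = -0.3496; b_near = 0.2985;
   b_poor = -(b_high + b_low + b_mid + b_near);
   or_high_public = exp(b_high - b_poor);

   /* SEX has two levels, so the MALE effect is minus the FEMALE effect */
   /* and the FEMALE vs MALE log odds ratio is twice the FEMALE effect  */
   or_female_private = exp(2 * 0.1208);

   put or_high_private= 8.3 / or_high_public= 8.3 / or_female_private= 8.3;
run;
```

The three values written to the log should agree with the tabulated odds ratios 11.595, 0.274, and 1.273 to three decimal places.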