The SURVEYLOGISTIC Procedure

Example 111.2 The Medical Expenditure Panel Survey (MEPS)

The U.S. Department of Health and Human Services conducts the Medical Expenditure Panel Survey (MEPS) to produce national and regional estimates of various aspects of health care. The MEPS has a complex sample design that includes both stratification and clustering. The sampling weights are adjusted for nonresponse and raked with respect to population control totals from the Current Population Survey. See the MEPS Survey Background (2006) and Machlin, Yu, and Zodet (2005) for details.

In this example, the 1999 full-year consolidated data file HC-038 (MEPS HC-038, 2002) from the MEPS is used to investigate the relationship between medical insurance coverage and the demographic variables. The data can be downloaded directly from the Agency for Healthcare Research and Quality (AHRQ) Web site at http://www.meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-038 in either ASCII format or SAS transport format. The Web site includes a detailed description of the data as well as the SAS program used to access and format it.

For this example, the SAS transport format data file for HC-038 is downloaded to 'C:H38.ssp' on a Windows-based PC. The instructions on the Web site lead to the following SAS statements for creating a SAS data set MEPS, which contains only the sample design variables and other variables necessary for this analysis.

proc format;
   value racex
      -9 = 'NOT ASCERTAINED'
      -8 = 'DK'
      -7 = 'REFUSED'
      -1 = 'INAPPLICABLE'
      1 = 'AMERICAN INDIAN'
      2 = 'ALEUT, ESKIMO'
      3 = 'ASIAN OR PACIFIC ISLANDER'
      4 = 'BLACK'
      5 = 'WHITE'
      91 = 'OTHER'
      ;
   value sex
      -9 = 'NOT ASCERTAINED'
      -8 = 'DK'
      -7 = 'REFUSED'
      -1 = 'INAPPLICABLE'
      1 = 'MALE'
      2 = 'FEMALE'
      ;
   value povcat9h
      1 = 'NEGATIVE OR POOR'
      2 = 'NEAR POOR'
      3 = 'LOW INCOME'
      4 = 'MIDDLE INCOME'
      5 = 'HIGH INCOME'
      ;
   value inscov9f
      1 = 'ANY PRIVATE'
      2 = 'PUBLIC ONLY'
      3 = 'UNINSURED'
      ;
run;
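/* Library that receives the imported data set */
libname mylib '';
/* Fileref for the downloaded SAS transport file */
filename in1 'H38.SSP';
/* Import the transport file into the MYLIB library */
proc xcopy in=in1 out=mylib import;
run;
/* Keep only the analysis variables, clear their labels, and attach formats */
data meps;
   set mylib.H38;
   label racex= sex= inscov99= povcat99=
      varstr99= varpsu99= perwt99f= totexp99=;
   format racex racex. sex sex.
      povcat99 povcat9h. inscov99 inscov9f.;
   keep inscov99 sex racex povcat99 varstr99
      varpsu99 perwt99f totexp99;
run;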

The SAS data set contains 24,618 observations, each of which corresponds to a person in the survey. The stratification variable is VARSTR99, which identifies the 143 strata in the sample. The variable VARPSU99 identifies the 460 primary sampling units (PSUs) in the sample. The sampling weights are stored in the variable PERWT99F. The response variable is the health insurance coverage indicator, INSCOV99, which has three values:

1   The person had any private insurance coverage any time during 1999
2   The person had only public insurance coverage during 1999
3   The person was uninsured during all of 1999

The demographic variables include gender (SEX), race (RACEX), and family income level as a percent of the poverty line (POVCAT99). The variable RACEX has five categories:

1   American Indian
2   Aleut, Eskimo
3   Asian or Pacific Islander
4   Black
5   White

The variable POVCAT99 is constructed by dividing family income by the applicable poverty line (based on family size and composition), with the resulting percentages grouped into five categories:

1   Negative or poor (less than 100%)
2   Near poor (100% to less than 125%)
3   Low income (125% to less than 200%)
4   Middle income (200% to less than 400%)
5   High income (greater than or equal to 400%)

The data set also contains the total health care expenditure in 1999, TOTEXP99, which is used as a covariate in the analysis.

Output 111.2.1 displays the first 30 observations of this data set.
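A listing such as the one in Output 111.2.1 can be produced with a PRINT step like the following. This is a minimal sketch: the example does not show the statements it used, so the OBS= data set option and the VAR list here are assumptions chosen to match the columns of the output.

```sas
/* Assumed step (not shown in the original example):  */
/* print the first 30 observations of the MEPS data   */
proc print data=meps(obs=30);
   var SEX RACEX POVCAT99 INSCOV99 TOTEXP99 PERWT99F VARSTR99 VARPSU99;
run;
```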

Output 111.2.1: 1999 Full-Year MEPS (First 30 Observations)

Obs SEX RACEX POVCAT99 INSCOV99 TOTEXP99 PERWT99F VARSTR99 VARPSU99
1 MALE WHITE MIDDLE INCOME PUBLIC ONLY 2735 14137.86 131 2
2 FEMALE WHITE MIDDLE INCOME ANY PRIVATE 6687 17050.99 131 2
3 MALE WHITE MIDDLE INCOME ANY PRIVATE 60 35737.55 131 2
4 MALE WHITE MIDDLE INCOME ANY PRIVATE 60 35862.67 131 2
5 FEMALE WHITE MIDDLE INCOME ANY PRIVATE 786 19407.11 131 2
6 MALE WHITE MIDDLE INCOME ANY PRIVATE 345 18499.83 131 2
7 MALE WHITE MIDDLE INCOME ANY PRIVATE 680 18499.83 131 2
8 MALE WHITE MIDDLE INCOME ANY PRIVATE 3226 22394.53 136 1
9 FEMALE WHITE MIDDLE INCOME ANY PRIVATE 2852 27008.96 136 1
10 MALE WHITE MIDDLE INCOME ANY PRIVATE 112 25108.71 136 1
11 MALE WHITE MIDDLE INCOME ANY PRIVATE 3179 17569.81 136 1
12 MALE WHITE MIDDLE INCOME ANY PRIVATE 168 21478.06 136 1
13 FEMALE WHITE MIDDLE INCOME ANY PRIVATE 1066 21415.68 136 1
14 MALE WHITE NEGATIVE OR POOR PUBLIC ONLY 0 12254.66 125 1
15 MALE WHITE NEGATIVE OR POOR ANY PRIVATE 0 17699.75 125 1
16 FEMALE WHITE NEGATIVE OR POOR UNINSURED 0 18083.15 125 1
17 MALE BLACK NEGATIVE OR POOR PUBLIC ONLY 230 6537.97 78 10
18 MALE WHITE LOW INCOME UNINSURED 408 8951.36 95 2
19 FEMALE WHITE LOW INCOME UNINSURED 0 11833.00 95 2
20 MALE WHITE LOW INCOME UNINSURED 40 12754.07 95 2
21 FEMALE WHITE LOW INCOME UNINSURED 51 14698.57 95 2
22 MALE WHITE LOW INCOME UNINSURED 0 3890.20 92 19
23 FEMALE WHITE LOW INCOME UNINSURED 610 5882.29 92 19
24 MALE WHITE LOW INCOME PUBLIC ONLY 24 8610.47 92 19
25 FEMALE BLACK MIDDLE INCOME UNINSURED 1758 0.00 64 1
26 MALE BLACK MIDDLE INCOME PUBLIC ONLY 551 7049.70 64 1
27 MALE BLACK MIDDLE INCOME ANY PRIVATE 65 34067.03 64 1
28 FEMALE BLACK NEGATIVE OR POOR PUBLIC ONLY 0 9313.84 73 12
29 FEMALE BLACK NEGATIVE OR POOR PUBLIC ONLY 10 14697.03 73 12
30 MALE BLACK NEGATIVE OR POOR PUBLIC ONLY 0 4574.73 73 12



The following SAS statements fit a generalized logit model for the 1999 full-year consolidated MEPS data:

proc surveylogistic data=meps;
   stratum VARSTR99;
   cluster VARPSU99;
   weight PERWT99F;
   class SEX RACEX POVCAT99;
   model INSCOV99 = TOTEXP99 SEX RACEX POVCAT99 / link=glogit;
run;

The STRATUM statement specifies the stratification variable VARSTR99. The CLUSTER statement specifies the PSU variable VARPSU99. The WEIGHT statement specifies the sample weight variable PERWT99F. The demographic variables SEX, RACEX, and POVCAT99 are listed in the CLASS statement to indicate that they are categorical independent variables in the MODEL statement. In the MODEL statement, the response variable is INSCOV99, and the independent variables are TOTEXP99 along with the selected demographic variables. The LINK= option requests that the procedure fit the generalized logit model because the response variable INSCOV99 has nominal responses.
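For reference, the generalized logit model that the LINK=GLOGIT option requests can be written as follows. The symbols are notation introduced here for illustration and are not taken from the example: \pi_{ij} is the probability that person i falls in coverage category j, x_i is the vector of covariates (TOTEXP99 and the design variables for SEX, RACEX, and POVCAT99), and category 3 (UNINSURED) serves as the reference category.

```latex
\log\!\left(\frac{\pi_{ij}}{\pi_{i3}}\right) = \alpha_j + \mathbf{x}_i'\boldsymbol{\beta}_j,
\qquad j = 1, 2,
\qquad \pi_{i1} + \pi_{i2} + \pi_{i3} = 1
```

Each nonreference category has its own intercept and its own slope vector, which is why every effect appears twice in the parameter estimates, once for ANY PRIVATE and once for PUBLIC ONLY.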

The results of this analysis are shown in the following outputs.

PROC SURVEYLOGISTIC lists the model fitting information and sample design information in Output 111.2.2.

Output 111.2.2: MEPS, Model Information

The SURVEYLOGISTIC Procedure

Model Information
Data Set WORK.MEPS
Response Variable INSCOV99
Number of Response Levels 3
Stratum Variable VARSTR99
Number of Strata 143
Cluster Variable VARPSU99
Number of Clusters 460
Weight Variable PERWT99F
Model Generalized Logit
Optimization Technique Newton-Raphson
Variance Adjustment Degrees of Freedom (DF)



Output 111.2.3 displays the number of observations and the sum of the sampling weights, both as read from the data set and as used in the analysis. Only observations with a positive person-level weight are used in the analysis. Consequently, the 1,053 observations that have zero person-level weights are excluded.

Output 111.2.3: MEPS, Number of Observations

Number of Observations Read 24618
Number of Observations Used 23565
Sum of Weights Read 2.7641E8
Sum of Weights Used 2.7641E8
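The number of excluded observations can be checked directly against the data. The following steps are a sketch and are not part of the original example; the data set name _check and the variable positive_weight are hypothetical, and the check assumes that the excluded observations are exactly those whose PERWT99F value is not positive.

```sas
/* Flag each observation by whether its person-level weight is positive */
data _check;                               /* hypothetical work data set */
   set meps(keep=PERWT99F);
   positive_weight = (PERWT99F > 0);       /* 1 = used, 0 = excluded */
run;

/* Tabulate the flag to count used and excluded observations */
proc freq data=_check;
   tables positive_weight;
run;
```

If the assumption holds, the row for positive_weight=0 should show 1,053 observations (24,618 minus 23,565), in agreement with Output 111.2.3.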



Output 111.2.4 lists the three insurance coverage levels for the response variable INSCOV99. The "UNINSURED" category is used as the reference category in the model.

Output 111.2.4: MEPS, Response Profile

Response Profile
Ordered Value  INSCOV99     Total Frequency  Total Weight
1              ANY PRIVATE            16130     204403997
2              PUBLIC ONLY             4241      41809572
3              UNINSURED               3194      30197198

Logits modeled use INSCOV99='UNINSURED' as the reference category.




Output 111.2.5 shows the parameterization in the regression model for each categorical independent variable.
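The design variables in Output 111.2.5 reflect effect coding, in which the last level of each CLASS variable is coded as -1 in every column. As an illustrative alternative that is not part of the original example, reference-cell coding can be requested with the PARAM=REF option in the CLASS statement; each parameter is then a direct contrast with the reference level.

```sas
/* Sketch: the same model with reference-cell coding instead of effect coding */
proc surveylogistic data=meps;
   stratum VARSTR99;
   cluster VARPSU99;
   weight PERWT99F;
   class SEX RACEX POVCAT99 / param=ref;   /* last level is the reference by default */
   model INSCOV99 = TOTEXP99 SEX RACEX POVCAT99 / link=glogit;
run;
```

Under this coding the individual parameter estimates differ from those reported in this example, but the fitted model is the same, so the odds ratios comparing each level with MALE, WHITE, or NEGATIVE OR POOR should be unchanged.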

Output 111.2.5: MEPS, Classification Levels

Class Level Information
Class     Value                      Design Variables
SEX       FEMALE                      1
          MALE                       -1
RACEX     ALEUT, ESKIMO               1   0   0   0
          AMERICAN INDIAN             0   1   0   0
          ASIAN OR PACIFIC ISLANDER   0   0   1   0
          BLACK                       0   0   0   1
          WHITE                      -1  -1  -1  -1
POVCAT99  HIGH INCOME                 1   0   0   0
          LOW INCOME                  0   1   0   0
          MIDDLE INCOME               0   0   1   0
          NEAR POOR                   0   0   0   1
          NEGATIVE OR POOR           -1  -1  -1  -1



Output 111.2.6 displays the parameter estimates and their standard errors.

Output 111.2.7 displays the odds ratio estimates and their 95% confidence limits.

For example, after adjusting for the effects of sex, race, and total health care expenditures, the odds that a person with high income has private health insurance rather than no insurance are estimated to be 11.595 times the corresponding odds for a poor person. In contrast, the odds that a person with high income has only public health insurance rather than no insurance are estimated to be 0.274 times the odds for a poor person.
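The arithmetic behind these two odds ratios can be traced from the POVCAT99 estimates in Output 111.2.6. Under the effect coding shown in Output 111.2.5, the coefficient of the last level (NEGATIVE OR POOR) is the negative of the sum of the other four coefficients. The \beta symbols below are notation introduced here for illustration.

```latex
\begin{align*}
\beta^{\text{priv}}_{\text{poor}}
  &= -(1.4560 - 0.3066 + 0.6467 - 0.8015) = -0.9946 \\
\mathrm{OR}^{\text{priv}}_{\text{high vs poor}}
  &= \exp\{1.4560 - (-0.9946)\} = \exp(2.4506) \approx 11.595 \\[4pt]
\beta^{\text{pub}}_{\text{poor}}
  &= -(-0.6092 - 0.0239 - 0.3496 + 0.2985) = 0.6842 \\
\mathrm{OR}^{\text{pub}}_{\text{high vs poor}}
  &= \exp\{-0.6092 - 0.6842\} = \exp(-1.2934) \approx 0.274
\end{align*}
```

The same reasoning applies to SEX, which has a single design variable: the FEMALE versus MALE odds ratio for ANY PRIVATE is exp(2 × 0.1208) ≈ 1.273.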

Output 111.2.6: MEPS, Parameter Estimates

Analysis of Maximum Likelihood Estimates
Parameter   INSCOV99 Estimate Standard Error t Value Pr > |t|
Intercept   ANY PRIVATE 2.7703 0.1906 14.54 <.0001
Intercept   PUBLIC ONLY 1.9216 0.1562 12.30 <.0001
TOTEXP99   ANY PRIVATE 0.000215 0.000071 3.03 0.0026
TOTEXP99   PUBLIC ONLY 0.000241 0.000072 3.34 0.0009
SEX FEMALE ANY PRIVATE 0.1208 0.0248 4.87 <.0001
SEX FEMALE PUBLIC ONLY 0.1741 0.0308 5.65 <.0001
RACEX ALEUT, ESKIMO ANY PRIVATE 7.1457 0.6976 10.24 <.0001
RACEX ALEUT, ESKIMO PUBLIC ONLY 7.6303 0.5024 15.19 <.0001
RACEX AMERICAN INDIAN ANY PRIVATE -2.0904 0.2615 -7.99 <.0001
RACEX AMERICAN INDIAN PUBLIC ONLY -1.8992 0.2909 -6.53 <.0001
RACEX ASIAN OR PACIFIC ISLANDER ANY PRIVATE -1.8055 0.2299 -7.85 <.0001
RACEX ASIAN OR PACIFIC ISLANDER PUBLIC ONLY -1.9914 0.2285 -8.71 <.0001
RACEX BLACK ANY PRIVATE -1.7517 0.1983 -8.83 <.0001
RACEX BLACK PUBLIC ONLY -1.7038 0.1692 -10.07 <.0001
POVCAT99 HIGH INCOME ANY PRIVATE 1.4560 0.0685 21.26 <.0001
POVCAT99 HIGH INCOME PUBLIC ONLY -0.6092 0.0903 -6.75 <.0001
POVCAT99 LOW INCOME ANY PRIVATE -0.3066 0.0666 -4.60 <.0001
POVCAT99 LOW INCOME PUBLIC ONLY -0.0239 0.0754 -0.32 0.7512
POVCAT99 MIDDLE INCOME ANY PRIVATE 0.6467 0.0587 11.01 <.0001
POVCAT99 MIDDLE INCOME PUBLIC ONLY -0.3496 0.0807 -4.33 <.0001
POVCAT99 NEAR POOR ANY PRIVATE -0.8015 0.1076 -7.45 <.0001
POVCAT99 NEAR POOR PUBLIC ONLY 0.2985 0.0952 3.14 0.0019
NOTE: The degrees of freedom for the t tests is 317.
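The degrees of freedom in this note follow from the design counts in Output 111.2.2. Under the usual rule for Taylor series variance estimation with a stratified cluster design, the degrees of freedom equal the number of clusters minus the number of strata:

```latex
\mathrm{df} = (\text{number of clusters}) - (\text{number of strata}) = 460 - 143 = 317
```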



Output 111.2.7: MEPS, Odds Ratios

Odds Ratio Estimates
Effect INSCOV99 Point Estimate 95% Confidence Limits
TOTEXP99 ANY PRIVATE 1.000 1.000 1.000
TOTEXP99 PUBLIC ONLY 1.000 1.000 1.000
SEX FEMALE vs MALE ANY PRIVATE 1.273 1.155 1.404
SEX FEMALE vs MALE PUBLIC ONLY 1.417 1.255 1.599
RACEX ALEUT, ESKIMO vs WHITE ANY PRIVATE >999.999 >999.999 >999.999
RACEX ALEUT, ESKIMO vs WHITE PUBLIC ONLY >999.999 >999.999 >999.999
RACEX AMERICAN INDIAN vs WHITE ANY PRIVATE 0.553 0.339 0.903
RACEX AMERICAN INDIAN vs WHITE PUBLIC ONLY 1.146 0.601 2.185
RACEX ASIAN OR PACIFIC ISLANDER vs WHITE ANY PRIVATE 0.735 0.499 1.084
RACEX ASIAN OR PACIFIC ISLANDER vs WHITE PUBLIC ONLY 1.045 0.655 1.670
RACEX BLACK vs WHITE ANY PRIVATE 0.776 0.638 0.944
RACEX BLACK vs WHITE PUBLIC ONLY 1.394 1.129 1.721
POVCAT99 HIGH INCOME vs NEGATIVE OR POOR ANY PRIVATE 11.595 9.293 14.467
POVCAT99 HIGH INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.274 0.213 0.353
POVCAT99 LOW INCOME vs NEGATIVE OR POOR ANY PRIVATE 1.990 1.606 2.466
POVCAT99 LOW INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.492 0.395 0.615
POVCAT99 MIDDLE INCOME vs NEGATIVE OR POOR ANY PRIVATE 5.162 4.197 6.348
POVCAT99 MIDDLE INCOME vs NEGATIVE OR POOR PUBLIC ONLY 0.356 0.280 0.452
POVCAT99 NEAR POOR vs NEGATIVE OR POOR ANY PRIVATE 1.213 0.901 1.632
POVCAT99 NEAR POOR vs NEGATIVE OR POOR PUBLIC ONLY 0.680 0.526 0.878
NOTE: The degrees of freedom in computing the confidence limits is 317.
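The odds ratios for TOTEXP99 print as 1.000 because they describe a one-dollar change in expenditure. A more interpretable scale can be requested with the UNITS statement. The sketch below is not part of the original example, and the 1,000-dollar unit is an assumption chosen for illustration.

```sas
/* Sketch: request odds ratios for a 1,000-dollar change in TOTEXP99 */
proc surveylogistic data=meps;
   stratum VARSTR99;
   cluster VARPSU99;
   weight PERWT99F;
   class SEX RACEX POVCAT99;
   model INSCOV99 = TOTEXP99 SEX RACEX POVCAT99 / link=glogit;
   units TOTEXP99=1000;   /* assumed unit; the example itself uses the default of 1 */
run;
```

From the estimates in Output 111.2.6, a 1,000-dollar increase corresponds to odds ratios of roughly exp(0.215) ≈ 1.24 for ANY PRIVATE and exp(0.241) ≈ 1.27 for PUBLIC ONLY, each relative to UNINSURED.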