This example is patterned after a quantile regression analysis of covariates associated with birth weight that was carried out by Koenker and Hallock (2001). Their study used a subset of the June 1997 Detailed Natality Data published by the National Center for Health Statistics and demonstrated that conditional quantile functions provide more complete information about the covariate effects than ordinary least squares regression.
As in Koenker and Hallock (2001) and Abrevaya (2001), this example uses data for live, singleton births to mothers in the United States who were recorded as black or white, and who were between the ages of 18 and 45. For convenience, this example uses 50,000 observations, which were randomly selected from the qualified observations. Observations with missing data for any of the variables were deleted.
The following table describes the variables in the data.
Variable  Description
Weight    Infant’s birth weight
Black     Indicator of black mother
Married   Indicator of married mother
Boy       Indicator of boy
Visit     Prenatal visit: 0 = no visit, 1 = visit in second trimester, 2 = visit in last trimester, 3 = visit in first trimester
Ed        Mother’s education level: 0 = high school, 1 = some college, 2 = college, 3 = less than high school
Smoke     Indicator of smoking mother
CigsPer   Number of cigarettes smoked per day
Mom_Age   Mother’s age
M_WtGain  Mother’s weight gain during pregnancy
There are four levels of education of the mother. By default, the QUANTREG procedure treats the highest level (3 = less than high school) as the reference level. The regression coefficients of the other levels measure the effect relative to this level. Likewise, there are four levels of prenatal medical care of the mother, and a first visit in the first trimester (level 3) serves as the reference level. These two variables are treated as classification variables in the model.
The following statements fit a regression model for 19 quantiles of birth weight, which are evenly spaced in the interval (0, 1). The model includes linear and quadratic effects for the age of the mother and for weight gain during pregnancy.
   ods graphics on;

   proc quantreg ci=sparsity/iid algorithm=interior(tolerance=5.e-4)
                 data=sashelp.bweight;
      class visit ed;
      model weight = black married boy visit ed smoke
                     cigsper mom_age mom_age*mom_age
                     m_wtgain m_wtgain*m_wtgain
                     / quantile= 0.05 to 0.95 by 0.05
                       plot=quantplot;
   run;
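In standard quantile regression notation, the model that these statements request can be sketched as follows. The symbols (the covariate vector x, the coefficient vector β(τ), and the check function ρ_τ) are introduced here for illustration and do not appear in the original example.

```latex
% Quantile levels requested by QUANTILE= 0.05 TO 0.95 BY 0.05
\[
  \tau \in \{0.05,\, 0.10,\, \ldots,\, 0.95\},
  \qquad
  \frac{0.95 - 0.05}{0.05} + 1 = 19 \text{ levels.}
\]

% Linear model for the conditional quantile of birth weight
\[
  Q_\tau(\mathrm{Weight} \mid \mathbf{x}) = \mathbf{x}^\top \boldsymbol{\beta}(\tau)
\]

% Estimate at each level: minimize the check-function loss
\[
  \hat{\boldsymbol{\beta}}(\tau)
  = \arg\min_{\mathbf{b}} \sum_{i=1}^{n}
    \rho_\tau\!\left(y_i - \mathbf{x}_i^\top \mathbf{b}\right),
  \qquad
  \rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr).
\]
```

Here x collects the intercept, the indicator and continuous covariates, the dummy variables for the nonreference levels of Visit and Ed, and the two quadratic terms. A separate coefficient vector is estimated at each of the 19 quantile levels.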
Model Information  

Data Set  SASHELP.BWEIGHT 
Dependent Variable  weight 
Number of Independent Variables  9 
Number of Continuous Independent Variables  7 
Number of Class Independent Variables  2 
Number of Observations  50000 
Optimization Algorithm  Interior 
Method for Confidence Limits  Sparsity 
Summary Statistics  

Variable  Q1  Median  Q3  Mean  Standard Deviation  MAD
black  0  0  0  0.1628  0.3692  0 
married  0  1.0000  1.0000  0.7126  0.4525  0 
boy  0  1.0000  1.0000  0.5158  0.4998  0 
smoke  0  0  0  0.1307  0.3370  0 
cigsper  0  0  0  1.4766  4.6541  0 
mom_age  -4.0000  0  5.0000  0.4161  5.7285  5.9304
mom_age*mom_age  4.0000  16.0000  49.0000  32.9877  39.2861  22.2390 
m_wtgain  -8.0000  0  9.0000  0.7092  12.8761  11.8608
m_wtgain*m_wtgain  16.0000  64.0000  196.0  166.3  298.8  88.9561 
weight  3062.0  3402.0  3720.0  3370.8  566.4  504.1 
Output 75.3.1 displays the model information and summary statistics for the variables in the model.
Among the 11 effects in the model (the 9 independent variables plus the two quadratic terms), Black, Married, Boy, and Smoke are binary variables. For these variables, the mean represents the proportion in the category. The continuous variables Mom_Age and M_WtGain are centered at their medians, which are 27 and 30, respectively.
The quantile plots for the intercept and the other 15 factors with nonzero degrees of freedom are shown in the following four panels. In each plot, the regression coefficient at a given quantile indicates the effect on birth weight of a unit change in that factor, assuming that the other factors are fixed. The bands represent 95% confidence intervals.
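The "unit change" reading of each plotted coefficient can be written out as follows. This is a sketch in generic notation; the symbols x_j, x_{-j} (all other covariates), and β_j(τ) are assumptions introduced for illustration.

```latex
% Indicator covariate (Black, Married, Boy, Smoke, or a class-level dummy):
% the coefficient is the difference between two conditional quantiles
\[
  \beta_j(\tau)
  = Q_\tau(\mathrm{Weight} \mid x_j = 1,\ \mathbf{x}_{-j})
  - Q_\tau(\mathrm{Weight} \mid x_j = 0,\ \mathbf{x}_{-j}).
\]

% Continuous covariate that enters the model only linearly (CigsPer):
\[
  \beta_j(\tau)
  = \frac{\partial\, Q_\tau(\mathrm{Weight} \mid \mathbf{x})}{\partial x_j}.
\]
```

For Mom_Age and M_WtGain, which also enter through a squared term, the effect of a unit change depends on both the linear and the quadratic coefficient and on the current value of the variable, so neither coefficient alone gives the full effect.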
Although the data set used here is a subset of the Natality data set, the results are quite similar to those of Koenker and Hallock (2001) for the full data set.
In Output 75.3.2, the first plot is for the intercept. As explained by Koenker and Hallock (2001), the intercept "may be interpreted as the estimated conditional quantile function of the birthweight distribution of a girl born to an unmarried, white mother with less than a high school education, who is 27 years old and had a weight gain of 30 pounds, didn’t smoke, and had her first prenatal visit in the first trimester of the pregnancy."
The second plot shows that infants born to black mothers weigh less than infants born to white mothers, especially in the lower tail of the birthweight distribution. The third plot shows that marital status has a large positive effect on birth weight, especially in the lower tail. The fourth plot shows that boys weigh more than girls for any chosen quantile; this difference is smaller in the lower quantiles of the distribution.
In Output 75.3.3, the first three plots deal with prenatal care. Compared with babies born to mothers who had a prenatal visit in the first trimester, babies born to mothers who received no prenatal care weigh less, especially in the lower quantiles of the birthweight distribution. As noted by Koenker and Hallock (2001), "babies born to mothers who delayed prenatal visits until the second or third trimester have substantially higher birthweights in the lower tail than mothers who had a prenatal visit in the first trimester. This might be interpreted as the self-selection effect of mothers confident about favorable outcomes."
The fourth plot in Output 75.3.3 and the first two plots in Output 75.3.4 are for variables related to education. Education beyond high school is associated with a positive effect on birth weight. The effect of high school education is uniformly around 15 grams across the entire birthweight distribution (this is a pure location shift effect), while the effect of some college and college education is more positive in the lower quantiles than the upper quantiles.
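A pure location shift can be stated more precisely as a coefficient that does not vary with the quantile level. The following sketch uses notation introduced here for illustration: z stands for all covariates other than Ed, and β_{Ed=0}(τ) for the coefficient of the high school level relative to the reference level (3 = less than high school).

```latex
% Location shift: the whole conditional distribution moves by a constant,
% so the coefficient curve is flat across quantile levels
\[
  Q_\tau(\mathrm{Weight} \mid \mathrm{Ed} = 0,\ \mathbf{z})
  - Q_\tau(\mathrm{Weight} \mid \mathrm{Ed} = 3,\ \mathbf{z})
  = \beta_{\mathrm{Ed}=0}(\tau)
  \approx 15 \text{ grams for all } \tau \in (0, 1).
\]
```

By contrast, the coefficients for some college and college change with τ, which means those education levels alter the shape of the birthweight distribution and not only its location.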
The remaining two plots in Output 75.3.4 show that smoking is associated with a large negative effect on birth weight.
The linear and quadratic effects for the two continuous variables are shown in Output 75.3.5. Both of these variables are centered at their medians. At the lower quantiles, the quadratic effect of the mother’s age is more concave. The optimal age at the first quartile is about 33, and the optimal age at the third quartile is about 38. The effect of the mother’s weight gain is clearly positive, as indicated by the narrow confidence bands for both linear and quadratic coefficients.
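The optimal ages quoted above follow from the vertex of the fitted quadratic in the centered age variable. The derivation below is a sketch; the symbols a, β_1(τ), and β_2(τ) are introduced here for illustration, and no coefficient values are taken from the output.

```latex
% Centered age and its contribution to the conditional quantile
\[
  a = \mathrm{Mom\_Age} - 27,
  \qquad
  g_\tau(a) = \beta_1(\tau)\, a + \beta_2(\tau)\, a^2 .
\]

% When the curve is concave (beta_2 < 0), the maximum is at the vertex
\[
  \frac{d g_\tau}{d a} = \beta_1(\tau) + 2\,\beta_2(\tau)\, a = 0
  \quad\Longrightarrow\quad
  a^\ast(\tau) = -\frac{\beta_1(\tau)}{2\,\beta_2(\tau)} .
\]

% Back on the original age scale
\[
  \mathrm{Age}^\ast(\tau) = 27 - \frac{\beta_1(\tau)}{2\,\beta_2(\tau)} .
\]
```

On this scale, an optimal age of about 33 corresponds to a vertex near a = 6, and an optimal age of about 38 corresponds to a vertex near a = 11.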
Refer to Koenker and Hallock (2001) for more details about the covariate effects discovered with quantile regression.