This section
requires familiarity
with Using Classification Panels. You should skip this section if you are not familiar with the general
coding for classification panels.
The DATALATTICE and DATAPANEL layouts provide INSET= and INSETOPTS=
options for displaying insets in classification panels. The INSETOPTS=
option supports the same placement and appearance features as those
documented for the SCATTERPLOTMATRIX statement
in Adding Insets to a SCATTERPLOTMATRIX Graph However, unlike
the SCATTERPLOTMATRIX statement, the DATALATTICE and DATAPANEL layouts
do not have predefined information available. Thus, for the INSET=
option, you must create the columns for the information that you want
to display in the inset and integrate it with the input data before
the graph is rendered. Then, on the INSET= option, you specify the
name(s) of the column(s) that contain the desired information.
For example,
the following template code uses INSET=(NOBS MEAN) to reference input
data columns that are named NOBS and MEAN. When the graph is rendered,
the values that are stored in these columns will be displayed in the
inset.
In the
inset display in this example, one row is displayed for each column
that is listed on INSET=, and each row has two columns. The left column
shows the column name (column label, if it is defined in the data),
and the right column contains the column value for that particular
cell of the panel. The number of rows of data for these columns should
match the number of cells in the classification panel and the sequence
in which the cells are populated.
The following
template code defines a template named PANEL. The template "makes
room" for the insets in each panel by adding a maximum row axis offset.
In this case, OFFSETMAX=0.4 is sufficient, but the setting will vary
case-by-case. This is what the first row of the classification panel
with insets will look like:
proc template;
define statgraph panel;
begingraph;
entrytitle "Average City MPG for Vehicles";
entrytitle "by Origin, Cylinders and VehicleType";
layout datalattice columnvar=origin rowvar=cylinders /
columndatarange=unionall rowdatarange=unionall
headerlabeldisplay=value
headerbackgroundcolor=GraphAltBlock:color
inset=(cellN cellMean)
insetopts=(border=true
opaque=true backgroundcolor=GraphAltBlock:color)
rowaxisopts=( offsetmax=.4 offsetmin=.1 display=(tickvalues) )
columnaxisopts=(display=(label tickvalues)
linearopts=( tickvaluepriority=true
tickvaluesequence=(start=5 end=30 increment=5))
griddisplay=on offsetmin=0 offsetmax=.1);
layout prototype;
barchart x=type y=mean / orient=horizontal
barwidth=.5 barlabel=true;
endlayout;
endlayout;
endgraph;
end;
run;
When this
template is used, the input data must contain separate columns for
the following:
|
columnvar=origin
rowvar=cylinders
|
|
|
|
|
The data
for this example is from the SASHELP.CARS data set. To calculate the
number of observations and mean for the observations, we can use PROC
SUMMARY.
The following
PROC SUMMARY step calculates the number of observations and the mean
of MPG_CITY for each of the classification interactions listed in
the TYPES statement. CYLINDERS*ORIGIN is the crossing needed for the
cell summaries, and CYLINDER*ORIGIN*TYPE is the crossing needed by
each cell's bar chart.
The COMPLETETYPES
option creates summary observations even when the frequency of the
classification interactions is zero. Additionally, the code creates
subsets in the input data to restrict the number of bars in each bar
chart to at most three, and to reduce the number cells in the classification
panel. There are three values of ORIGIN (Asia, Europe, and USA) and
three values of CYLINDERS (4, 6, and 8).
For the
insets to display accurate data, we must ensure that the order of
the observations in the data corresponds to the column order for the
CLASS statement of PROC SUMMARY. Because the panel cells are populated
across one row before proceeding to the next row, the values of the
panel's row variable (CYLINDERS) determines the panel order and must
be specified first in the SUMMARY procedure's CLASS statement so that
the values of CYLINDERS also determine the order for the statistics
calculations.
/* compute the barchart data and inset information */
proc summary data=sashelp.cars completetypes;
where type in ("Sedan" "Truck" "SUV") and
cylinders in (4 6 8);
class cylinders origin type;
var mpg_city;
output out=mileage mean=Mean n=Nobs / noinherit;
types cylinders*origin cylinders*origin*type;
run;
The SAS
log displays the following note when the procedure code is submitted:
NOTE: There were 337 observations read from the data set SASHELP.CARS.
WHERE type in ('SUV', 'Sedan', 'Truck') and cylinders in (4, 6, 8);
NOTE: The data set WORK.MILEAGE has 36 observations and 6 variables.
Confirm the Order of Data Observations
Confirm the Order of Data Observations shows the order of observations in
the interim data set named MILEAGE. Notice that the first nine observations
(where _TYPE_ equals 6) are the cell summaries. The remaining 27 observations
(where _TYPE_ equals 7) are for each cell's bar chart.
To create
separate columns for the inset, we need to store the _TYPE_= 6 observations
in new columns. The following DATA step writes the inset information
to another data set named OVERALL.
data mileage
overall(keep=origin cylinders mean nobs
rename=(origin=cellOrigin cylinders=cellCyl
mean=cellMean nobs=cellNobs ));
set mileage; by _type_;
if _type_ eq 6 then output overall;
else output mileage;
run;
The SAS
log displays the following note when the code is submitted:
NOTE: There were 36 observations read from the data set WORK.MILEAGE.
NOTE: The data set WORK.MILEAGE has 27 observations and 5 variables.
NOTE: The data set WORK.OVERALL has 9 observations and 4 variables.
Finally,
we create a new data set named SUMMARY, which merges the MILEAGE and
OVERALL data sets. Note that this is a non-match merge (no BY statement),
and that all columns in the two tables have unique names to prevent
overwriting any data values.
data summary;
merge mileage overall;
label Mean="MPG (City)";
format mean cellMean 4.1;
run;
NOTE: There were 27 observations read from the data set WORK.MILEAGE.
NOTE: There were 9 observations read from the data set WORK.OVERALL.
NOTE: The data set WORK.SUMMARY has 27 observations and 9 variables.
Modified Input Data Set with Additional Columns
The SUMMARY
data set can now be used to render a graph from template PANEL:
ods listing style=statistical;
proc sgrender data=summary template=panel;
run;
The following
figure shows another example of adding insets to a classification
panel. The complete code for this output is
presented inUsing Classification Panels.