Example Program and Statement Details

Example Graph

The following graph was generated by the Example Program:
Example Matrix of Scatter Plots

Example Program

proc template;
  define statgraph scatterplotmatrix;
    begingraph;
      entrytitle "Scatter Plot Matrix";
      layout gridded;
        scatterplotmatrix
          sepallength sepalwidth petallength petalwidth /
          group=species name="matrix";
        discretelegend "matrix";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.iris template=scatterplotmatrix;
run;

Statement Summary

By default, the SCATTERPLOTMATRIX statement produces a symmetric scatter plot matrix. For n columns, it produces an n-column by n-row matrix of scatter plots. By default, the columns of the matrix are in the same left-to-right order as the numeric-column-list, and the rows of the matrix are in the same top-to-bottom order as the numeric-column-list. You can reverse the direction of the diagonal by setting START=BOTTOMLEFT.
To produce a rectangular matrix of scatter plots, use the ROWVARS= option. Specifying n columns in the SCATTERPLOTMATRIX statement and m columns on the ROWVARS= option produces an n-column by m-row matrix of scatter plots. For example, the following statement specifies 2 columns on SCATTERPLOTMATRIX and 3 columns on the ROWVARS= option to produce a 2-column by 3-row matrix:
   SCATTERPLOTMATRIX Height Weight
      / ROWVARS=(Age Height Weight);
The SCATTERPLOTMATRIX statement cannot appear within an overlay-type layout. It generates its own matrix of plots and is typically placed in a LAYOUT GRIDDED block.
If there are missing values in a column or a row, all of the points that can be plotted are plotted in each scatter plot.
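The rectangular case described above can be sketched as a complete program. This is a minimal illustration, assuming the SASHELP.CLASS sample data set is available; the template name rectmatrix is a hypothetical name chosen for this example.

```sas
/* Sketch: a 2-column by 3-row matrix built with ROWVARS=.      */
/* "rectmatrix" is a hypothetical template name.                */
proc template;
  define statgraph rectmatrix;
    begingraph;
      entrytitle "Height and Weight by Age, Height, and Weight";
      layout gridded;
        /* Columns come from the required list, rows from ROWVARS= */
        scatterplotmatrix height weight /
          rowvars=(age height weight);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=rectmatrix;
run;
```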

Required Arguments

numeric-column-list
specifies a list of numeric columns to plot. There must be at least two columns to produce a useful matrix.
The default width is 640px, and the default height is 480px. The graph size is not automatically adjusted to accommodate a large number of columns.
To change the graph size for the current template, use the DESIGNHEIGHT= and DESIGNWIDTH= options in the BEGINGRAPH statement. To change the graph size for all templates in the current SAS session, use the HEIGHT= and WIDTH= options in the ODS GRAPHICS statement. Size settings in the ODS GRAPHICS statement override size settings in the BEGINGRAPH statement.
You can also limit the number of columns in the matrix (perhaps to seven in each dimension, for example) so that the resulting graphs are not too small to be useful.
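The two ways of changing the graph size described above might look like the following sketch. The template name bigmatrix and the chosen sizes are assumptions made for illustration only.

```sas
/* Sketch: enlarge the graph for one template with DESIGNWIDTH=  */
/* and DESIGNHEIGHT=. "bigmatrix" is a hypothetical name.        */
proc template;
  define statgraph bigmatrix;
    begingraph / designwidth=800px designheight=800px;
      layout gridded;
        scatterplotmatrix sepallength sepalwidth petallength petalwidth;
      endlayout;
    endgraph;
  end;
run;

/* Rendered at the template's design size */
proc sgrender data=sashelp.iris template=bigmatrix;
run;

/* Session-wide alternative: these settings override the template */
ods graphics / width=900px height=900px;
proc sgrender data=sashelp.iris template=bigmatrix;
run;
ods graphics / reset;  /* restore the default ODS GRAPHICS settings */
```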

Options

Statement Option         Description
COLORMODEL=              Specifies a style element to be used with the MARKERCOLORGRADIENT= option.
CORROPTS=                Specifies options for computing measures of association between pairs of variables.
DATALABEL=               Specifies a column for marker labels.
DATALABELATTRS=          Specifies the color and font attributes of the data labels.
DATALABELPOSITION=       Specifies the location of the data labels relative to the markers.
DATATRANSPARENCY=        Specifies the degree of the transparency of the markers.
DIAGONAL=                Specifies whether the diagonal cells of the matrix are labeled with the labels (names) of the required arguments, or with a graph. The graph for each diagonal cell consists of an overlay combination of a histogram, normal, or kernel curves.
ELLIPSE=                 Specifies that a confidence ellipse be included in each cell containing a scatter plot.
FREQ=                    Specifies a column that indicates a frequency count for each observation of the input data object.
GROUP=                   Creates a distinct set of scatter markers, error bars, and data labels for each unique group value of the specified column.
INCLUDEMISSINGGROUP=     Specifies whether missing values of the group variable are included in the plot.
INDEX=                   Specifies indices for mapping marker attributes (color and symbol) to one of the GraphData1–GraphDataN style elements.
INSET=                   Specifies what information is displayed in an inset.
INSETOPTS=               Specifies the location and appearance options for the inset information.
MARKERATTRS=             Specifies the attributes of the data markers.
MARKERCHARACTER=         Specifies a column that defines strings to be used instead of marker symbols.
MARKERCHARACTERATTRS=    Specifies the color and font attributes of the marker character specified on the MARKERCHARACTER= option.
MARKERCOLORGRADIENT=     Specifies the column that is used to map marker colors to a continuous gradient.
NAME=                    Assigns a name to a plot statement for reference in other template statements.
REVERSECOLORMODEL=       Specifies whether to reverse a gradient defined by the COLORMODEL= option.
ROLENAME=                Specifies user-defined roles that can be used to display information in the tooltips.
ROWVARS=                 Specifies a secondary list of columns to be paired with the required column list that is specified by the SCATTERPLOTMATRIX statement.
START=                   Specifies whether to start populating the rows of the matrix from the top left or the bottom left corner.
TIP=                     Specifies the information to display when the cursor is positioned over the scatter points.
TIPFORMAT=               Specifies display formats for information defined by roles.
TIPLABEL=                Specifies display labels for information defined by roles.
WALLCOLOR=               Specifies the fill color of the plot wall area.
WALLDISPLAY=             Specifies whether the plot’s wall and wall outline are displayed.
COLORMODEL=style-element
specifies a style element to be used with the MARKERCOLORGRADIENT= option.
Default: The ThreeColorAltRamp style element.
style-element
Name of a style element. The style element should contain these style attributes:
STARTCOLOR color for the smallest data value of the column that is specified on the MARKERCOLORGRADIENT= option
NEUTRALCOLOR color for the midpoint of the range of the column that is specified on the MARKERCOLORGRADIENT= option
ENDCOLOR color for the highest data value of the column that is specified on the MARKERCOLORGRADIENT= option
Interaction: For this option to take effect, the MARKERCOLORGRADIENT= option must also be specified.
Interaction: The REVERSECOLORMODEL= option can be used to reverse the start and end colors of the ramp assigned to the color model.
CORROPTS=(correlation-options)
specifies options for computing measures of association between pairs of variables.
The following correlation-options are available:
EXCLNPWGT = FALSE | TRUE
specifies whether observations with non-positive weight values are excluded (TRUE) from the analysis.
Default: FALSE (observations with negative weights are treated like those with zero weights and counted in the total number of observations).
NOMISS = FALSE | TRUE
specifies whether observations with missing values are excluded (TRUE) from the analysis.
Default: FALSE (correlation statistics are computed using all of the nonmissing pairs of variables).
Using NOMISS=TRUE is computationally more efficient.
WEIGHT = numeric-column
specifies a weighting variable to use in the calculation of Pearson weighted product-moment correlation. The observations with missing weights are excluded from the analysis.
Default: For observations with non-positive weights, the weights are set to zero and the observations are included in the analysis.
You can include EXCLNPWGT among the correlation-options to exclude observations with negative or zero weights from the analysis. If you use this WEIGHT correlation-option, consider which value of the VARDEF= correlation-option is appropriate.
VARDEF=DF | N | WDF | WEIGHT
specifies the variance divisor in the calculation of variances and covariances.
Default: DF
DF Degrees of Freedom (N – 1)
N number of observations
WDF sum of weights minus 1 (WEIGHT – 1)
WEIGHT sum of weights
Interaction: This option has effect only when the INSET= option is also used.
For statistical and computational details of these options, see PROC CORR in the documentation for Base SAS.
DATALABEL=column
specifies a column for marker labels. The label positions are adjusted to prevent the labels from overlapping.
Default: no data labels are displayed
Interaction: If a numeric column is specified and the column has no format, a BEST6. format is applied.
Interaction: This option is ignored if the MARKERCHARACTER= option is used.
DATALABELPOSITION = AUTO | TOPRIGHT | TOP | TOPLEFT | LEFT | CENTER | RIGHT | BOTTOMLEFT | BOTTOM | BOTTOMRIGHT
specifies the location of the data labels relative to the markers.
Default: AUTO
DATALABELATTRS=style-element | style-element (text-options) | (text-options)
specifies the color and font attributes of the data labels. See General Syntax for Attribute Options for the syntax on using a style-element and Text Options for available text-options.
Default:
  • For non-grouped data, the GraphDataText style element.
  • For grouped data, the GraphData1:ContrastColor–GraphDataN:ContrastColor style references.
Interaction: For this option to take effect, the DATALABEL= option must also be specified.
Interaction: This option is ignored if the MARKERCHARACTER= option is specified.
DATATRANSPARENCY=number
specifies the degree of the transparency of the markers.
Default: 0
Range: 0 (opaque) to 1 (entirely transparent)
DIAGONAL= LABEL | (graph-list)
specifies whether the diagonal cells of the matrix are labeled with the labels (names) of the required arguments, or with a graph. The graph for each diagonal cell is an overlay of any combination of a histogram, a normal density curve, and a kernel density estimate.
Default: LABEL. Variable labels (or names) are displayed in the diagonal cells.
The graph-list can specify one or more of the following:
HISTOGRAM
specifies a histogram.
NORMAL
specifies a normal density curve.
KERNEL
specifies a kernel density estimate.
Requirement: When specifying multiple graphs in the graph-list, you must separate the values with a space. For example, the following specification requests both a histogram and a normal density curve in each diagonal cell:
   DIAGONAL=(HISTOGRAM NORMAL)
Interaction: The HISTOGRAM, NORMAL, and KERNEL graphs are always computed on all the data for the current variable (taking the FREQ= variable into account, if used). The GROUP= option is not considered in any of these computations.
Interaction: This option is ignored if the ROWVARS= option is used.
When this option is specified, the labels are drawn around the outside of the matrix, and the matrix axes are dropped.
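As an illustration of the DIAGONAL= option, the following sketch overlays a histogram and a normal density curve in each diagonal cell. It assumes the SASHELP.IRIS sample data set; the template name diagmatrix is hypothetical.

```sas
/* Sketch: graphs instead of labels in the diagonal cells.       */
/* "diagmatrix" is a hypothetical template name.                 */
proc template;
  define statgraph diagmatrix;
    begingraph;
      entrytitle "Iris Measurements with Diagonal Distributions";
      layout gridded;
        /* Values in the graph-list are separated by a space */
        scatterplotmatrix sepallength sepalwidth petallength /
          diagonal=(histogram normal);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.iris template=diagmatrix;
run;
```

Because ROWVARS= is not used here, the DIAGONAL= setting is not ignored.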
ELLIPSE=(<ellipse-suboptions>)
specifies that a confidence ellipse be included in each cell containing a scatter plot. The ellipse is always drawn behind the scatter points.
The ellipse-suboptions include the following:
TYPE=MEAN | PREDICTED
specifies the type of ellipse.
Default: MEAN
See also: For statistical details about how the ellipse is calculated, see ELLIPSE Statement.
MEAN specifies a confidence ellipse of the mean
PREDICTED specifies a prediction ellipse of the data
ALPHA=positive-number
specifies the confidence level to compute for each ellipse.
Default: .05
Range: 0 < number < 1
ALPHA=.05 represents a 95% confidence level.
Default: TYPE=MEAN ALPHA=.05. You can request these defaults by specifying the option without arguments: ELLIPSE=( ).
Interaction: The ellipse might be clipped by the data range for the scatter points.
Interaction: The ellipse is always computed on all the data for the current pair of X and Y variables (taking the FREQ= variable into account, if used). The GROUP= option is not considered when computing the ellipse.
The display properties of each ellipse are controlled by the style elements:
  • The GraphDataDefault element controls the outline and fill properties.
  • The GraphEllipse element controls whether the outline, fill, or both are shown.
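A short sketch of the ELLIPSE= suboptions follows. It requests a prediction ellipse with ALPHA=.1 (a 90% level) in each scatter plot cell. The template name ellipsematrix and the choice of SASHELP.CLASS are assumptions for illustration.

```sas
/* Sketch: 90% prediction ellipse in every scatter plot cell.    */
/* "ellipsematrix" is a hypothetical template name.              */
proc template;
  define statgraph ellipsematrix;
    begingraph;
      entrytitle "Prediction Ellipses (ALPHA=.1)";
      layout gridded;
        scatterplotmatrix age height weight /
          ellipse=(type=predicted alpha=.1);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=ellipsematrix;
run;
```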
FREQ= numeric-column | expression
specifies a column that indicates a frequency count for each observation of the input data object. If n is the value of the FREQ variable for a given observation, then that observation is plotted n times.
Default: Each observation is plotted once.
Restriction: If the value of the numeric-column is missing or is less than 1, the observation is not used in the analysis. If the value is not an integer, only the integer portion is used.
GROUP=column | discrete-attr-var | expression
creates a distinct set of scatter markers, error bars, and data labels for each unique group value of the specified column.
discrete-attr-var
specifies a discrete attribute variable that is defined in a DISCRETEATTRVAR statement.
Restriction: A discrete attribute variable specification must be a direct reference to the attribute variable. It cannot be set by a dynamic variable.
Default: Each distinct group value might be represented in the graph by a different combination of color and marker symbol. Markers vary according to the ContrastColor and MarkerSymbol attributes of the GraphData1–GraphDataN style elements.
Interaction: The group values are mapped in the order of the data, unless the INDEX= option is used to alter the default sequence of markers and colors.
Interaction: The marker size is set by the MARKERATTRS= option.
Interaction: If the MARKERCHARACTER= and MARKERCOLORGRADIENT= options are used, their settings override the group settings for marker symbol and marker color.
Interaction: The INCLUDEMISSINGGROUP= option controls whether missing group values are considered a distinct group value.
Tip: The representations that are used to identify the groups can be overridden. For example, each distinct group value is represented by a different marker symbol, but the MARKERATTRS=(SYMBOL=marker) option could be used to assign the same symbol to all of the plot’s marker symbols, letting marker color indicate group values. Likewise, MARKERATTRS=(COLOR=color) could be used to assign the same color to all markers, letting marker symbol indicate group values.
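The tip above can be sketched as follows: every group gets the same marker symbol, so that only color distinguishes the group values. The template name groupmatrix is hypothetical, and SASHELP.IRIS is assumed to be available.

```sas
/* Sketch: one symbol for all groups, color indicates the group. */
/* "groupmatrix" is a hypothetical template name.                */
proc template;
  define statgraph groupmatrix;
    begingraph;
      entrytitle "Iris Measurements by Species";
      layout gridded;
        scatterplotmatrix sepallength sepalwidth petallength /
          group=species
          name="matrix"
          markerattrs=(symbol=circlefilled);
        /* The legend refers to the plot by its NAME= value */
        discretelegend "matrix";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.iris template=groupmatrix;
run;
```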
INCLUDEMISSINGGROUP=boolean
specifies whether missing values of the group variable are included in the plot.
Default: TRUE
Interaction: For this option to take effect, the GROUP= option must also be specified.
Tip: Unless a discrete attribute map is in effect or the INDEX= option is used, the attributes of the missing group value are determined by the GraphMissing style element except when the MISSING= system option is used to specify a non-default missing character or when a user-defined format is applied to the missing group value. In those cases, the attributes of the missing group value are determined by a GraphData1–GraphDataN style element.
INDEX=numeric-column | expression
specifies indices for mapping marker attributes (color and symbol) to one of the GraphData1–GraphDataN style elements.
Default: no default
Restriction: If the value of the numeric-column is missing or is less than 1, the observation is not used in the analysis. If the value is not an integer, only the integer portion is used.
Interaction: For this option to take effect, the GROUP= option must also be specified.
Interaction: All of the indexes for a specific group value must be the same. Otherwise, the results are unpredictable.
Interaction: If this option is not used, then the group values are mapped in the order of the data.
Interaction: If the MARKERCHARACTER= and MARKERCOLORGRADIENT= options are used, their settings override the group settings for marker symbol and marker color.
Interaction: The index values are 1-based indices. For the style elements GraphData1–GraphDataN, if the index value is greater than N, then a modulo operation remaps that index value to a number less than N to determine which style element to use.
Discussion: Indexing can be used to collapse the number of groups that are represented in a graph. For more information, see Remapping Groups for Grouped Data.
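The following sketch illustrates one way that indexing might be used to collapse groups. The data set work.iris_idx, the column idx, and the template name indexmatrix are hypothetical names introduced for this example. Each group value receives a single index, as the interaction above requires.

```sas
/* Sketch: map Setosa to GraphData1 and the other two species    */
/* to GraphData2. "idx" and "iris_idx" are hypothetical names.   */
data work.iris_idx;
  set sashelp.iris;
  if upcase(species) = "SETOSA" then idx = 1;
  else idx = 2;  /* Versicolor and Virginica share one index */
run;

proc template;
  define statgraph indexmatrix;
    begingraph;
      entrytitle "Species Remapped with INDEX=";
      layout gridded;
        /* INDEX= takes effect only because GROUP= is specified */
        scatterplotmatrix sepallength sepalwidth petallength /
          group=species index=idx name="matrix";
        discretelegend "matrix";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=work.iris_idx template=indexmatrix;
run;
```

With this mapping, the two species that share an index should be drawn with the same color and symbol.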
INSET= (info-options)
specifies what information is displayed in an inset. Insets appear in all cells of the matrix except the diagonal and are displayed as a small table of name-value pairs.
Default: no default
The following info-options are available:
NOBS
total number of observations where both the X and Y variables have nonmissing values. If the FREQ= option is used, this number is adjusted accordingly. The value of NOBS can be further adjusted by the use of the NOMISS=, WEIGHT=, and EXCLNPWGT= suboptions of the CORROPTS= option.
PEARSON
the Pearson product-moment correlation. The computation of the correlation is affected by the FREQ= and CORROPTS= options. The correlation is not computed separately for each group value when GROUP= is used.
PEARSONPVAL
the probability value for the Pearson product-moment correlation.
The location and appearance of the inset is controlled by the INSETOPTS= option.
Discussion: A typical inset looks like this:
   N       150
   r       0.96287
   p(r)    <.0001
In this example, NOBS is represented by N, PEARSON is represented by r, and PEARSONPVAL is represented by p(r).
For statistical and computational details of these options, see PROC CORR in the documentation for Base SAS.
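A minimal sketch of the INSET= option together with CORROPTS= is shown here. It assumes the SASHELP.IRIS sample data set; the template name insetmatrix is hypothetical.

```sas
/* Sketch: show N, r, and p(r) in each off-diagonal cell.        */
/* "insetmatrix" is a hypothetical template name.                */
proc template;
  define statgraph insetmatrix;
    begingraph;
      entrytitle "Pairwise Pearson Correlations";
      layout gridded;
        scatterplotmatrix sepallength sepalwidth petallength /
          inset=(nobs pearson pearsonpval)
          /* CORROPTS= has an effect only because INSET= is used */
          corropts=(nomiss=true vardef=df);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.iris template=insetmatrix;
run;
```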
INSETOPTS = (appearance-options)
specifies location and appearance options for the inset information.
The appearance-options can be any one or more of the settings that follow. The options must be enclosed in parentheses, and each option is specified as a name = value pair.
AUTOALIGN=NONE | AUTO | (location-list)
specifies whether the inset is automatically aligned within the layout.
Default: NONE
NONE Do not automatically align the inset. The inset’s position is therefore set by the HALIGN= and VALIGN= appearance-options.
AUTO Attempt to center this inset in the area that is farthest from any surrounding markers. Data cells might have different inset placements.
(location-list) Restrict this inset’s possible locations to those locations in the specified location-list, and use the location-list position that least collides with the data cell’s other graphics features. The location-list is blank-separated and can contain any of these locations: TOPLEFT TOP TOPRIGHT LEFT CENTER RIGHT BOTTOMLEFT BOTTOM BOTTOMRIGHT. Example: AUTOALIGN = (TOPRIGHT TOPLEFT)
Interaction: When AUTOALIGN=AUTO or AUTOALIGN=(location-list), the HALIGN= and VALIGN= appearance-options are ignored.
BACKGROUNDCOLOR= style-reference | color
specifies the color of the inset background.
Default: GraphWalls:Color style reference
style-reference a reference of the form style-element : style-attribute. Only the style-attribute named COLOR is used.
BORDER= boolean
specifies whether a border is displayed around the inset.
Default: FALSE
HALIGN=LEFT | CENTER | RIGHT
specifies the horizontal alignment of the inset.
Default: LEFT
Interaction: This option is ignored unless AUTOALIGN=NONE.
OPAQUE= boolean
specifies whether the inset background is opaque (TRUE) or transparent (FALSE).
Default: FALSE
Interaction: When this option is set to FALSE, the background color is not used.
TEXTATTRS=style-element | style-element (text-options) | (text-options)
specifies the text properties of the entire inset. See General Syntax for Attribute Options for the syntax on using a style-element and Text Options for available text-options.
Default: The GraphDataText style element.
TITLE= "string"
specifies a title for the inset. The title is added at the top of the inset and spans the full inset width.
Default: no default, and space is not reserved for the title when it is not set
Tip: Text properties for the title string can be set with TITLEATTRS=.
TITLEATTRS=style-element | style-element (text-options) | (text-options)
specifies the text properties of the inset’s title string. See General Syntax for Attribute Options for the syntax on using a style-element and Text Options for available text-options.
Default: The GraphValueText style element.
VALIGN=TOP | CENTER | BOTTOM
specifies the vertical alignment of the inset.
Default: TOP
Interaction: This option is ignored unless AUTOALIGN=NONE.
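The appearance-options above can be combined as in the following sketch. The color value, the title text, and the template name insetoptsmatrix are assumptions chosen for illustration.

```sas
/* Sketch: a bordered, opaque inset restricted to two corners.   */
/* "insetoptsmatrix" is a hypothetical template name.            */
proc template;
  define statgraph insetoptsmatrix;
    begingraph;
      layout gridded;
        scatterplotmatrix sepallength sepalwidth petallength /
          inset=(nobs pearson)
          insetopts=(autoalign=(topright topleft)  /* HALIGN=/VALIGN= not needed */
                     border=true
                     opaque=true                   /* required for the background color */
                     backgroundcolor=cxFFFFE0
                     title="Correlation");
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.iris template=insetoptsmatrix;
run;
```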
MARKERATTRS=style-element | style-element (marker-options) | (marker-options)
specifies the attributes of the data markers. See General Syntax for Attribute Options for the syntax on using a style-element and Marker Options for available marker-options.
Default:
  • For non-grouped data, the GraphDataDefault style element.
  • For grouped data, the MarkerSymbol and ContrastColor attributes of the GraphData1–GraphDataN style elements, and the GraphDataDefault:MarkerSize style reference.
Interaction: If the MARKERCOLORGRADIENT= option is specified, this option’s COLOR= setting is ignored.
Interaction: If the MARKERCHARACTER= option is specified, this option’s SYMBOL= and WEIGHT= settings are ignored.
MARKERCHARACTER=column | expression
specifies a column that defines strings to be used instead of marker symbols.
Default: no default
Interaction: This option overrides the DATALABEL= option.
Interaction: If the GROUP= option is also used, color is displayed for a DISCRETE legend, but the character is not displayed in the legend.
If the GROUP= option is also specified, the same colors are applied to the text strings as would have been applied to markers.
If a numeric column is used, its values are converted to strings using the format associated with the column or BEST6. if no format is defined.
Each string is centered horizontally and vertically at the data point. The data point positions are not adjusted to prevent text overlap.
MARKERCHARACTERATTRS=style-element | style-element (text-options) | (text-options)
specifies the color and font attributes of the marker characters. See General Syntax for Attribute Options for the syntax on using a style-element and Text Options for available text-options.
Default:
  • For non-grouped data, the GraphDataText style element.
  • For grouped data, GraphData1:ContrastColor–GraphDataN:ContrastColor style references.
Interaction: For this option to take effect, the MARKERCHARACTER= option must also be used.
When the GROUP= option is also specified, each distinct group value might be represented by a different color (depending on the ODS style setting or the setting on the INDEX= option). The marker character that is associated with the group is assigned the group color. This option’s COLOR= suboption can be used to specify a single color for all marker characters in a graph, without affecting items that have the group color, such as error bars and marker symbols.
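As a sketch of MARKERCHARACTER= and MARKERCHARACTERATTRS= used together, the following program draws the value of the Sex column at each data point instead of a marker symbol. It assumes the SASHELP.CLASS sample data set; the template name charmatrix is hypothetical.

```sas
/* Sketch: text strings in place of marker symbols.              */
/* "charmatrix" is a hypothetical template name.                 */
proc template;
  define statgraph charmatrix;
    begingraph;
      entrytitle "Height and Weight, Labeled by Sex";
      layout gridded;
        scatterplotmatrix height weight /
          markercharacter=sex
          markercharacterattrs=(size=9pt weight=bold);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=charmatrix;
run;
```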
MARKERCOLORGRADIENT=numeric-column | range-attr-var | expression
specifies the column that is used to map marker colors to a continuous gradient.
range-attr-var
specifies a range attribute variable that is defined in a RANGEATTRVAR statement.
Restriction: A range attribute variable specification must be a direct reference to the attribute variable. It cannot be set as a dynamic variable.
Tip: The marker colors are derived from the RANGEALTCOLOR= or RANGEALTCOLORMODEL= option in the RANGEATTRMAP block RANGE statements.
Default: ThreeColorAltRamp style element
Restriction: To display a legend with this option in effect, you must use a CONTINUOUSLEGEND statement, not a DISCRETELEGEND statement.
Interaction: This option overrides the COLOR= setting of the MARKERATTRS= or MARKERCHARACTERATTRS= option.
Interaction: The DATALABELATTRS= option overrides the gradient colors specified by this option for the data labels.
Tip: This option can be used to add a second response variable to an analysis. For example, in an analysis of weight by height, an age column might be specified by the MARKERCOLORGRADIENT= option so that the change in the gradient color of the markers reflects the change in age.
Tip: The COLORMODEL= option allows a different color range to be used.
Tip: If the MARKERCHARACTER= option is also used, the gradients that would be applied to the markers are applied to the text strings.
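The gradient options can be sketched together as follows. Age serves as the second response variable, as in the tip above. The template name gradientmatrix is hypothetical, and ThreeColorRamp is used here only as an example of an alternative style element for COLORMODEL=.

```sas
/* Sketch: color markers by Age and show a continuous legend.    */
/* "gradientmatrix" is a hypothetical template name.             */
proc template;
  define statgraph gradientmatrix;
    begingraph;
      entrytitle "Height and Weight, Colored by Age";
      layout gridded;
        scatterplotmatrix height weight /
          markercolorgradient=age
          colormodel=ThreeColorRamp      /* replaces the default ramp */
          reversecolormodel=true         /* swap start and end colors */
          markerattrs=(symbol=circlefilled)
          name="matrix";
        /* A gradient requires CONTINUOUSLEGEND, not DISCRETELEGEND */
        continuouslegend "matrix" / title="Age";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=gradientmatrix;
run;
```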
NAME="string"
assigns a name to a plot statement for reference in other template statements.
Default: no default
Restriction: The string is case sensitive, cannot contain spaces, and must define a unique name within the template.
The specified name is used primarily in legend statements to coordinate the use of colors and marker symbols between the graph and the legend.
REVERSECOLORMODEL=boolean
specifies whether to reverse a gradient (color ramp) defined by the COLORMODEL= option.
Default: FALSE
ROLENAME=(role-name-list)
specifies user-defined roles that can be used to display information in the tooltips.
Default: no user-defined roles
(role-name-list)
a blank-separated list of rolename = column pairs.
The following example assigns column ID to the user-defined role TIP1, and columns AGE, HEIGHT, WEIGHT to the user-defined roles TIP2, TIP3, and TIP4.
ROLENAME=(TIP1=ID TIP2=AGE TIP3=HEIGHT TIP4=WEIGHT)
Requirement: The role names that you choose must be unique and different from the pre-defined roles X, Y, DATALABEL, MARKERCHARACTER, MARKERCOLORGRADIENT, GROUP, and INDEX.
Interaction: For this option to take effect, the TIP= option must also be used.
This option provides a way to add to the data columns that appear in tooltips specified by the TIP= option.
ROWVARS = (column-list)
specifies a secondary list of columns to be paired with the required column list that is specified by the SCATTERPLOTMATRIX statement.
Default: no default
Interaction: When this option is specified, the DIAGONAL= option is ignored.
The labels for the variables appear vertically on the left side of the matrix.
START=TOPLEFT | BOTTOMLEFT
specifies whether to start populating the matrix from the top left or bottom left corner.
Default: TOPLEFT
TIP=(role-list)
specifies the information to display when the cursor is positioned over the scatter points. If this option is used, it replaces all the information displayed by default. Roles for columns that do not contribute to the scatter plot can be specified along with roles that do.
Default: The columns assigned to these roles are automatically included in the tooltip information: current X, current Y, DATALABEL, MARKERCHARACTER, MARKERCOLORGRADIENT, and GROUP.
(role-list)
an ordered, blank-separated list of unique SCATTERPLOTMATRIX and user-defined roles. SCATTERPLOTMATRIX roles include: X, Y, GROUP, DATALABEL, MARKERCHARACTER, and MARKERCOLORGRADIENT.
User-defined roles are defined with the ROLENAME= option.
The following example displays tooltips for the columns assigned to the roles TIP1, TIP2, TIP3, and TIP4.
ROLENAME=(TIP1=ID TIP2=AGE TIP3=HEIGHT TIP4=WEIGHT)
TIP= (TIP1 TIP2 TIP3 TIP4)
Requirement: To generate tooltips, you must include an ODS GRAPHICS ON statement that has the IMAGEMAP option specified, and write the graphs to the ODS HTML destination.
Interaction: The labels and formats for the TIP variables can be controlled with the TIPLABEL= and TIPFORMAT= options.
TIPFORMAT=(role-format-list)
specifies display formats for tip columns.
Default: The column format of the variable assigned to the role or BEST6. if no format is assigned to a numeric column.
(role-format-list)
a list of rolename = format pairs separated by blanks.
The following example applies the 4.1 format to the column that is assigned to the TIP3 role.
ROLENAME=(TIP1=ID TIP2=AGE TIP3=HEIGHT TIP4=WEIGHT)
TIP=(TIP1 TIP2 TIP3 TIP4)
TIPFORMAT=(TIP3=4.1)
Requirement: Columns must be assigned to the roles for this option to have any effect. See the ROLENAME= option.
This option provides a way to control the formats of columns that appear in tooltips. Only the roles that appear in the TIP= option are used.
TIPLABEL=(role-label-list)
specifies display labels for tip columns.
Default: The column label or column name of the variable assigned to the role.
(role-label-list)
a list of rolename = "string" pairs separated by blanks.
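The following example assigns display labels to the columns that are assigned to the TIP3 and TIP4 roles.
ROLENAME=(TIP1=ID TIP2=AGE TIP3=HEIGHT TIP4=WEIGHT)
TIP=(TIP1 TIP2 TIP3 TIP4)
TIPLABEL=(TIP3="Height in Inches"
          TIP4="Weight in Pounds")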
Requirement: Columns must be assigned to the roles for this option to have any effect. See the ROLENAME= option.
This option provides a way to control the labels of columns that appear in tooltips. Only the roles that appear in the TIP= option are used.
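The tooltip options can be combined in one program, sketched below. The output file name matrix_tips.html and the template name tipmatrix are hypothetical. The example assumes the SASHELP.CLASS sample data set and an environment in which the ODS HTML destination can write to the current directory.

```sas
/* Sketch: custom tooltips with user-defined roles.              */
/* "tipmatrix" and "matrix_tips.html" are hypothetical names.    */
proc template;
  define statgraph tipmatrix;
    begingraph;
      entrytitle "Height and Weight with Custom Tooltips";
      layout gridded;
        scatterplotmatrix height weight /
          rolename=(tip1=name tip2=age tip3=height tip4=weight)
          tip=(tip1 tip2 tip3 tip4)          /* replaces the default tip */
          tipformat=(tip3=4.1)
          tiplabel=(tip3="Height in Inches"
                    tip4="Weight in Pounds");
      endlayout;
    endgraph;
  end;
run;

/* Tooltips require IMAGEMAP and the ODS HTML destination */
ods graphics on / imagemap;
ods html file="matrix_tips.html";

proc sgrender data=sashelp.class template=tipmatrix;
run;

ods html close;
ods graphics / reset;
```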
WALLCOLOR=style-reference | color
specifies the fill color of the plot wall area.
Default: The GraphWalls:Color style reference.
style-reference
a reference in the form style-element:style-attribute. Only the style-attribute named COLOR is used.
Interaction: This option is ignored if WALLDISPLAY=NONE or WALLDISPLAY=(OUTLINE).
WALLDISPLAY=STANDARD | ALL | NONE | (display-options)
specifies whether the plot’s wall and wall outline are displayed.
Default: STANDARD
STANDARD
displays a filled wall. The setting of the FRAMEBORDER= ON | OFF attribute of the GraphWalls style element determines whether the wall outline is displayed.
ALL
displays a filled, outlined wall.
NONE
displays no wall, no wall outline.
(display-options)
a list of options, enclosed in parentheses, that must include at least one of the following:
OUTLINE displays the wall outline.
FILL displays a filled wall area.
Use the WALLCOLOR= option to control the fill color of the wall.
The appearance attributes of the wall outline are set by the GraphAxisLine style element.
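A brief sketch of the wall options follows. It displays a filled wall without an outline and sets the fill color explicitly. The color value and the template name wallmatrix are assumptions for illustration.

```sas
/* Sketch: filled wall, no outline, custom fill color.           */
/* "wallmatrix" is a hypothetical template name.                 */
proc template;
  define statgraph wallmatrix;
    begingraph;
      layout gridded;
        scatterplotmatrix height weight age /
          walldisplay=(fill)     /* WALLCOLOR= applies because the wall is filled */
          wallcolor=cxF0F0F0;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=wallmatrix;
run;
```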