GMAP Procedure

CHORO Statement

Creates two-dimensional maps in which values of the specified response variables are represented by varying patterns and colors.
Requirement: At least one response variable is required. The ID statement must be used in conjunction with the CHORO statement.
Global statements: FOOTNOTE, NOTE, LEGEND, PATTERN, TITLE

Syntax

CHORO response-variable(s) </ option(s)>;

Summary of Optional Arguments

Appearance options
ANNOTATE=Annotate-data-set
specifies a data set to annotate onto maps that are produced by the CHORO statement.
CDEFAULT=empty-area-fill-color
fills empty map areas in the specified color.
CEMPTY=empty-area-outline-color
outlines empty map areas in the specified color.
COUTLINE=area-outline-color | SAME
outlines non-empty map areas in the specified color.
STRETCH
stretches map extents to cover all available space in the device.
UNIFORM
causes the same legend and coloring to be used for all maps produced by the procedure instead of being calculated within each BY group for each map.
WOUTLINE=area-outline-width
specifies the width of all map area outlines, in pixels.
XSIZE=map-width <units>
YSIZE=map-height <units>
specify the physical dimensions of the map.
Description options
DESCRIPTION='description'
specifies a description of the output.
NAME='name'
specifies the name of the GRSEG catalog entry and the name of the graphics output file, if one is created.
Legend options
CTEXT=text-color
specifies a color for the text in the legend.
LEGEND=LEGEND<1...99>
assigns the specified LEGEND statement that is to be applied to the map.
NOLEGEND
suppresses the legend.
Mapping options
DISCRETE
generates a separate response level (color and surface pattern) for each different value of the formatted response variable.
LEVELS=number-of-response-levels | ALL
specifies the number of response levels to be graphed for the response variable.
MIDPOINTS=value-list | OLD
specifies the response levels for the range of response values that are represented by each level (pattern and color combination).
MISSING
accepts a missing value as a valid level for the response variable.
PERCENT
causes GMAP to collect all response values (or their statistic) and chart each region as a percentage of the whole.
RANGE
displays value ranges in the legend.
STATFMT=format-specification
overrides PERCENT8.2, the default GMAP format for percentages.
STATISTIC=FIRST | SUM | FREQUENCY | MEAN
specifies the statistic for GMAP to chart.
ODS options
HTML_LEGEND=variable
identifies the variable in the input data set whose values create links.
HTML=variable
identifies the variable in the input data set whose values create links or data tips or both.
URL=character-variable
specifies a character variable whose values are URLs.

Required Argument

response-variable(s)
specifies one or more variables in the response data set. Each response variable produces a separate map. All variables must be in the input data set. Multiple response variables are separated with blanks.
Missing values for the response variable are not considered valid response values unless you use the MISSING option in the CHORO statement.
Response variables can be either numeric or character in type. Numeric response variables are normally grouped into ranges, or response levels, as determined by default, or by the MIDPOINTS= or LEVELS=number-of-response-levels options. Each response level is assigned a different combination of pattern and color. With the LEVELS=ALL option, numeric or character response variables are assigned unique response levels, as are numeric variables when the DISCRETE option is specified. The LEVELS=number-of-response-levels option is ignored when either the DISCRETE or the MIDPOINTS= option is used.
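The required pieces can be sketched as follows. This is a minimal illustration, assuming that the MAPS.US map data set supplied with SAS/GRAPH is available and that it is keyed by the numeric FIPS variable STATE. The WORK.SALES data set and its REVENUE variable are hypothetical.

```sas
/* Hypothetical response data: one row per state FIPS code. */
data work.sales;
   input state revenue;
   datalines;
37 120
45 80
51 150
;
run;

proc gmap data=work.sales map=maps.us;
   id state;        /* ID variable must be present in both data sets */
   choro revenue;   /* each response variable listed produces one map */
run;
quit;
```

In this sketch only the map areas that have a matching row in WORK.SALES receive a response level. Adding the ALL option to the PROC GMAP statement would also draw the unmatched areas as empty map areas.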

Optional Arguments

Options in a CHORO statement affect all graphs that are produced by that statement. You can specify as many options as you want and list them in any order.

ANNOTATE=Annotate-data-set
specifies a data set to annotate onto maps that are produced by the CHORO statement.
CDEFAULT=empty-area-fill-color
fills empty map areas in the specified color. This option affects only map areas that are empty. Empty map areas are generated in choro maps only when there is no response value for a map area and the MISSING option is not used. They are also generated when a map area is omitted from the response data set and the ALL option is included in the PROC GMAP statement.
The default is NONE, which draws the polygon empty, showing the background in the fill area of the polygon.
Alias: CDEF=, DEFCLR=
Restriction: Not supported by Java
See: The CEMPTY= option, the ALL option, and Displaying Map Areas and Response Data
CEMPTY=empty-area-outline-color
outlines empty map areas in the specified color. This option affects only the empty map areas, which are generated in choro maps when either of the following is true:
  • There is no response value for a map area and the MISSING option is not used.
  • A map area is omitted from the response data set and the ALL option is included in the PROC GMAP statement.
The default outline color is the same as the default COUTLINE= color.
Alias: CE=
Restriction: Not supported by Java
COUTLINE=area-outline-color | SAME
outlines non-empty map areas in the specified color. When COUTLINE=area-outline-color and DEVICE=JAVA or ACTIVEX, both empty and non-empty map areas are outlined. The value SAME specifies that the outline color of a map area is the same as the interior pattern color.
The default outline color is determined by the current style. If you specified the NOGSTYLE system option, then the default color is black for Java and ActiveX and the first color in the color list for all other devices.
Alias: CO=
Note: If you specify empty map patterns (VALUE=EMPTY in a PATTERN statement), then you should not change the outline color from the default value SAME to a single color. Otherwise, all the outlines are one color and you cannot distinguish between the empty areas.
CTEXT=text-color
specifies a color for the text in the legend. If you omit the CTEXT= option, a color specification is searched for in this order:
  • the CTEXT= option in a GOPTIONS statement.
  • the default, the text color that is specified in the current style.
  • If you specified the NOGSTYLE system option, then the default color is black for Java and ActiveX and the first color in the color list for all other devices.
The CTEXT= color specification is overridden if you also use the COLOR= suboption of a LABEL= or VALUE= option in a LEGEND definition that is assigned to the map legend. The COLOR= suboption determines the color of the legend label or the color of the legend value descriptions, respectively.
Alias: CT=
DESCRIPTION='description'
specifies a description of the output. The maximum length for description is 256 characters. The description does not appear in the output. The descriptive text is shown in each of the following:
  • the chart description for Web output (depending on the device driver). See Chart Descriptions for Web Presentations for more information.
  • the Table of Contents that is generated when you use the CONTENTS= option on an ODS HTML statement, assuming that the output is generated while the contents page is open.
  • the description and the properties for the output in the Results window.
  • the description and properties for the catalog entry in the Explorer.
  • the Description field of the PROC GREPLAY window.
The description can include the #BYLINE, #BYVAL, and #BYVAR substitution options, which work as they do when used on TITLE, FOOTNOTE, and NOTE statements. Refer to Substituting BY Line Values in a Text String. The 256-character limit applies before the substitution takes place for these options. Thus, if in the SAS program the entry-description text exceeds 256 characters, it is truncated to 256 characters, and then the substitution is performed.
Alias: DES=
Default: CHOROPLETH MAP OF variable-name
DISCRETE
generates a separate response level (color and surface pattern) for each different value of the formatted response variable. The LEVELS=number-of-response-levels option is ignored when you use the DISCRETE option.
If you specify the DISCRETE option, then distinct, non-continuous colors are used for the response values. If you specify the LEVELS= option, then a color ramp is used to assign each response value a continuous color scheme.
Note: If the data does not contain a value in a particular range of the format, that formatted range is not displayed in the legend.
HTML=variable
identifies the variable in the input data set whose values create links or data tips or both. The variable values are either links or data tips or both that are created in the HTML file generated by the ODS statement. The links are URLs pointing to Web pages to display when the user clicks (drills down) on elements in the graph. Data tips are detailed information or data values that are displayed as pop-up text when a mouse pointer is positioned over elements in the graph.
HTML_LEGEND=variable
identifies the variable in the input data set whose values create links. Input data set variable values create links that are associated with a legend value and point to the URL to display when the user clicks (drills down) on the value.
Restriction: Not supported by Java and ActiveX
LEGEND=LEGEND<1...99>
assigns the specified LEGEND statement that is to be applied to the map. The LEGEND= option is ignored if the specified LEGEND definition is not currently in effect. In the GMAP procedure, the CHORO statement produces a legend by default unless you specify the NOLEGEND option. If you use the SHAPE= option in a LEGEND statement, then only the value BAR is valid. Most of the LEGEND options described in LEGEND Statement are supported by both Java and ActiveX. If a LEGEND option is not supported by Java or ActiveX, it is noted in the LEGEND option definition.
Restriction: Partially supported by Java and ActiveX
LEVELS=number-of-response-levels | ALL
specifies the number of response levels to be graphed for the response variable. If you specify LEVELS=ALL, then all unique numeric or character response variable values are graphed.
Each response level is assigned a different surface pattern and color combination.
If you specify the LEVELS= option, then a color ramp is used to assign each response value a continuous color scheme. The response values are assigned lighter and darker values of a color scheme to express lower and higher response values. If you specify the DISCRETE option, then distinct, non-continuous colors are used for the response values.
If neither the LEVELS= option nor the DISCRETE option is used, then the GMAP procedure determines the number of response levels by using the formula FLOOR(1+3.3 log(n)), where n is the number of response variable values.
By default, an equal-distribution (quantizing) algorithm is used to determine each level.
When MIDPOINTS=OLD is used with the LEVELS= option, default midpoints are generated using the Nelder algorithm (Applied Statistics 25:94–7, 1976).
Restriction: The LEVELS=number-of-response-levels option is ignored when you use the DISCRETE or MIDPOINTS=value-list option. It is also ignored when the response variables are character.
Note: If you specified the NOGSTYLE system option, then noncontinuous colors are used by default.
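The contrast between LEVELS= and DISCRETE can be sketched with two CHORO statements in one procedure step. This is an illustration only: it assumes that the MAPS.US map data set (keyed by the numeric FIPS variable STATE) is available, and the WORK.SALES data set and REVENUE variable are hypothetical.

```sas
/* Hypothetical response data: one row per state FIPS code. */
data work.sales;
   input state revenue;
   datalines;
13 40
37 120
45 80
47 95
51 150
;
run;

proc gmap data=work.sales map=maps.us;
   id state;
   choro revenue / levels=3;   /* request three response levels for REVENUE */
   choro revenue / discrete;   /* one level per distinct formatted value */
run;
quit;
```

The first map groups the five values into three levels on a color ramp. The second map assigns each of the five distinct values its own level, and any LEVELS= value on that statement would be ignored.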
MIDPOINTS=value-list | OLD
specifies the response levels for the range of response values that are represented by each level (pattern and color combination).
For numeric response variables, the value-list argument is either an explicit list of values, a starting and an ending value with an interval increment, or a combination of both forms:
  • n <...n>
  • n TO n <BY increment >
  • n <...n> TO n <BY increment > n <...n>
By default, the increment value is 1. You can specify discrete numeric values in any order. In all forms, n can be separated by blanks or commas. For example:
  • midpoints=(2 4 6)
  • midpoints=(2,4,6)
  • midpoints=(2 to 10 by 2)
If a numeric variable has an associated format, the specified values must be the unformatted values. With numeric response values, DEVICE=JAVA uses only midpoints that fall in the range of the data being used. Thus, if your data ranges from 30 to 80, but midpoints are specified at 25, 50, 75, and 100, only 50 and 75 are used.
For character response variables, value-list is a list of unique character values enclosed in quotation marks and separated by blanks:
  • 'value-1' <...'value-n'>
For example: midpoints="Midwest" "Northeast" "Northwest"
Specify the values in any order. If a character variable has an associated format, the specified values must be the formatted values. Character response values specified with the MIDPOINTS= option are not supported by DEVICE=JAVA.
You can selectively exclude some response variable values from the map, as shown here: midpoints="Midwest"
The only observations that are shown on the map are those observations for which the response variable exactly matches one of the values that are listed in the MIDPOINTS= option. As a result, observations might be excluded inadvertently if values in the list are misspelled or if the case does not match exactly.
Specifying MIDPOINTS=OLD generates default midpoints using the Nelder algorithm (Applied Statistics 25:94–7, 1976).
Restriction: Partially supported by Java
See: The RANGE option
MISSING
accepts a missing value as a valid level for the response variable.
NAME='name'
specifies the name of the GRSEG catalog entry and the name of the graphics output file, if one is created. The name can be up to 256 characters long, but the GRSEG name is truncated to eight characters. Uppercase characters are converted to lowercase, and periods are converted to underscores. The default GRSEG name is GMAP. If the name duplicates an existing name, then SAS/GRAPH adds a number to the name to create a unique name (for example, GMAP1).
If the name specified is exactly eight characters long, then the last character of the image output filename is replaced with a number. For example, myimages is changed to myimage1.
NOLEGEND
suppresses the legend.
PERCENT
causes GMAP to collect all response values (or their statistic) and chart each region as a percentage of the whole. You can use the STATISTIC= option to change how the percentage is calculated—whether as a percentage of the SUM, FREQUENCY, or MEAN. If you do not use the STATISTIC= option, then STATISTIC=FIRST is assumed—the response variable of only the first observation of each region is counted. If the response variable is a text field, then STATISTIC=FREQUENCY is used, even if you specify a different value for the STATISTIC= option.
RANGE
causes GMAP to display, in the legend, the starting value and ending value of the range around each midpoint specified with the MIDPOINTS= option (instead of displaying just the midpoints). For example, if MIDPOINTS=15 25 35, then the legend could show 10-20, 20-30, 30-40.
Restrictions: The MIDPOINTS= option must be specified for the RANGE option to have any effect.
Not supported by ActiveX
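The interaction between MIDPOINTS= and RANGE can be sketched as follows. The example assumes that the MAPS.US map data set (keyed by the numeric FIPS variable STATE) is available. The WORK.SALES data set, the REVENUE variable, and the chosen midpoints are hypothetical.

```sas
/* Hypothetical response data: one row per state FIPS code. */
data work.sales;
   input state revenue;
   datalines;
13 40
37 120
45 80
47 95
51 150
;
run;

proc gmap data=work.sales map=maps.us;
   id state;
   /* Three explicit midpoints; RANGE asks the legend to show the */
   /* start and end of the range around each midpoint.            */
   choro revenue / midpoints=(50 to 150 by 50) range;
run;
quit;
```

Without RANGE, the legend would list only the midpoints 50, 100, and 150. Because LEVELS= is ignored when MIDPOINTS= is used, there is no need to specify it here.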

STATFMT=format-specification
overrides PERCENT8.2, the default GMAP format for percentages. Use this option to format calculated values. The STATFMT= option is typically used with the STATISTIC=FREQUENCY option or the PERCENT option.
Alias: SFMT=, SFORMAT=, STATFORMAT=
STATISTIC=FIRST | SUM | FREQUENCY | MEAN
specifies the statistic for GMAP to chart. For character variables, FREQUENCY is the only allowed value—any other value is changed to FREQUENCY and a warning is issued. The frequency of a variable does not include missing values unless the MISSING option is specified.
FIRST
GMAP matches the first observation from the DATA= data set and charts the response value from this observation only. This is the default. If more rows exist that are not processed, a warning is issued to the log.
SUM
All observations matching a given ID value are added together and the summed value is charted.
FREQUENCY
A count of all rows with nonmissing values is charted unless you specify the MISSING option.
MEAN
All observations matching a given ID value are added together and then divided by the number of nonmissing observations matched. This value is then charted unless you specify the MISSING option.
Alias: STAT=
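The effect of the STATISTIC= values is easiest to see when the response data set contains several observations per ID value. The following is a sketch under stated assumptions: the MAPS.US map data set (keyed by the numeric FIPS variable STATE) is assumed to be available, and the WORK.ORDERS data set and AMOUNT variable are hypothetical.

```sas
/* Hypothetical response data: several rows per state FIPS code. */
data work.orders;
   input state amount;
   datalines;
37 100
37 20
45 80
51 90
51 60
;
run;

proc gmap data=work.orders map=maps.us;
   id state;
   choro amount / statistic=sum;         /* charts 120, 80, and 150 */
   choro amount / statistic=frequency;   /* charts 2, 1, and 2 */
   /* Charts each state's sum as a share of the overall total. */
   choro amount / statistic=sum percent statfmt=percent8.1;
run;
quit;
```

If the STATISTIC= option were omitted, the default STATISTIC=FIRST would chart only the first observation for each state (100, 80, and 90) and a warning about the unprocessed rows would be written to the log.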
STRETCH
stretches map extents to cover all available space in the device. This might cause the map to be distorted. When this option is applied to the PROC GMAP statement, it applies to all statements. If applied to a single statement, it applies only to that statement.
Alias: STRETCHTOFIT, STR2FIT
Restriction: Not supported by Java and ActiveX
UNIFORM
causes the same legend and coloring to be used for all maps produced by the procedure instead of being calculated within each BY group for each map. The UNIFORM option prescans the data to generate a categorization across all the data, regardless of BY grouping, and applies that categorization to all maps in the BY group. This results in a static legend and color distribution across all maps such that a single value always has the same color in multiple maps.
When specified on a PROC GMAP statement, UNIFORM applies to all AREA, BLOCK, CHORO, and PRISM statements included within the GMAP run-group.
When omitted from the PROC GMAP statement, and specified on an individual AREA, BLOCK, CHORO, or PRISM statement, UNIFORM applies only to the maps produced by that statement.
Restriction: Not supported by Java
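A sketch of UNIFORM with BY-group processing follows. It assumes that the MAPS.US map data set (keyed by the numeric FIPS variable STATE) is available. The WORK.SALES_BY_YEAR data set and its YEAR and REVENUE variables are hypothetical.

```sas
/* Hypothetical response data: one row per year and state FIPS code. */
data work.sales_by_year;
   input year state revenue;
   datalines;
2020 37 100
2020 45 60
2020 51 90
2021 37 140
2021 45 80
2021 51 150
;
run;

/* BY-group processing expects the data to be sorted by the BY variable. */
proc sort data=work.sales_by_year;
   by year;
run;

/* UNIFORM on the PROC GMAP statement applies to every map in the run-group. */
proc gmap data=work.sales_by_year map=maps.us uniform;
   by year;
   id state;
   choro revenue / levels=3;
run;
quit;
```

With UNIFORM, the response levels are computed across both years, so a given revenue value receives the same color in the 2020 map and the 2021 map. Without it, each year would get its own legend and color assignment.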
URL=character-variable
specifies a character variable whose values are URLs. The variable values are URLs for Web pages to display when the user clicks (drills down) on elements in the graph.
Restriction: This option affects graphics output that is created through the ODS HTML destination only.
Interaction: If you specify both the HTML= and the URL= options, then the URL= option is ignored.
Example: GIF Output with Drill-Down Links
WOUTLINE=area-outline-width
specifies the width of all map area outlines, in pixels.
Default: 1
XSIZE=map-width <units>
YSIZE=map-height <units>
specify the physical dimensions of the map. By default, the map uses the entire procedure output area.
Valid units are CELLS (character cells), CM (centimeters), IN (inches), or PCT (percentage of the graphics output area). The default unit is CELLS.
If you specify dimensions that are greater than those of the procedure output area, the map is drawn using the default size.
If you specify only one of the XSIZE= or YSIZE= options, the GMAP procedure scales the dimension for the unspecified option in order to retain the original shape of the map.
Restriction: Not supported by Java and ActiveX

Details

Description

The CHORO statement specifies the variable or variables that contain the data represented on the map by patterns that fill the map areas. This statement automatically
  • determines the midpoints
  • assigns patterns to the map areas
You can use statement options to enhance the appearance of the map (for example, by selecting the colors and patterns that fill the map areas). Other statement options control the selection of ranges for the response variable.
In addition, you can use global statements to modify the map area patterns and legend, as well as add titles and footnotes to the map. You can also use an Annotate data set to enhance the map.