|The GMAP Procedure|
The GMAP procedure uses map data sets and response data sets. These data sets must contain the required variables; otherwise, the procedure stops and issues an error message. The GMAP procedure can take as input a map data set and a response data set, provided that both data sets contain the same ID variable. Alternatively, you can use a single data set as input if it contains either the map data or a variable that references a map data set.
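The following is a minimal sketch of the two-data-set case, not a definitive program. The response data set WORK.SALES and its SALES variable are hypothetical. The sketch assumes that the ID variable STATE in MAPS.US holds numeric state FIPS codes, so the response data set uses the same variable name and values.

```sas
/* Hypothetical response data: one observation per map area. */
/* STATE holds FIPS codes (12=FL, 13=GA, 37=NC, 45=SC).      */
data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
45 1500
;
run;

/* MAP= names the map data set, DATA= names the response data set. */
/* The ID variable must exist in both data sets.                   */
proc gmap map=maps.us data=work.sales;
   id state;
   choro sales;
run;
quit;
```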
|About Map Data Sets|
There are two types of data sets that are provided with SAS/GRAPH for mapping: traditional map data sets and feature tables. Much of the map data that is delivered with SAS/GRAPH is available in both the traditional map data set and feature table formats.
SAS/GRAPH software includes a number of predefined map data sets. These data sets are described in The METAMAPS Data Set.
|About Traditional Map Data Sets|
a numeric variable named X that contains the horizontal coordinates of the boundary points. The value of this variable could be either projected or unprojected. If unprojected, X represents longitude.
The X and Y variable values in the traditional map data set do not have to be in any specific units. They are rescaled by the GMAP procedure based on the minimum and maximum values in the data set. The minimum X and Y values are in the lower-left corner of the map, and the maximum X and Y values are in the upper-right corner.
Map data sets in which the X and Y variables contain longitude and latitude should be projected before you use them with PROC GMAP. See The GPROJECT Procedure for details.
The traditional map data set can also contain an optional variable named SEGMENT to identify map areas that comprise noncontiguous polygons. Each unique value of the SEGMENT variable within a single map area defines a distinct polygon. If the SEGMENT variable is not present, each map area is drawn as a separate closed polygon that indicates a single segment.
The observations for each segment of a map area in the map data set must occur in the order in which the points are to be joined. The GMAP procedure forms map area outlines by connecting the boundary points of each segment in the order in which they appear in the data set, eventually joining the last point to the first point to complete the polygon. All the segments for each ID value must be contiguous within the map data set.
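As an illustration of the ordering and SEGMENT rules, here is a small sketch that builds a traditional map data set by hand. All data set names, variable names other than X, Y, and SEGMENT, and coordinate values are made up for this example. Map area 1 consists of two noncontiguous squares, and map area 2 is a single square.

```sas
/* Boundary points are listed in the order in which they are joined. */
/* All observations for one REGION value are contiguous.             */
data work.mymap;
   input region segment x y;
   datalines;
1 1 0 0
1 1 0 1
1 1 1 1
1 1 1 0
1 2 2 0
1 2 2 1
1 2 3 1
1 2 3 0
2 1 4 0
2 1 4 1
2 1 5 1
2 1 5 0
;
run;

/* Hypothetical response values, one per map area. */
data work.myresp;
   input region value;
   datalines;
1 10
2 20
;
run;

proc gmap map=work.mymap data=work.myresp;
   id region;
   choro value / discrete;
run;
quit;
```

Because the X and Y values are rescaled by the procedure, the units chosen here (0 through 5) are arbitrary.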
In addition to the variables described in Required Variables, some of the SAS/GRAPH map data sets also contain the following variables:
Most of the traditional map data sets that are provided with SAS/GRAPH software contain four coordinate variables (X, Y, LONG, and LAT). In that case, X and Y are always projected values, and the SAS/GRAPH procedures use them by default. If you need to use the unprojected values that are contained in the LONG and LAT variables, then do the following tasks:
Rename the LONG and LAT variables to X and Y.
Project the coordinates by using the GPROJECT procedure.
Use the output data set from GPROJECT as your map data set.
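The three tasks can be sketched as follows. This is an illustration under assumptions: it assumes that MAPS.EUROPE contains X, Y, LONG, and LAT and that its ID variable is named ID. Check your copy of the data set with PROC CONTENTS before relying on either assumption. The WORK data set names are arbitrary.

```sas
/* Task 1: drop the projected X and Y, then rename LONG and LAT. */
/* DROP= is applied before RENAME=, so the names do not collide. */
data work.europe_unproj;
   set maps.europe(drop=x y rename=(long=x lat=y));
run;

/* Task 2: project the coordinates. GPROJECT assumes radians by  */
/* default; add the DEGREES option if the values are in degrees. */
proc gproject data=work.europe_unproj out=work.europe_proj;
   id id;
run;

/* Task 3: use the GPROJECT output as the map data set. */
proc gmap map=work.europe_proj data=work.europe_proj;
   id id;
   choro id / discrete nolegend;
run;
quit;
```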
The MAP= value in the GMAP procedure automatically uses X and Y. See Input Map Data Sets that Contain Both Projected and Unprojected Values for more details.
Traditional map data sets that contain X and Y variables but no LONG and LAT variables are usually projected maps. However, a few traditional map data sets for the US and Canada contain X and Y values that are unprojected longitude and latitude. In this case, you need to use the GPROJECT procedure to project the map (see The GPROJECT Procedure).
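For example, MAPS.STATES stores unprojected coordinates in X and Y. A minimal sketch of projecting it before mapping might look like this. The output data set name is arbitrary, and the default projection is used.

```sas
/* Project the unprojected X and Y values in MAPS.STATES. */
proc gproject data=maps.states out=work.states_proj;
   id state;
run;

/* Draw the projected map, using the map itself as response data. */
proc gmap map=work.states_proj data=work.states_proj;
   id state;
   choro state / discrete nolegend;
run;
quit;
```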
Note: You can determine whether a SAS traditional map data set is projected or unprojected by looking at the description of each variable that is displayed when you use the CONTENTS procedure or by browsing the MAPS.METAMAPS data set.
|About Feature Tables|
An alternative to using the traditional map data set is the feature table. While the traditional map data set stores the spatial information across multiple observations, the feature table uses a data arrangement to store a reference to the spatial information in a single variable value. The feature table's data arrangement uses the $GEOREF SAS/GRAPH format.
The $GEOREF format stores spatial information in binary data streams, making it possible to store as a single variable value all the information needed to draw a map area. Thus, the feature tables use only a single observation for each map area, and they treat a field of spatial information just like any other information that can be added to a data set. Each $GEOREF value contains the name of the map data set and the ID variable for that map. The traditional map data set associated with the feature table must be located in the same SAS library as the feature table for GMAP to work correctly.
The names of the feature tables that are supplied by SAS usually end with the number 2. For example, the feature table for MAPS.AFRICA is MAPS.AFRICA2. You can also determine the feature table for your map data set by referring to the MAPS.METAMAPS data set.
To locate the variable that contains the spatial information, run PROC CONTENTS on a feature table. In the Output window, the variable containing the spatial information has $GEOREF as the value in the column labeled Format.
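A short sketch of that check, using the MAPS.AFRICA2 feature table named earlier:

```sas
/* List the variables in the feature table. In the output, look */
/* for the variable whose Format column shows $GEOREF.          */
proc contents data=maps.africa2;
run;
```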
Note: Some feature tables, like MAPS.CANCENS, have more than one $GEOREF format variable.
To combine a feature table with a response data set, first use PROC SORT to sort both data sets by a variable that is present in both of them. Then merge the sorted data sets, either with PROC SQL or with a DATA step MERGE statement whose BY variable is the variable that was used to sort the data sets. In the merged data set, the $GEOREF formatted variable from the feature table becomes the identification variable to use in the GMAP procedure. See Creating a Map Using the Feature Table for more details.
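The sort-and-merge sequence can be sketched as follows. This is an illustration only. The response data set WORK.SALES and its SALES variable are hypothetical, and the sketch assumes that the MAPS.US2 feature table contains a STATE variable to merge on and that _MAP_GEOMETRY_ is its $GEOREF formatted variable for the US map.

```sas
/* Hypothetical response data keyed by state FIPS code. */
data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
;
run;

/* Sort both data sets by the common variable. */
proc sort data=maps.us2 out=work.us2;
   by state;
run;
proc sort data=work.sales;
   by state;
run;

/* Merge, keeping only the map areas that have response data. */
data work.combined;
   merge work.us2(in=infeat) work.sales(in=inresp);
   by state;
   if infeat and inresp;
run;

/* The $GEOREF formatted variable is the ID variable; no MAP= is used. */
proc gmap data=work.combined;
   id _map_geometry_;
   choro sales;
run;
quit;
```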
|The METAMAPS Data Set|
In the MAPS library, there is a data set named METAMAPS, which contains metadata about all of the data sets that are delivered in the library. Among the metadata in MAPS.METAMAPS are the following four variables, which you can use to determine which feature table corresponds to a particular geometry table:
MEMNAME identifies the names of all of the data sets that are delivered in the MAPS library.
MEMCODE indicates whether a data set represents a feature table (F) or a geometry table (G).
Indicates the corresponding feature table for a geometry table. This variable is blank for rows that contain metadata about a feature table.
Indicates the variable, in the feature table, whose values encapsulate the geometry object.
For example, consider the data sets MAPS.ASIA, MAPS.STATES, and MAPS.US. Each of these represents a geometry table, and to locate the corresponding feature tables, you would look in MAPS.METAMAPS to find the rows with the MEMNAME values ASIA, STATES, and US.
On those rows, the MEMCODE values are G, which confirms that the data sets named ASIA, STATES, and US all represent geometry tables. The feature table corresponding to the ASIA data set is the data set ASIA2, which stores the spatial information in the variable CONT95_GEO. The feature tables corresponding to STATES and US are both in the data set US2. The spatial information corresponding to STATES is stored in the variable GEO_STATE, and the spatial information corresponding to US is stored in the variable _MAP_GEOMETRY_.
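One way to look at those rows yourself is to print them. This sketch selects on MEMNAME only and prints every variable, so it does not depend on the names of the other metadata variables.

```sas
/* Print the METAMAPS rows for the three geometry tables. */
proc print data=maps.metamaps;
   where memname in ('ASIA', 'STATES', 'US');
run;
```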
|Special Data Sets for Annotating Maps|
MAPS.USCENTER contains the coordinates of the visual center of each state in the U.S. and Washington, D.C., as well as coordinates in the ocean for states that are too small to contain a label. There are two pairs of variables for locating labels using Annotate data sets. The X and Y variables are projected and can be used with the MAPS.US and MAPS.USCOUNTY data sets. The LONG and LAT variables are unprojected longitude and latitude in radians and can be used with the MAPS.STATES, MAPS.COUNTIES, and MAPS.COUNTY data sets.
MAPS.USCITY contains the locations of selected cities in the U.S. Many city names occur in more than one state, so you might have to subset by state to avoid duplication. There are two pairs of variables for locating labels using Annotate data sets. The X and Y variables contain projected coordinates and can be used with the MAPS.US and MAPS.USCOUNTY data sets. The LONG and LAT variables contain the unprojected longitude and latitude in radians. These can be used to place labels on the MAPS.STATES, MAPS.COUNTIES, or MAPS.COUNTY data sets.
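As a sketch of how the projected X and Y pair might be used, the following builds an Annotate data set that places a two-letter label at each state center on MAPS.US. It assumes that MAPS.USCENTER has a STATE variable with FIPS codes and an OCEAN variable whose value is 'N' for in-state label positions. The text size and the data set name WORK.LABELS are arbitrary choices.

```sas
/* Build one label observation per state center.                 */
/* XSYS and YSYS '2' place the label in map data coordinates;    */
/* HSYS '3' sizes text as a percentage of the graphics area;     */
/* WHEN 'a' draws the labels after the map.                      */
data work.labels;
   length function $8 text $2;
   retain xsys ysys '2' hsys '3' when 'a';
   set maps.uscenter(where=(ocean='N') drop=long lat);
   function = 'label';
   text = fipstate(state);   /* FIPS code to postal abbreviation */
   size = 2.5;
   position = '5';           /* center the text on the point     */
run;

proc gmap map=maps.us data=maps.us;
   id state;
   choro state / discrete nolegend annotate=work.labels;
run;
quit;
```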
|About Response Data Sets|
The traditional map data set and the response data set are specified separately in the PROC GMAP statement: the response data set is specified by the DATA= option and the traditional map data set is specified by the MAP= option. The values of the map area ID variables in the response data set determine the map areas to be included on the map. Unless the ALL option is used in the PROC GMAP statement, only the map areas with response values are shown on the map. As a result, you do not need to subset your map data set if you are mapping only a small section of the map. However, if you map the same small section frequently, then create a subset of the map data set for efficiency.
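If you do map the same section often, a subset might be created as in this sketch. The WORK data set names and the SALES values are hypothetical, and the four FIPS codes (Florida, Georgia, North Carolina, South Carolina) are only an example selection.

```sas
/* Keep only the boundary points for four southeastern states. */
data work.southeast;
   set maps.us;
   where state in (12, 13, 37, 45);
run;

/* Hypothetical response data for the same states. */
data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
45 1500
;
run;

/* The smaller map data set is read on every run instead of MAPS.US. */
proc gmap map=work.southeast data=work.sales;
   id state;
   choro sales;
run;
quit;
```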
To use data from both a feature table and response data set, merge the two data sets by using a variable that is contained in both data sets. The new combined data set becomes the DATA= value in the PROC GMAP statement. When the response data set and the feature table are merged into one, do not use MAP=map-data-set in the PROC GMAP statement. The $GEOREF formatted variable is the ID variable for the combined data set. See Creating a Map Using the Feature Table for more details.
Note: Response data that does not correspond to a map feature is included in the legend.
Response levels are the values that identify categories of data on the graph. The categories that are shown on the graph are based on the values of the response variable. Based on the type of the response variable, a response level can be determined by any of the following:
When response levels are determined by a character value, the GMAP procedure treats each unique value as a response level. For example, if the response variable contains the names of ten regions, each region is a response level, resulting in ten response levels.
When response levels are determined by specific numeric values, and the DISCRETE option is specified, one level is created for each value. If the response variable has an associated format, then each formatted value is represented by a different response level.
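The effect of a format combined with DISCRETE can be sketched as follows. The format name SALESFMT, its ranges, and the WORK.SALES data are made up for this example. With the format applied, the four numeric values collapse into three formatted values, so the map should show three response levels.

```sas
/* Hypothetical format that groups sales values into three labels. */
proc format;
   value salesfmt low  -< 2000 = 'Low'
                  2000 -< 4000 = 'Medium'
                  4000 -  high = 'High';
run;

data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
45 1500
;
run;

/* DISCRETE creates one response level per formatted value. */
proc gmap map=maps.us data=work.sales;
   format sales salesfmt.;
   id state;
   choro sales / discrete;
run;
quit;
```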
The AREA, BLOCK, CHORO, and PRISM statements assign patterns to response levels. In CHORO and PRISM maps, response levels are shown as map areas. However, in BLOCK maps, response levels are shown as blocks. If you specify the AREA statement on a BLOCK map, then the response levels for the AREA variable are shown as map areas. The default fill pattern for the response level is solid.
PATTERN statements can define the fill patterns and colors for both blocks and map areas. PATTERN definitions that define valid block patterns are applied to the blocks (response levels), and PATTERN definitions that define valid map patterns are applied to map areas.
See PATTERN Statement for more information on fill pattern values and default pattern rotation.
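As a brief sketch, the following PATTERN statements define three solid map patterns that a choropleth map with three response levels would pick up in order. The color values and the WORK.SALES data are arbitrary choices for illustration.

```sas
/* MSOLID is a map fill pattern, so these apply to map areas. */
pattern1 value=msolid color=cxdeebf7;
pattern2 value=msolid color=cx9ecae1;
pattern3 value=msolid color=cx3182bd;

data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
45 1500
;
run;

/* LEVELS=3 requests three response levels, one per pattern. */
proc gmap map=maps.us data=work.sales;
   id state;
   choro sales / levels=3;
run;
quit;
```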
|About Identification Variables|
For traditional map data sets and response data sets, id-variables identify the map areas (for example, counties, states, or provinces) that make up the map. A unit area or map area is a group of observations with the same ID value. The GMAP procedure matches the value of the response variables for each map area in the response data set to the corresponding map area in the traditional map data set in order to create the output graphs.
With feature tables, the geo-variable, or $GEOREF formatted variable containing the spatial information, is the identification variable. Each observation in a feature table has a unique $GEOREF formatted variable value. When merging the feature table with the response data set using an SQL or DATA step statement, the identification variable can be any variable that is contained within both data sets. Once the merged data set has been created, the geo-variable is used in the PROC GMAP ID statement for the merged feature table and response data set. See Creating a Map Using the Feature Table for more details.
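The SQL route might look like the sketch below. As before, WORK.SALES and SALES are hypothetical, and the sketch assumes that MAPS.US2 contains a STATE variable to join on and that _MAP_GEOMETRY_ is its $GEOREF formatted variable. An SQL join does not need the inputs to be sorted first.

```sas
data work.sales;
   input state sales;
   datalines;
12 4200
13 3100
37 2800
;
run;

/* Join on the common variable STATE; the geo-variable comes along */
/* with the other feature table columns.                           */
proc sql;
   create table work.combined as
   select f.*, r.sales
   from maps.us2 as f
        inner join work.sales as r
        on f.state = r.state;
quit;

/* After the merge, the geo-variable is the ID variable. */
proc gmap data=work.combined;
   id _map_geometry_;
   choro sales;
run;
quit;
```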
|Displaying Map Areas and Response Data|
Whether the GMAP procedure draws a map area and whether it displays patterns for response values depends on the contents of the response data set and on the ALL and MISSING options. The following table describes the conditions under which the procedure does or does not display map areas and response data.
|If the response data set . . .||And if . . .||Then the procedure . . .|
|includes the map area||the map area has a response value||draws the map area and displays the response data|
|includes the map area||the response value for the map area is a missing value||draws the map area but leaves it empty|
|includes the map area||the response value for the map area is a missing value and the MISSING option is used in the map statement||draws the map area and displays a response level for the missing value|
|does not include the map area||the ALL option is used in the PROC GMAP statement||draws the map area but leaves it empty|
|does not include the map area||the ALL option is not used||does not draw the map area|
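The rows of the table can be tried out with a small sketch like this one. The WORK.SALES data is hypothetical: Georgia (13) has a missing response value, and most states are absent from the response data set altogether.

```sas
data work.sales;
   input state sales;
   datalines;
12 4200
13 .
37 2800
;
run;

/* ALL draws the map areas that are not in the response data set  */
/* (left empty). MISSING gives the missing value for Georgia its  */
/* own response level instead of leaving that map area empty.     */
proc gmap map=maps.us data=work.sales all;
   id state;
   choro sales / missing;
run;
quit;
```

Removing ALL should drop the states that have no observation in WORK.SALES, and removing MISSING should leave Georgia drawn but empty.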
|Summary of Use|
If using a traditional map data set, determine what processing needs to be done to the map data set before it is displayed. Use the GPROJECT, GREDUCE, and GREMOVE procedures or a DATA step to perform the necessary processing.
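As one example of such processing, the sketch below uses a DATA step and the GREMOVE procedure to combine states into larger regions by removing the internal boundaries. The REGION variable, its values, and the WORK data set names are hypothetical.

```sas
/* Assign each state to a hypothetical region. */
data work.states_region;
   set maps.us;
   if state in (12, 13, 37, 45) then region = 'Southeast';
   else region = 'Other';
run;

/* GREMOVE expects the input sorted by the BY variable. PROC SORT */
/* keeps the original point order within each BY group.           */
proc sort data=work.states_region;
   by region;
run;

/* BY names the new map areas; ID names the current map areas. */
proc gremove data=work.states_region out=work.regions;
   by region;
   id state;
run;

proc gmap map=work.regions data=work.regions;
   id region;
   choro region / discrete;
run;
quit;
```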
If using a feature table, use PROC SORT to individually sort the feature table and response data set by a variable common to both data sets. Next, use SQL or the DATA step MERGE to merge the feature table with the response data set by using a variable common to both data sets. Use the combined data set as the DATA= value in the PROC GMAP statement (do not include MAP= in the PROC GMAP statement).
|Accessing SAS Maps Online|
After downloading and unzipping map data sets, you must take them out of transport format by running the CIMPORT procedure using your current version of SAS. For more information, see Transporting and Converting Graphics Output.
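A minimal sketch of that step follows. The libref MYMAPS, the directory, and the transport file name are placeholders; substitute the location and name of the file that you actually downloaded.

```sas
/* Hypothetical library where the restored map data sets go. */
libname mymaps 'C:\mymaps';

/* Restore the data sets from the transport file. */
proc cimport library=mymaps infile='C:\downloads\mapfile.cpt';
run;
```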
|Importing Maps from ESRI Shapefiles|
You can import ESRI shapefiles as traditional map data sets by using the MAPIMPORT procedure. Depending on the type of coordinates that are in your shapefile, you might want to perform additional processing. For example, you might want to project the map with the GPROJECT procedure, or use the GREDUCE procedure to create a DENSITY variable for reducing your data.
For more information, see The MAPIMPORT Procedure.
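The import and a follow-up reduction step might be sketched as follows. The shapefile path, the output data set names, and the ID variable REGION_ID are hypothetical. The actual ID variable depends on the attribute fields in your shapefile, which you can inspect with PROC CONTENTS after the import.

```sas
/* Import the shapefile as a traditional map data set. */
proc mapimport datafile='C:\shapefiles\regions.shp' out=work.regions;
run;

/* Add a DENSITY variable that ranks boundary points by importance. */
proc greduce data=work.regions out=work.regions_red;
   id region_id;   /* hypothetical ID variable */
run;

/* Draw the map using only the lower-density points. */
proc gmap map=work.regions_red(where=(density <= 3))
          data=work.regions_red;
   id region_id;
   choro region_id / discrete nolegend;
run;
quit;
```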