GMAP Procedure

Differences between GfK and Traditional Map Data Sets

This section covers the differences between the traditional and the GfK map data sets, and the benefits of using the GfK data. Usually, you cannot simply replace a traditional map data set in your existing code with the GfK map data set. You must carefully review the map libraries, the map data set filenames, and the variables that they contain. There are also notable projection differences. For example, the X and Y variables are always projected in the GfK map data sets. The details are described in the following sections.

New License Information

SAS supplies both traditional and GfK map data sets. SAS has licensed the vector-based map data sets representing the world from GfK GeoMarketing GmbH. These map data sets are to be used only with SAS/GRAPH for your internal business purposes.
Anyone with specialized map needs can license map data directly from GfK GeoMarketing GmbH. They can be assured that their map data will match up with the map data provided with SAS/GRAPH.
The map data sets in library MAPSGFK are based on the digital, vector-based maps from GfK GeoMarketing GmbH and are covered by their copyright. For additional information, see http://support.sas.com/mapsonline/gfklicense.
Libraries MAPSSAS and MAPS both contain the updated traditional map data sets supplied by SAS.

Advantages to Using GfK Map Data Sets

There are some key advantages to using GfK instead of traditional map data sets, including:
Consistency
Licensing the map data from GfK GeoMarketing GmbH provides a single source for map data. This single source ensures that the map data is accurate and uniform for the entire world. Additional map data obtained from GfK GeoMarketing GmbH will match up seamlessly with SAS/GRAPH map data.
Single-source updates
GfK GeoMarketing GmbH will be solely responsible for all updates and changes to their map data. This includes political boundary updates, which were previously hard to obtain. SAS offers the ability to download the updates to map data via SAS Maps Online (http://support.sas.com/rnd/datavisualization/mapsonline/index.html).
Ease of use
SAS converts the GfK map data into a SAS map data set format to avoid unnecessary special processing.

New Library Names

Two new libraries are available with the second maintenance release of SAS/GRAPH:
  • MAPSGFK points to map data sets based on the digital maps from GfK GeoMarketing, the single source for this map data.
  • MAPSSAS points to the same updated traditional map data sets as the MAPS libref.
Both library references (librefs), MAPSGFK and MAPSSAS, are set during system configuration and cannot be changed. Use the MAPS= system option within SAS to point the MAPS libref to either the GfK map data set library MAPSGFK or the traditional map data set library MAPSSAS. By default, MAPS points to the MAPSSAS library.
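To see how these librefs are configured at a particular site, a short check such as the following can help. It is a minimal sketch that assumes all three librefs are assigned in the current session; the PATHNAME function returns a blank value for any libref that is not assigned.

```sas
/* Write the current value of the MAPS= system option to the log */
proc options option=maps;
run;

/* Write the physical location behind each map libref to the log. */
/* A blank value means that the libref is not assigned.           */
%put NOTE: MAPS    points to %sysfunc(pathname(maps));
%put NOTE: MAPSSAS points to %sysfunc(pathname(mapssas));
%put NOTE: MAPSGFK points to %sysfunc(pathname(mapsgfk));
```

If MAPS and MAPSSAS report the same location, the MAPS libref is still pointing at the traditional map data sets.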

New Map Data Set Names

This section describes the differences found when comparing the GfK map data set names to the traditional map data set names.
Longer names
GfK map data sets have longer names than the traditional map data sets because the names are not truncated and can exceed eight characters. For example, compare the traditional map data set name of AFGHANIS to the GfK map data set name of AFGHANISTAN.
Consistent naming convention
The GfK map data sets also provide a consistent naming convention. For example, compare the traditional map data set name of STATES, which could correlate to any country with states, to the GfK map data set name of US_STATES.
Another notable difference is the GfK map data set names of NORTH_KOREA and SOUTH_KOREA versus the traditional map data set names of KOREANOR and KOREASOU. GfK map data set names correspond to the names in common use.
New data sets
GfK includes new map data sets such as CAYMAN_ISLANDS and the Nomenclature of Territorial Units for Statistics (NUTS) level 0, 1, 2, and 3 data sets for Europe. The NUTS classification is a hierarchical system for dividing up the economic territory of the European Union (EU). An example is the data set named EUROPENUTS3.
Data set designations
Data sets that share a name but carry different numeric qualifiers, such as EUROPE, EUROPE1, and EUROPE2, use the qualifier to indicate the level of administrative detail. For example, EUROPE1 contains the countries on the European continent with their first administrative level, which is similar to US states.
The level 1 continent map data sets do not contain all of the corresponding countries that are found in the MAPSGFK.WORLD data set. However, a level 0 continent map data set does contain all of its corresponding countries (for example, MAPSGFK.EUROPENUTS0).
New data set files with additional variables
Each GfK map data set now has a companion data set with an _ATTR qualifier. For example, the data set AFRICA_ATTR contains extra variables.
New data set files listing all dependencies
GfK data sets, where applicable, include an _ALL qualifier. These map data sets contain all the territories and islands. For example, compare the traditional map data set US that contains Puerto Rico with the GfK map data set of the same name, which does not contain Puerto Rico. You must use the GfK map data set US_ALL to include Puerto Rico.
Disputed territories
In the WORLD map data set (MAPSGFK.WORLD), the disputed territories are not included with individual countries. Instead, they are identified by an ISOALPHA2 variable value of NN. They are also identified by ID variable values that differ from the ISOALPHA2 variable value. For example, the disputed territories between the following countries or states have these ID values:
  • Suriname and Guyana: SR_GY
  • China and Taiwan: CN_TW
  • China and India: CN_IN
  • Cameroon and Nigeria: CM_NG
Note: In contrast to the MAPSGFK.WORLD map data set, the ID and ISOALPHA2 variables in MAPSGFK.WORLD_CITIES have the same values.
New file PROJPARM
SAS provides this file, which contains the GPROJECT procedure information for all GfK map data sets.
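The disputed-territory convention described above can be inspected with a short query. The following is a minimal sketch that assumes the MAPSGFK libref is assigned and that MAPSGFK.WORLD contains the ID and ISOALPHA2 variables as described in this section.

```sas
/* List each disputed territory once: ISOALPHA2 is NN and ID */
/* names the countries involved (for example, SR_GY).        */
proc sql;
   select distinct id, isoalpha2
      from mapsgfk.world
      where isoalpha2 = 'NN';
quit;
```

Response data that is keyed by country code does not match these rows unless it uses the same ID values.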

Correlations between Map Data Sets and Data Sets

The information from some traditional map data sets has been combined in the GfK map data sets. Other map data sets and data sets have remained the same. For example, the map data set US maintains the same name between traditional and GfK. The following table lists noteworthy correlations:
Notable Eliminated Map Data Sets and Data Sets
| Traditional | GfK GeoMarketing |
| --- | --- |
| USCITY. Note: Includes towns, villages, hamlets, and other non-city areas that can have the same name as a city. A FEATYPE variable contains the area type. | USCITY. Note: Includes towns, villages, hamlets, and other non-city areas that can have the same name as a city. Does not include a FEATYPE variable. USCITY_ALL. Note: Includes Puerto Rico and the U.S. Virgin Islands. |
| COUNTIES | US_COUNTIES |
| COUNTY | US_COUNTIES |
| USCOUNTY | US_COUNTIES |
| CNTYNAME | US_COUNTIES_ATTR |
| STATES | US_STATES |
| US | US |
| US2 | US_STATES_ATTR |
| USCENTER | USCENTER. Note: Includes Washington, D.C. USCENTER_ALL. Note: Includes Puerto Rico and the U.S. Virgin Islands. |
| All data set names beginning with USA | None |
Note: The map data that was contained in traditional map data sets COUNTIES, COUNTY, and USCOUNTY is now combined in the one map data set US_COUNTIES.
Traditional map data set names with a suffix of the numeral 2 do not exist in the MAPSGFK library, with the exception of EUROPE2. Look instead for data set names with a suffix of 1. For example, the traditional map data set ASIA2 has a GfK map data set name of ASIA1.
Run the following code to determine whether the map data set name that you are currently using is listed in the MAPSGFK library:
/* List the members of the MAPSGFK library */
proc datasets lib=mapsgfk;
run;
quit;
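To test for one specific map data set instead of scanning the full listing, the EXIST function can be called from macro code. This is a sketch only; the macro name CHECK_MAP is hypothetical, and it assumes that the MAPSGFK libref is assigned.

```sas
/* Hypothetical helper macro: report whether a map data set */
/* with the given name exists in the MAPSGFK library.       */
%macro check_map(name);
   %if %sysfunc(exist(mapsgfk.&name)) %then
      %put NOTE: MAPSGFK.&name exists.;
   %else
      %put NOTE: MAPSGFK.&name was not found.;
%mend check_map;

%check_map(US_COUNTIES)
%check_map(ASIA2)
```

Based on the naming rules above, the second call would be expected to report that ASIA2 was not found, because that name corresponds to ASIA1 in the MAPSGFK library.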

New and Changed Variables

The following table describes new variables and compares the changed variables between traditional and GfK map data sets:
Variables Specific to GfK Map Data Sets
| Variable Name | Variable Description | Traditional Map Data Set Details | GfK Map Data Set Details |
| --- | --- | --- | --- |
| DENSITY | Contains the density values returned by the GREDUCE procedure. | May contain. | Does contain. |
| ID | A unique character code for a county or district in _ATTR map data sets. | May contain. (Numeric format) | May contain. Character (Length 15 for all map data sets that contain this variable, regardless of the length of the value actually being stored) |
| IDNAME | A character county or district name in _ATTR map data sets. | Does not contain. | Does contain. |
| ID1, ID2 | A character state or province code in _ATTR map data sets. | Does not contain. | Does contain. |
| ID1NAME | A character state or province name in _ATTR map data sets. | Does not contain. | Does contain. |
| ID1NAMEU | A Unicode character version of ID1NAME in _ATTR map data sets. | Does not contain. | Does contain. |
| ISO | A character country code in _ATTR map data sets. | Does not contain. | Does contain. |
| ISOALPHA2, ISOALPHA3 | A character country International Organization for Standardization Alpha2 or Alpha3 code in _ATTR map data sets. Note: ISOALPHA2 is used in the MAPSGFK.WORLD map data set with a value of NN to indicate a disputed territory. | Does not contain. | Does contain. |
| ISONAME | A character country International Organization for Standardization name in _ATTR map data sets. Note: The ISONAME variable value found in a continent _ATTR map data set is identical to its counterpart in the MAPSGFK.WORLD map data set. | Does not contain. | Does contain. |
| LAT | A numeric variable containing the vertical coordinate of the boundary point. (The value of this variable is unprojected and represents latitude, the north-south position.) | May contain. (The value of the variable is in radians or degrees.) | Does contain. (The value of the variable is in degrees.) |
| LONG | A numeric variable containing the horizontal coordinate of the boundary point. (The value of this variable is unprojected and represents longitude, the east-west position.) | May contain. (The value of the variable is in radians or degrees.) | Does contain. (The value of the variable is in degrees.) |
| RESOLUTION | A numeric map detail level from 1 to 10 that is based on desired output display resolution. | Does not contain. | Does contain. |
| STATE | State identification. | Does contain. (Numeric FIPS code.) | Does contain. (Character identification.) |
| X | A numeric variable that contains the horizontal coordinates of the boundary points. | Does contain. (The value can either be projected or unprojected. If unprojected, X represents longitude.) | Does contain. (The value is always projected and represents longitude.) |
| Y | A numeric variable that contains the vertical coordinates of the boundary points. | Does contain. (The value can either be projected or unprojected. If unprojected, Y represents latitude.) | Does contain. (The value is always projected and represents latitude.) |

Eliminated Variables

The following table describes the traditional map data set variables that are not used in the GfK map data sets, as well as their replacement, where applicable, in the GfK map data sets:
| Variable Name | Variable Description | Traditional Map Data Set Details | GfK Map Data Set Details |
| --- | --- | --- | --- |
| CDCODE | Census District. | Does contain. (Character format) | Does not contain. |
| CONT | Continent. | Does contain. (Numeric format) Note: Used in all the continent and the WORLD map data sets. | Does contain. (Numeric format) Note: Used in all the continent _ATTR, WORLD, and WORLD_ATTR map data sets. |
| COUNTRY | World Geographic Code. | May contain. (Numeric format) | Does not contain. Replaced by the ISO variable that represents the International Organization for Standardization's country code. (Character format) |
| ID | Identification variable. | May contain. (Numeric format) | Does contain. (Character format) |
| PROVINCE | Character abbreviation for a province. | May contain. (Character format) | Does not contain. Replaced by the ID variable in character format in the _ATTR data sets. |
| GLC codes | Geographic Locator Codes (alphanumeric). | May contain. | Does not contain. Replaced by the ISO codes. |

Data Changes

The GfK map data sets change how several variables are stored and encoded. The changes are detailed in this section.
The RESOLUTION variable indicates which map points to keep for a given display size, so that the number of points can be reduced for smaller output. The following table shows the display size that corresponds to each value:
RESOLUTION Variable Values
| RESOLUTION Value | Number of Pixels |
| --- | --- |
| 1 | 320 x 240 |
| 2 | 400 x 300 |
| 3 | 640 x 480 |
| 4 | 800 x 600 |
| 5 | 1280 x 1024 |
| 6 | 1600 x 1200 |
| 7 | 2400 x 1800 |
| 8 | 6000 x 4800 |
| 9 | 14400 x 11520 |
| 10 | 28800 x 23040 |
The traditional STATE variable is represented as ID1 in GfK map data sets such as US_COUNTIES. For example, a STATE value of 51 is represented by ID1 as 84051. ID1 concatenates the country code with the state code.
The traditional COUNTY variable is represented as ID in GfK map data sets such as US_COUNTIES. For example, a COUNTY value of 15 is represented by ID as 84051015. ID concatenates the country code, the state code, and the county code.
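The concatenation described in the two preceding paragraphs can be sketched in a DATA step. This example is illustrative only: it hard-codes the country code 840 taken from the examples above, and it assumes that the state code is zero-padded to two digits and the county code to three digits, as the value 84051015 suggests.

```sas
/* Build GfK-style character identifiers from traditional numeric codes. */
/* Assumptions: country code 840, 2-digit state code, 3-digit county     */
/* code, and a length of 15 as listed for the GfK ID variable.           */
data _null_;
   state  = 51;   /* traditional numeric STATE value  */
   county = 15;   /* traditional numeric COUNTY value */
   length id1 id $ 15;
   id1 = cats('840', put(state, z2.));   /* country code + state code */
   id  = cats(id1, put(county, z3.));    /* ID1 + county code         */
   put id1= id=;
run;
```

For the values shown, the log line reads id1=84051 id=84051015, which matches the examples given above.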
The X and Y variable values in the GfK map data sets are always projected.
The LAT and LONG variable values in the GfK map data sets are always unprojected degrees (not radians).
The identification variables, for example ID and ID1, are character instead of numeric in the GfK map data sets.
GfK map data is provided in UTF-8 Unicode encoding with the exception of the MAPSGFK.WORLD_CITIES map data set.
The map data in the GfK map data sets is projected using appropriate projection methods. For example, the data set MAPSGFK.US represents a relatively small map area that is not near either pole. These factors make it possible to project the data without distortion using the Albers method. The Label column in the CONTENTS procedure output for each data set should identify the projection algorithm used for the X and Y variables. Use the CONTENTS procedure to view the columns in a map data set. To view the columns in the map data set named MAPSGFK.US, run the following code:
proc contents data=mapsgfk.us;
run;

Compatibility Information

Substituting the GfK map data sets into an existing application that currently uses the traditional map data sets requires modifications to the application. Without these modifications, using the GfK map data could cause unexpected results. For example, do not substitute a GfK map data set name into the MAP= or DATA= option in your existing code without some additional considerations. Refer to Migration Information for tips on converting your existing map-producing code.

Migration Information

When using the GfK map data sets, be aware of the following:
  • The map data set filenames can be different.
  • The X and Y variables are always projected.
  • The unprojected LAT and LONG variables are in degrees, not radians.
  • The ID, ID1, and ID2 variables are character instead of numeric.
  • A map area crossing the International Date Line will be divided into two segments in the displayed graph, indicating the boundaries on either side of the line. Examples of affected regions are Russia, Tonga, and territories and islands in the Pacific Ocean.
Refer to Reworking Code That Uses Traditional Map Data Sets for information and tips.
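As an illustration of the projection points above, existing code that ran the GPROJECT procedure on unprojected X and Y values would instead read the LAT and LONG variables from a GfK map data set. The following is a minimal sketch, not a complete migration recipe. It assumes that MAPSGFK.US has a character ID variable that identifies each map area and that its longitude values increase to the east; the output data set name WORK.US_REPROJECTED is hypothetical.

```sas
/* Project a GfK map data set from its unprojected coordinates. */
/* LATLON   - read LAT and LONG instead of X and Y              */
/* DEGREES  - the input coordinates are in degrees, not radians */
/* EASTLONG - longitude values increase to the east (assumed)   */
proc gproject data=mapsgfk.us
              out=work.us_reprojected
              latlon degrees eastlong;
   id id;   /* assumed identification variable for the map areas */
run;
```

Because the X and Y values supplied in the GfK map data sets are already projected, a step like this is needed only when a different projection is wanted.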