GEOCODE Procedure

Understanding ZIP+4 Geocoding

Overview of ZIP+4 Geocoding

With ZIP+4 geocoding, the GEOCODE procedure attempts to match the five-digit ZIP code and ZIP+4 extension from your address data set with the lookup data set.
If a ZIP+4 code is not found, the GEOCODE procedure attempts to match the standard five-digit ZIP code. If the ZIP code is not found either, the procedure attempts to find the city. If you are interested in the ZIP code location only, you can turn off the city lookup by specifying the NOCITY option in the PROC GEOCODE statement.
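The fallback sequence above can be tried with a call along the following lines. This is a minimal sketch, not a definitive invocation: the data set names WORK.CUSTOMERS, WORK.GEOCODED, and ZIP4.DELAWARE and the sample ZIP+4 values are made up for illustration, the input variables are assumed to use the default names ZIP and PLUS4, and the option spellings should be checked against the PROC GEOCODE syntax for your SAS release.

```sas
/* Hypothetical library that holds a ZIP+4 lookup data set */
libname zip4 'C:\Geocode';

/* Made-up input addresses; ZIP and PLUS4 are character values here, */
/* which assumes the lookup data set also stores them as character   */
data work.customers;
   input zip $ plus4 $;
   datalines;
19901 1234
19711 5678
;
run;

/* Try ZIP+4 first, then the five-digit ZIP code; NOCITY skips the city lookup */
proc geocode
   method=plus4
   data=work.customers
   out=work.geocoded
   lookup=zip4.delaware
   nocity;
run;
```

Because NOCITY is specified, this sketch does not rely on city or state variables in the input data set.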

About ZIP+4 Lookup Data

For ZIP+4 code geocoding, you can use data that is derived from the TIGER ZIP+4 files that are available from the SAS Maps Online Web site. For more information, see SAS Maps Online Web Site.
Note: The Census Bureau has omitted ZIP+4 values from TIGER files released after 2006. The lookup data set to be used with the PLUS4 geocode method was created using 2006 TIGER data and is available on SAS Maps Online. The lookup data set will be updated after the Census Bureau reinstates ZIP+4 values into TIGER files. For more information about the status of the ZIP+4 data, and how to download the current data, see SAS Maps Online Web Site.
You can also purchase the Geo*Data product containing ZIP+4 centroids, available from Melissa Data at this Web site: www.melissadata.com/geocoder/geodata.htm.
SAS includes an autocall macro (%GCDMEL9) that imports Geo*Data files into SAS data sets. You can modify this program to import other sources of data.
You can specify that non-geocoding variables from the lookup data set be added to the output data set by using the ATTRIBUTEVAR= option in the PROC GEOCODE statement.

About Alternate ZIP+4 Lookup Data

When you use ZIP+4 geocoding, you must specify an alternative lookup data set because SASHELP.ZIPCODE does not contain any ZIP+4 values. This data set must contain the following variables:
Default Name    Description
ZIP             Five-digit ZIP code
PLUS4           Four-digit ZIP+4 extension
X               Longitude of the central coordinate
Y               Latitude of the central coordinate
You can specify different names for the variables by using options in the PROC GEOCODE statement. For example, the LOOKUPPLUS4VAR= option specifies the name of the ZIP+4 extension variable in the lookup data set.
The ZIP and PLUS4 variables can contain either character data or numeric data. The data type must match the type of the corresponding variable in your input data set.
Note: Character value matching in the GEOCODE procedure is not case sensitive, so character values in your input and lookup data sets do not need to match in case.
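When the types differ, one approach is to convert the input variables before geocoding. The following sketch assumes an input data set with numeric ZIP and PLUS4 values and a lookup data set that stores them as character values. The data set names and sample values are hypothetical.

```sas
/* Made-up input data with numeric ZIP and PLUS4 values */
data work.customers_num;
   input zip plus4;
   datalines;
19901 1234
8540 21
;
run;

/* Convert to character values, keeping leading zeros */
data work.customers_char;
   set work.customers_num(rename=(zip=zip_num plus4=plus4_num));
   length zip $5 plus4 $4;
   zip   = put(zip_num, z5.);     /* 8540 becomes '08540' */
   plus4 = put(plus4_num, z4.);   /* 21 becomes '0021'    */
   drop zip_num plus4_num;
run;
```

The Z5. and Z4. formats pad with leading zeros, which matters for ZIP codes that begin with 0.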
Additional non-geocoding attribute variables can also be in the alternate lookup data set. You can add these variables to the output data set by using the ATTRIBUTEVAR= option in the PROC GEOCODE statement.
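As a hedged illustration of these options, the following sketch names a nondefault extension variable and carries one attribute variable into the output. MYLIB.ZIP4LOOKUP, ZIP4EXT, and COUNTY are hypothetical names, the MYLIB libref is assumed to be assigned already, and LOOKUPPLUS4VAR= is the option spelling assumed here, so verify it against the PROC GEOCODE syntax for your release.

```sas
proc geocode
   method=plus4
   data=work.customers             /* hypothetical input data set                     */
   out=work.geocoded               /* hypothetical output data set                    */
   lookup=mylib.zip4lookup         /* hypothetical alternate lookup data set          */
   lookupplus4var=zip4ext          /* hypothetical name of the ZIP+4 extension column */
   attributevar=(county);          /* hypothetical attribute variable to add to OUT=  */
run;
```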
You can obtain a lookup data set for ZIP+4 geocoding from the SAS Maps Online Web site at www.sas.com/mapsonline. On the Downloads page, select Geocoding to access the downloads that are related to geocoding.
An alternative source for ZIP+4 lookup data is the Geo*Data product from Melissa Data. You can use the %GCDMEL9 autocall macro to convert Geo*Data files to SAS data sets. For more information, see %GCDMEL9 Autocall Macro.

%GCDMEL9 Autocall Macro

Overview of the %GCDMEL9 Autocall Macro

The %GCDMEL9 autocall macro enables you to directly import Geo*Data files from Melissa Data as SAS data sets. Geo*Data files contain third-party ZIP+4 lookup data for use with PLUS4 geocoding.
Geo*Data files are available for each state. The files are provided as text files within compressed (ZIP) archives. Melissa Data also provides the PKUNZIP utility to extract the text files.
The %GCDMEL9 macro uses the following macro variables:
DATASETNAME
specifies the name of the output data set.
DATASETPATH
specifies the location where the output data set is created.
DATASETLABEL
(optional) specifies a label for the output data set.
LIBNAME
specifies the name for a new library that is assigned for the location that you specified in the DATASETPATH macro variable.
UNZIPPEDPATH
specifies the location of the extracted Geo*Data files that you want to import. The %GCDMEL9 macro attempts to read all of the text (.txt) files in this directory.
WORKPATH
(optional) specifies the path where temporary files are written. The default path is the path for the WORK library.

Usage Example for the %GCDMEL9 Autocall Macro

In this example, a Geo*Data file for the state of Delaware (DE.txt) was extracted to C:\Mydata. The lookup data set is created in an existing directory C:\Geocode and assigned the libref ZIP4. The resulting data set is named ZIP4.DELAWARE.
The following code imports the data:
   /* Define macro variables */
   %let UNZIPPEDPATH=C:\Mydata;
   %let DATASETPATH=C:\Geocode;
   %let DATASETNAME=Delaware;
   %let LIBNAME=ZIP4;
   %let DATASETLABEL=ZIP+4 lookup data for Delaware;
   /* Submit autocall macro */
   %GCDMEL9;
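After the macro finishes, it can be worth confirming what was imported before using the data set as a lookup. The following sketch assumes the ZIP4.DELAWARE data set from the example above. Whether its variables use the default names ZIP, PLUS4, X, and Y is not confirmed here, so the listing is meant as a check rather than a guarantee.

```sas
/* List the variables and their types in the imported lookup data set */
proc contents data=zip4.delaware;
run;

/* Print the first five observations as a quick visual check */
proc print data=zip4.delaware(obs=5);
run;
```

The variable types shown by PROC CONTENTS indicate whether the ZIP and PLUS4 values in your address data need to be converted to match.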

Tips for ZIP+4 Geocoding

The following table contains suggestions and comments for the PLUS4 geocoding method.
Category
Suggestions and Comments
Most recent lookup data
Check the SAS Maps Online site to see whether there is an update available for the ZIP4 lookup data set. (It will be available when the Census Bureau restores ZIP+4 values to its TIGER/Line files.) See SAS Maps Online Web Site.
You can check with the third-party vendor, Melissa Data, to see whether its Geo*Data product contains ZIP+4 values that are more recent than 2006. The files can be imported with the %GCDMEL9 autocall macro.
Correct data
Here are some common reasons why ZIP+4 matches are not found:
  • The input ZIP+4 contains transposed digits.
  • The ZIP+4 is new and therefore is not in the lookup data set that you are using.