Geocode Data

Geocoding is the conversion of an address into a location. Various parts of an address can be used depending on how much precision is wanted in that location.

PROC GEOCODE

Beginning with SAS 9.2 Phase 1 and continuing through SAS 9.4M4, PROC GEOCODE was a SAS/GRAPH procedure and required a licensed SAS/GRAPH installation.

Starting with SAS 9.4M5, PROC GEOCODE is part of Base SAS and can be run without a SAS/GRAPH installation. Use PROC GEOCODE for geocoding by street address, city, ZIP code, ZIP+4, and IP address.

Each geocoding method requires specific lookup data. Some of the lookup data is installed with SAS, some is available below, while other lookup data can be downloaded from government sources or data vendors.

See SAS Help and the product documentation for details on using these lookup data sets with PROC GEOCODE.

Street Geocoding

Street geocoding for the U.S. was added to PROC GEOCODE in SAS 9.2M3. U.S. lookup data is generated from Census Bureau TIGER/Line shapefiles.

Canadian street geocoding was added in SAS 9.4. Canadian lookup data is generated from GeoBase National Road Network (NRN) files. The Canadian data will not work with PROC GEOCODE releases prior to SAS 9.4.

The zipped files below contain prebuilt geocoding data files, a ReadMe.txt file with instructions, and a SAS program to import the CSV data files into data sets. Some of the zipped files are over 2 GB.

As an alternative to downloading the prebuilt U.S. or Canadian data, you can use programs provided by SAS to create lookup data from the original source. These programs download TIGER or NRN shapefiles for specific U.S. counties or Canadian provinces, so you can create lookup data for more limited regions.
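As a rough illustration of how imported street lookup data might be used, the sketch below assumes the import program has created a street lookup data set named USM in a library assigned as LOOKUP. The path, the library name, the USM data set name, and the WORK.CUSTOMERS input data set with its variable names are assumptions for illustration only. Check the ReadMe.txt file in the download for the names that apply to your data.

```sas
/* Hypothetical folder holding the imported street lookup data sets */
libname lookup 'C:\geocode\lookup';

/* Small illustrative input data set of addresses to geocode */
data work.customers;
   infile datalines dlm=',' dsd;
   length address $60 city $30 state $2 zip 8;
   input address $ city $ state $ zip;
   datalines;
100 SAS Campus Dr,Cary,NC,27513
;
run;

/* Street geocoding: match each address against the street lookup data */
proc geocode
   method=street
   data=work.customers
   out=work.customers_geo
   lookupstreet=lookup.usm     /* assumed name of the primary lookup data set */
   addressvar=address
   addresscityvar=city
   addressstatevar=state
   addresszipvar=zip;
run;
```

The output data set should contain the input observations with the matched coordinates added. Addresses that cannot be matched at street level may fall back to a less precise match, so review the results before relying on them.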

City Geocoding

U.S. Cities

All versions of PROC GEOCODE support geocoding by U.S. city centroid. In SAS 9.4M4 and earlier, PROC GEOCODE is part of SAS/GRAPH, and the default lookup data set for those releases is in the MAPSGFK library.

In SAS 9.4M5 and later, PROC GEOCODE moved to Base SAS. If you do not have SAS/GRAPH installed at your site, MAPSGFK is not available. You can use LOOKUPCITY=SASHELP.ZIPCODE in your PROC GEOCODE syntax to specify an alternate lookup data set for geocoding of U.S. cities.
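A minimal sketch of U.S. city geocoding with the alternate lookup data set might look like the following. The WORK.BRANCHES input data set, its variable names, and the output data set name are hypothetical and used only for illustration.

```sas
/* Small illustrative input data set of city and state values */
data work.branches;
   infile datalines dlm=',' dsd;
   length city $30 state $2;
   input city $ state $;
   datalines;
Cary,NC
Austin,TX
;
run;

/* City geocoding using SASHELP.ZIPCODE instead of a MAPSGFK data set */
proc geocode
   method=city
   data=work.branches
   out=work.branches_geo
   lookupcity=sashelp.zipcode
   addresscityvar=city
   addressstatevar=state;
run;
```

Because the lookup data set is named explicitly with LOOKUPCITY=, this form does not depend on the MAPSGFK library being available.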

World Cities

International city geocoding was added to PROC GEOCODE in SAS 9.3M2. If you are running SAS 9.3M2 through SAS 9.4M4, PROC GEOCODE is part of SAS/GRAPH and the MAPSGFK.WORLD_CITIES data set is installed as the default lookup data set. It is a subset of a much larger data set, WORLD_CITIES_ALL, which can be downloaded from the Misc. Updates page if you have a SAS/GRAPH license. See the LOOKUPCITY= option in the PROC GEOCODE documentation for details about specifying an alternate lookup data set.

In SAS 9.4M5 and later, PROC GEOCODE is part of Base SAS. However, you can use MAPSGFK.WORLD_CITIES or WORLD_CITIES_ALL as the lookup data set only if SAS/GRAPH is also installed. Otherwise, download the World Cities Data and Code file. The zip file contains instructions on installing and using that alternate lookup data set at your site. See the LOOKUPCITY= option in the PROC GEOCODE documentation for details on specifying an alternate city lookup data set.

Postal Code Geocoding

U.S. ZIP Codes

SASHELP.ZIPCODE is a data set of U.S. ZIP code centroids installed for ZIP geocoding. Quarterly updates of this data set can be downloaded from the Misc. Updates page.
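For illustration, ZIP code geocoding against SASHELP.ZIPCODE can be sketched as follows. The WORK.STORES input data set and the output data set name are hypothetical.

```sas
/* Small illustrative input data set of numeric ZIP codes */
data work.stores;
   input zip;
   datalines;
27513
10001
;
run;

/* ZIP geocoding: assign each observation its ZIP code centroid */
proc geocode
   method=zip
   data=work.stores
   out=work.stores_geo
   lookup=sashelp.zipcode
   addresszipvar=zip;
run;
```

The METHOD=, LOOKUP=, and ADDRESSZIPVAR= values are spelled out here for clarity, even where they match the procedure defaults.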

British Postcodes

Free centroid locations of Royal Mail postcodes in England and Scotland are available from the British Ordnance Survey in their Code-Point Open product. The CodePoint2Geocode.zip file contains user instructions on acquiring the Code-Point data and a SAS program to import it for lookup data use.

Australian Postcodes

Free Australia Post postcode boundaries are available from the Australian Bureau of Statistics (ABS) in their Postal Areas (POA) file. The ABS2Geocode.zip file contains user instructions and a SAS program to import the boundary file, compute polygon centroids and create the lookup data. 

Canadian Postcodes

The data vendor ZIPCodeDownload includes Canadian postcode centroid locations in their Premium Edition product. The ZIPCodeDownload2Geocode.zip file contains instructions and a SAS program to import the Premium Edition CSV file to create Canadian lookup data. 

Other Countries

Various third-party data vendors provide postcode location data for specific countries. One vendor is MapMechanics. Use PROC IMPORT or a DATA step to import files for use as PROC GEOCODE lookup data.
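As a sketch of the import step only, a vendor CSV file could be read with PROC IMPORT along these lines. The file path and the output data set name are hypothetical, and the column names in the result depend entirely on the vendor's file layout.

```sas
/* Import a vendor postcode file (hypothetical path) into a SAS data set */
proc import
   datafile='C:\geocode\postcodes.csv'
   out=work.postcode_lookup
   dbms=csv
   replace;
   getnames=yes;        /* take variable names from the header row */
   guessingrows=max;    /* scan all rows before choosing variable types */
run;

/* Inspect the imported variables before using the data as lookup data */
proc contents data=work.postcode_lookup;
run;
```

After importing, the variables usually need to be checked against the lookup data requirements described in the PROC GEOCODE documentation, and renamed or converted in a DATA step where they differ.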

ZIP+4 Geocoding

Free ZIP+4 Data

A file containing ZIP+4 centroids from 2006 is available. These ZIP+4 locations are based on the 2006 Second Edition TIGER/Line files from the Census Bureau, which was the most recent TIGER/Line release that contained ZIP+4 values. As of the most recent TIGER release, the ZIP+4 values had not yet been restored. This file will be updated when the Census Bureau restores ZIP+4 values in a future TIGER release. Download that 2006 ZIP+4 file here.

Alternate ZIP+4 Data

The GEO*Data product containing current ZIP+4 locations can be purchased from Melissa Data. The autocall macro %GCDMEL9 imports GEO*Data files into lookup data for ZIP+4 geocoding with PROC GEOCODE. See the SAS/GRAPH Mapping Reference for details on this macro.

IP Address Geocoding

Worldwide IP address geocoding is supported by all PROC GEOCODE releases. Lookup data is available from MaxMind in their free GeoLite databases. The autocall macro %MAXMIND imports GeoLite files to create lookup data. See PROC GEOCODE in the SAS/GRAPH Mapping Reference for details on this macro.