PROC GEOCODE
PROC GEOCODE is a SAS/GRAPH procedure. The first version shipped in SAS 9.2 and provided geocoding by ZIP Code, ZIP+4, city/state, or IP address. Street level geocoding was added in the third maintenance release of SAS 9.2 (TS2M3).
The TIGER2GEOCODE update was added in July 2011.
SGF Paper
A 2010 SAS Global Forum presentation discussed geocoding in general and contained specific examples of PROC GEOCODE’s capabilities.
Read the paper (PDF): GEOCODE paper SGF 2010.pdf
Download the example SAS programs (ZIP): Examples for GEOCODE paper SGF 2010.zip
Street Method Lookup Data
The street geocoding method uses the street name and house number to locate the address along that street. Options for acquiring the required lookup data for street geocoding follow.
· Street geocoding lookup data for the entire US is in a zip archive which contains:
  o Comma-separated values (CSV) files of geocoding data
  o ReadMe.txt with instructions and metadata
  o ImportCVSfiles.sas to import CSV files into data sets
The lookup data sets were created from US Census Bureau TIGER/Line shapefiles. The zip archive is 1.1 GB and downloads slowly. When unzipped, the CSV files require 9 GB of disk space, and the imported lookup data sets require 10 GB. Note that these lookup data sets are not the same as those used by the SAS/GIS batch geocoder.
This zip archive contains lookup data (Version 6, 30SEP2010) generated from 2009 TIGER files. Download the US street level lookup data file: StreetLookupData-2009.zip
This zip archive contains lookup data (Version 7, 09AUG2011) generated from 2010 TIGER files. Download the US street level lookup data file: StreetLookupData-2010.zip
This zip archive contains lookup data (Version 8, 02FEB2012) generated from 2011 TIGER files. Download the US street level lookup data file: StreetLookupData-2011.zip
· The SAS macro program that imported the Census Bureau TIGER/Line files to create the US lookup data sets noted above is also available. Note that it imports only TIGER/Line shapefiles, the format the Census Bureau first used for the 2007 TIGER release. It does not import TIGER releases in the older RT file format. The program can be used to import TIGER shapefiles for specific states and counties of the US if you do not wish to geocode with the entire US lookup data available above. Download TIGER2GEOCODE.zip to import TIGER shapefiles. Usage instructions are in the program file header.
· Currently, lookup data created from TIGER/Line shapefiles is the only source of street level geocoding data available for PROC GEOCODE. Given sufficient demand, we can provide an import mechanism for third-party geocoding data. Please submit requests through the feedback area of this web site. You are also welcome to modify the TIGER2GEOCODE macro program available above to import data from another source.
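Once the lookup data sets have been imported, a street geocoding run might look like the following sketch. The LOOKUP libref and its path, the lookup data set name USM, and the sample address record are assumptions made for illustration; the ReadMe.txt in the download describes the actual data set names.

```sas
/* Assumed location of the imported street lookup data sets. */
libname lookup 'C:\geocode\lookup';

/* Hypothetical input address, with fields delimited by '|'. */
data work.customers;
   infile datalines dlm='|' truncover;
   length address $ 40 city $ 24 state $ 2;
   input address city state zip;   /* zip is read as a numeric value */
   datalines;
100 SAS Campus Drive|Cary|NC|27513
;
run;

/* Street method: match house number and street name against the lookup data. */
proc geocode
   method=street
   data=work.customers
   out=work.geocoded
   lookupstreet=lookup.usm          /* assumed name of the primary lookup data set */
   addressvar=address
   addresscityvar=city
   addressstatevar=state
   addresszipvar=zip;
run;
```

The output data set WORK.GEOCODED would then hold the input records along with the coordinates found for each address.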
ZIP+4 Method Lookup Data
The PLUS4 method in PROC GEOCODE allows geocoding by ZIP+4 centers. If the PLUS4 method does not find a match, PROC GEOCODE falls back to the center of the ZIP Code and then, if necessary, to the city/state center. Options for acquiring the lookup data needed for geocoding by ZIP+4 follow.
· You can download the file ZIP4_Geocode_Data-2006.zip containing:
  o ZIP4.cpo, a SAS transport file with a data set of ZIP+4 lookup data for the US
  o ReadMe.txt with instructions
  o cimport.sas to import the transport file into a lookup data set
The ZIP+4 centers are based on the 2006 Second Edition TIGER/Line files from the US Census Bureau. That was the most recent TIGER/Line release which contained ZIP+4 values. The Census Bureau has said it will include updated ZIP+4 values in a future TIGER release but has not specified which one.
Download the ZIP+4 lookup data file: ZIP4_Geocode_Data-2006.zip
This is a large file and downloads slowly. The zipped file is 300 MB. Unzipped, it requires 1.8 GB of disk space.
· The GEO*Data product containing ZIP+4 centroids can be purchased from Melissa Data: http://www.melissadata.com/geocoder/geodata.htm
The autocall macro %GCDMEL9 is provided to import GEO*Data files into lookup data for ZIP+4 geocoding with PROC GEOCODE. See the SAS/GRAPH documentation for details on its use.
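With ZIP+4 lookup data from either source in place, a PLUS4 geocoding run might look like the following sketch. It assumes the import step has already been run; the LOOKUP libref and path, the lookup data set name ZIP4, and the sample record (including its ZIP+4 value) are hypothetical, so check the ReadMe.txt or the SAS/GRAPH documentation for the actual names.

```sas
/* Assumed location of the imported ZIP+4 lookup data set. */
libname lookup 'C:\geocode\zip4';

/* Hypothetical input record. CITY and STATE are included so that the
   city/state fallback has something to match if ZIP+4 and ZIP both fail. */
data work.customers;
   infile datalines dlm='|' truncover;
   length name $ 20 city $ 24 state $ 2;
   input name city state zip plus4;
   datalines;
Sample customer|Cary|NC|27513|1234
;
run;

/* PLUS4 method: match on the ZIP Code plus the four-digit extension. */
proc geocode
   method=plus4
   data=work.customers
   out=work.geocoded
   lookup=lookup.zip4               /* assumed lookup data set name */
   addresszipvar=zip
   addressplus4var=plus4
   addresscityvar=city
   addressstatevar=state;
run;
```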
Non-US Postcode Geocoding Data
The ZIP method in PROC GEOCODE supports geocoding by any postal code, not only US ZIP Codes. If you have a source of postal codes with locations (longitude/latitude) and import them into a SAS data set, that data set can be used as the lookup data for the ZIP method.
The British Ordnance Survey provides the Code-Point Open product containing Royal Mail postcodes with X/Y locations. The SAS macro program %CodePoint2Geocode imports those files for use with PROC GEOCODE. It also converts the British National Grid X/Y coordinates to longitude/latitude in the World Geodetic System 1984 (WGS84) datum.
Download the file CodePoint2Geocode.zip for the Code-Point Open program and HTML documentation.
As we obtain additional sources of free geocoding data for non-US locations, they will be added to this section.
SAS/GIS GEOCODING
A batch geocoder is included with the SAS/GIS product. It is a different geocoder from PROC GEOCODE, which is intended to replace it.
SGF Paper
A presentation was given at SUGI 30 entitled “Cheap Geocoding: SAS/GIS and Free TIGER Data.” It discussed geocoding concepts and examined several techniques for geocoding address data using the SAS/GIS batch geocoder and TIGER/Line files from the Census Bureau.
The paper includes examples of:
· ZIP Code centroid geocoding with the DATA step MERGE statement and SASHELP.ZIPCODE*
· Randomly scattering locations about geocoded X/Y values**
· Geocoding by city/state with the DATA step and SASHELP.ZIPCODE*
· Plotting geocoded points with PROC GMAP**
· Street level geocoding*
*This geocoding process is more easily accomplished using the new PROC GEOCODE.
**Post-geocoding process that can also be done after using PROC GEOCODE.
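As an illustration of the first technique in the list, ZIP Code centroid geocoding with a DATA step MERGE might be sketched as follows. This is not the code from the paper; the customer records and work data set names are hypothetical, and only the ZIP, X, and Y variables of SASHELP.ZIPCODE are used.

```sas
/* Hypothetical customer records keyed by numeric ZIP Code. */
data work.customers;
   input name $ zip;
   datalines;
Alpha 27513
Beta 10001
;
run;

proc sort data=work.customers;
   by zip;
run;

/* Sorted copy of the centroid coordinates (ZIP is numeric in SASHELP.ZIPCODE). */
proc sort data=sashelp.zipcode (keep=zip x y) out=work.zipxy;
   by zip;
run;

/* Attach the centroid X (longitude) and Y (latitude) to each customer. */
data work.geocoded;
   merge work.customers (in=incust)
         work.zipxy;
   by zip;
   if incust;   /* keep customer rows only; X and Y stay missing for unmatched ZIPs */
run;
```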
Read the paper (PDF)
Download the example SAS programs (ZIP)
SAS/GIS Geocoding Lookup Data
Lookup data for the SAS/GIS geocoder has a slightly different format than the PROC GEOCODE lookup data. This zip archive contains pre-built nationwide data used by SAS/GIS for street level geocoding in the US.
This lookup data file is based on the 2009 TIGER/Line files. It also contains a sample geocoding program and a US map.
This is a large file and downloads slowly. The zipped file is 2.5 GB. Unzipped, it requires 17.8 GB of disk space.
Download SAS/GIS Geocoding Data: Gis_Geocoding-2009.zip