GEOCODE Procedure

Example 1: Geocoding Using Default Values

Features:
  • ZIP geocoding method
  • Procedure options: OUT=
Other features (Base SAS):
  • SAS DATA step
  • PRINT procedure
Data set: SASHELP.ZIPCODE (lookup data set)
Sample library member: GEOSMPL
This example shows the simplest form of the GEOCODE procedure, specifying only the OUT= option. The GEOCODE procedure compares the input data set to the lookup data and outputs any match that it finds based on a five-digit ZIP code. The ZIP method is the default.
Because every option other than OUT= takes its default value, the following is true:
  • The input address data set is the most recently created SAS data set (this example assumes that you have just created WORK.CUSTOMERS).
  • The ZIP code geocoding method is used.
  • The lookup data set is SASHELP.ZIPCODE.
  • No variables are added to the output data set other than the X and Y coordinates, and a _MATCHED_ variable indicating whether and how the match was made.
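For comparison, the defaults listed above can be written out explicitly. The following is a minimal sketch, assuming that WORK.CUSTOMERS already exists as created in this example. It is intended to request the same geocoding as the shorter form, not to replace it.

```sas
/* Sketch: the same request with the default values spelled out.    */
/* Assumes WORK.CUSTOMERS exists, as created in this example.       */
proc geocode
   method=zip                  /* ZIP geocoding method (the default) */
   data=work.customers         /* input address data set             */
   lookup=sashelp.zipcode      /* default lookup data set            */
   out=geocoded_customers;     /* output data set                    */
run;
```

Naming the input data set with DATA= avoids depending on which data set happened to be created last in the session.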
After the GEOCODE procedure runs, the output data set contains the following added geocoding variables:
  • coordinate variables X and Y from the lookup data set (SASHELP.ZIPCODE).
  • a variable named _MATCHED_. This variable indicates whether the location was found by matching ZIP codes or by matching City and State (or whether no location was found because no match was made).
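One way to see how many locations were found by each kind of match is to tabulate the _MATCHED_ variable. This is a sketch only, assuming that GEOCODED_CUSTOMERS has already been created by the PROC GEOCODE step in this example.

```sas
/* Sketch: count observations by match type.                        */
/* Assumes GEOCODED_CUSTOMERS was created by the PROC GEOCODE step. */
proc freq data=geocoded_customers;
   tables _matched_ / missing;   /* also count missing values, if any */
run;
```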

Output

The following output from the PRINT procedure shows the GEOCODED_CUSTOMERS output data set after running the GEOCODE procedure.
The GEOCODED_CUSTOMERS Output Data Set

Program

data CUSTOMERS (label="Customer data for geocoding");
   infile datalines dlm='#';
   length address $ 24 city $ 24 state $ 2;
   input address    /* House number and street name */
         zip        /* Customer ZIP code (numeric)  */
         city       /* City name                    */
         state      /* State abbreviation           */
   ;
   cust_ID = _n_;   /* Assign customer ID number    */
datalines;
555 Junk Street # 99999 # Beverly Hills # CA
115 E. Water St # 19901 # Dover #
760 Moose Lodge Road # 19934 # Camden #
200 S. Madison Str # 19801 # Wilmington # DE
4701 Limestone Road # 19808 # Wilmington #
2117 N 4th St # 19363 # Oxford # PA
1313 Mockingbird Lane # . # Delray # CC
133 Silver Lake Dr # 19971 # Rehoboth Beach # DE
11 SE Front Street # 19963 # Milford # DE
402 Nylon Boulevard # . # Seaford # DE
363 E Commerce St # . # Smyrna # DE
5595 Polly Branch Rd # 19975 # Selbyville # DE
1209 Coastal Highway # 19944 # Fenwick Island # DE
2899 Arthursville Rd # 19953 # Hartly # DE
41 Bramhall St # . #  #
9320 Old Racetrack Rd # . # Delmar # DE
281 W Commerce Str # 19955 # Kenton #
211 Blue Ball Road # 21921 # Elkton # MD
3893 Turkey Point Rd # 19980 # Woodside # DE
;
run;
proc geocode out=geocoded_customers;
run;
proc print data=geocoded_customers noobs;
run;
quit;

Program Description

Generate the CUSTOMERS input data set of addresses that the GEOCODE procedure will use.
data CUSTOMERS (label="Customer data for geocoding");
   infile datalines dlm='#';
   length address $ 24 city $ 24 state $ 2;
   input address    /* House number and street name */
         zip        /* Customer ZIP code (numeric)  */
         city       /* City name                    */
         state      /* State abbreviation           */
   ;
   cust_ID = _n_;   /* Assign customer ID number    */
datalines;
555 Junk Street # 99999 # Beverly Hills # CA
115 E. Water St # 19901 # Dover #
760 Moose Lodge Road # 19934 # Camden #
200 S. Madison Str # 19801 # Wilmington # DE
4701 Limestone Road # 19808 # Wilmington #
2117 N 4th St # 19363 # Oxford # PA
1313 Mockingbird Lane # . # Delray # CC
133 Silver Lake Dr # 19971 # Rehoboth Beach # DE
11 SE Front Street # 19963 # Milford # DE
402 Nylon Boulevard # . # Seaford # DE
363 E Commerce St # . # Smyrna # DE
5595 Polly Branch Rd # 19975 # Selbyville # DE
1209 Coastal Highway # 19944 # Fenwick Island # DE
2899 Arthursville Rd # 19953 # Hartly # DE
41 Bramhall St # . #  #
9320 Old Racetrack Rd # . # Delmar # DE
281 W Commerce Str # 19955 # Kenton #
211 Blue Ball Road # 21921 # Elkton # MD
3893 Turkey Point Rd # 19980 # Woodside # DE
;
run;
Run the GEOCODE procedure with the generated input data set. This example assumes that CUSTOMERS is the most recently created data set, so no DATA= option is specified. The default lookup data set, SASHELP.ZIPCODE, is used. GEOCODE uses the default ZIP method to compare the input data set to the lookup data and match observations based on a five-digit ZIP code.
proc geocode out=geocoded_customers;
run;
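To check why a particular ZIP code did or did not match, one option is to query the lookup data directly. The sketch below assumes that the installed SASHELP.ZIPCODE contains a numeric ZIP variable along with CITY, STATECODE, X, and Y. The two ZIP values are taken from the input data in this example. A ZIP code that is absent from the lookup data set simply returns no row.

```sas
/* Sketch: inspect the lookup rows for selected ZIP codes.          */
/* Assumes SASHELP.ZIPCODE has ZIP (numeric), CITY, STATECODE, X, Y. */
proc print data=sashelp.zipcode noobs;
   where zip in (19901, 99999);   /* ZIP values from the input data */
   var zip city statecode x y;
run;
```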
Print the entire GEOCODED_CUSTOMERS output data set. The NOOBS option suppresses the observation number column.
proc print data=geocoded_customers noobs;
run;
quit;
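As a possible follow-up, the addresses that were not located can be listed on their own. This sketch assumes that observations with no match have missing X and Y values in the output data set. The variable names come from the CUSTOMERS data set created in this example.

```sas
/* Sketch: list only the observations that were not geocoded.       */
/* Assumption: unmatched observations have missing X and Y values.  */
proc print data=geocoded_customers noobs;
   where missing(x) or missing(y);
   var cust_ID address city state zip _matched_;
run;
```

Reviewing these rows alongside the _MATCHED_ value can help separate data-entry problems (for example, a missing ZIP code) from ZIP codes that are not present in the lookup data.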