Match codes can be used to identify members of the same household in a set of demographic
data. In order to do that, you could create a job flow that reads a table of demographic
data (CONTACTS); generates match codes and
cluster numbers for records that have the same last name and street address, and then writes
the match codes and cluster numbers to a
target table (CONTACTS_OFFICE_CLUSTER). The flow would look similar to the following figure.
-
Create an empty SAS Data Integration Studio job.
-
From the Data folder in the Transformations tree, select and drag a Create Match Code transformation
and drop it in the empty job on the Diagram tab
in the Job Editor window.
-
Select and drag the
source table from its folder and drop it before the Create Match Code transformation on
the
Diagram tab.
-
Drag the cursor from the source table to the input port of the Create Match Code transformation.
This action connects
the transformation to the source. In this example, the source is a table of contact
information called CONTACTS, which contains a large number of records. The data has
not been standardized, so the spelling of names and addresses might differ while still
referring to the same entities. The following display depicts the
source data. When the job is run, rows 1004 and 1005 receive the same cluster number, as do rows
1007 and 1008, despite the fact that the data varies in the COMPANY
and ADDRESS rows.
-
From the Transformations tab, under Access, drag a Table Loader transformation into the job and drop it after
the Create Match Code transformation.
-
Select and drag from the transformation's temporary output table to the Table Loader
transformation. This action connects the output of the transformation to the Table
Loader. The Table Loader is used to ensure that the target is always completely overwritten
each time the job is run. This default configuration for the Table Loader is depicted
in the following
display of the Table Loader's Load Technique tab.
-
Select and drag the target table from its folder and drop it after the Table Loader
transformation on the Diagram tab. In this example, the target is named CONTACTS_OFFICE_CLUSTER. The target contains
the same columns as the source, plus a numeric column named CLUSTER and a character
column named MATCH CODE (length
120).
-
Drag the cursor from the output port of the Table Loader transformation to the target
table. This action connects the transformation to the target.
-
To propagate and map
columns, right-click the Create Match Codes transformation and select
Propagate ColumnsTo Selected Transformation's
Sources SourcesFrom Targets. This action maps the source columns to the target and propagates the new columns
in the target into the Create Match Codes transformation.