SAS(R) Data Quality Server 9.2: Reference

The DQSCHEME Procedure

Example 4: Applying Schemes


In this example, the APPLY statements generate cleansed data in the VENDORS_OUT data set. All three schemes are applied before the result is written to the output data set. The locale ENUSA is assumed to be loaded into memory as part of the locale list.
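
The locale requirement can be met before the example runs. As a minimal sketch, the %DQLOAD autocall macro loads ENUSA into memory. The setup-file path shown here is a hypothetical placeholder, not a location taken from this example; substitute the path used at your site.

```sas
/* Load the ENUSA locale into memory before running PROC DQSCHEME.   */
/* The DQSETUPLOC= value is an assumed placeholder, not a real path. */
%dqload(dqlocale=(ENUSA),
        dqsetuploc='c:\path\to\dqsetup.txt');
```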

/* Create filerefs with required suffixes. */
filename city 'c:\my schemes\city.sch.bfd';
filename state 'c:\my schemes\state.sch.bfd';
filename org 'c:\my schemes\org.sch.bfd';

/* Create the input data set. */
data vendors;
  input city $char17. state $char22. company $char36.;
datalines;
Detroit          MI                     Ford Motor
Dallas           Texas                  Wal-mart Inc.
Washington       District of Columbia   Federal Reserve Bank

/* See Example 4: Applying Schemes for the full data set. */

Washington       District of Columbia   Federal Reserve Bank
Atlanta          GEORGIA                Target
;
run;

proc dqscheme data=vendors out=vendors_out bfd;
  create matchdef='City (Scheme Build)'
    var=city scheme=city_scheme locale='ENUSA';
  create matchdef='State (Scheme Build)'
    var=state scheme=state_scheme locale='ENUSA';
  create matchdef='Organization (Scheme Build)'
    var=company scheme=org_scheme locale='ENUSA';
  apply var=city scheme=city_scheme;
  apply var=state scheme=state_scheme;
  apply var=company scheme=org_scheme;
run;

title 'Result after applying all three SAS format schemes';
proc print data=work.vendors_out;
run;

Note that the APPLY statements specify neither a locale nor a scheme lookup method (the SCHEME_LOOKUP= option). Because no locale is specified, the schemes are applied with the ENUSA locale, which was stored in the schemes when they were created. Because no lookup method is specified, the default (SCHEME_LOOKUP=EXACT) is used: when an input value exactly matches a DATA value in the scheme, the corresponding value in the scheme replaces the input value in the output data set. The default scheme apply mode (MODE=PHRASE) is also used, so each input value is compared in its entirety to the DATA values in the scheme.
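
To make the defaults visible, the APPLY statements could be written with the options spelled out. This is a sketch rather than part of the sample program: it reuses the scheme names from the example, assumes the three schemes were already created by the CREATE statements, and writes to a hypothetical output data set named VENDORS_OUT2.

```sas
/* Same behavior as the example, with the default options made explicit. */
/* Assumes the schemes already exist from the earlier CREATE statements. */
/* VENDORS_OUT2 is a hypothetical name chosen for this illustration.     */
proc dqscheme data=vendors out=vendors_out2 bfd;
  apply var=city    scheme=city_scheme  scheme_lookup=exact mode=phrase;
  apply var=state   scheme=state_scheme scheme_lookup=exact mode=phrase;
  apply var=company scheme=org_scheme   scheme_lookup=exact mode=phrase;
run;
```

Writing the options out this way documents the intended behavior in the program itself, so a reader does not need to know the defaults to follow what the step does.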

This example is available in the SAS Sample Library under the name DQAPPLY.