The DQMATCH Procedure

Example 5: Clustering with Multiple CRITERIA Statements


The following example assigns cluster numbers by using two pairs of CRITERIA statements. The CRITERIA statements that share a CONDITION= value are evaluated together as a logical AND, and the two conditions are then combined as a logical OR. As a result, two rows are placed in the same cluster if they match on customer name and address (condition 1) or on organization name and address (condition 2).

  /* Load the ENUSA locale. The system option DQSETUPLOC= is already set. */
   %dqload(dqlocale=(enusa))

   /* Create the input data. ORG is 25 characters wide so that the
      longest value, "Orion Star Corporation", is not truncated. */
   data customer;
      length custid 8 name $ 20 org $ 25 addr $ 20;
      input custid name $char20. org $char25. addr $char20.;
   datalines;
   1 Mr. Robert Smith    Orion Star Corporation   8001 Weston Blvd.
   2                     The Orion Star Corp.     8001 Westin Ave
   3 Bob Smith                                    8001 Weston Parkway
   4 Sandi Booth         Belleview Software       123 N Main Street
   5 Mrs. Sandra Booth   Belleview Inc.           801 Oak Ave.
   6 sandie smith Booth  Orion Star Corp.         123 Maine Street
   7 Bobby J. Smythe     ABC Plumbing             8001 Weston Pkwy
   ;
   run;

   /* Generate the cluster data. Because more than one condition
      is defined, a variable named CLUSTER is created automatically */
   proc dqmatch data=customer
                out=customer_out;
      criteria condition=1 var=name sensitivity=85 matchdef='Name';
      criteria condition=1 var=addr sensitivity=70 matchdef='Address';

      criteria condition=2 var=org  sensitivity=85 matchdef='Organization';
      criteria condition=2 var=addr sensitivity=70 matchdef='Address';
   run;

   /* Print the result. */
   proc print data=customer_out noobs;
   run;

The output is as follows:

custid   name                  org                     addr                 CLUSTER
     4   Sandi Booth           Belleview Software      123 N Main Street          1
     6   sandie smith Booth    Orion Star Corp.        123 Maine Street           1
     1   Mr. Robert Smith      Orion Star Corporation  8001 Weston Blvd.          2
     7   Bobby J. Smythe       ABC Plumbing            8001 Weston Pkwy           2
     3   Bob Smith                                     8001 Weston Parkway        2
     2                         The Orion Star Corp.    8001 Westin Ave            2
     5   Mrs. Sandra Booth     Belleview Inc.          801 Oak Ave.

In the preceding output, the two rows in cluster 1 matched on name and address. The rows in cluster 2 are linked by matches on name and address, on organization and address, or both. The inclusion of Bobby J. Smythe in cluster 2 indicates either a data error or a need to refine the criteria and conditions. The last row in the output (custid 5) did not receive a cluster number because it did not match any other row under either condition.
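
One possible refinement is to compare names more strictly in condition 1 while leaving the other criteria unchanged. The following is a minimal sketch, not a tested recommendation: SENSITIVITY=95 is an illustrative value, and the output data set name CUSTOMER_STRICT is hypothetical. Whether the higher sensitivity separates Bobby J. Smythe from the Smith rows depends on the match definitions in the loaded locale, so inspect the printed result before relying on it.

```sas
/* Sketch: rerun the clustering with a stricter name comparison.
   Assumes the CUSTOMER data set and the ENUSA locale from the
   example above are already available in the session. */
proc dqmatch data=customer
             out=customer_strict;   /* hypothetical data set name */
   /* Condition 1: name AND address. Only the name sensitivity
      changes (85 in the original example, 95 here as an
      illustrative value). */
   criteria condition=1 var=name sensitivity=95 matchdef='Name';
   criteria condition=1 var=addr sensitivity=70 matchdef='Address';

   /* Condition 2: organization AND address, unchanged. */
   criteria condition=2 var=org  sensitivity=85 matchdef='Organization';
   criteria condition=2 var=addr sensitivity=70 matchdef='Address';
run;

/* Print the result so that the clusters can be compared with the
   output of the original run. */
proc print data=customer_strict noobs;
run;
```

Raising sensitivity makes the match codes more specific, so fewer values are treated as equivalent. The trade-off is that legitimate variants of the same name can also stop matching, so compare both outputs before adopting a stricter value.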

Note:   This example is available in the SAS Sample Library under the name DQMLTCND.