Previous Page | Next Page

Importing XML Documents Using an XMLMap

Including a Key Field with Generated Numeric Keys

This example imports the XML document Pharmacy.XML, which contains hierarchical data and is used in the example Importing Hierarchical Data as Related Data Sets. This example continues with the XMLMap by adding a key field with generated numeric key values in order to provide a relationship between the two data sets. (A key field holds unique data in order to identify that record from the other records. For example, account number, product code, and customer name are typical key fields.)

To generate key field values, use the class="ORDINAL" attribute in the COLUMN element in order to create a counter variable. A counter variable keeps track of the number of times the location path, which is specified by the INCREMENT-PATH element, is encountered. The counter variable increments its count by 1 each time the location path is matched. (The counter variable is similar to the _N_ automatic variable in DATA step processing in that it counts the number of observations being read into a SAS data set.)

Note:   When using a counter variable to create a key field for related data sets, you must specify the same location paths for both TABLE elements. Otherwise, the results will not match. Each table must have the same generated key for like-named data elements.  [cautionend]

The following XMLMap imports Pharmacy.XML document as two SAS data sets that have related information and also creates a key field that holds generated numeric key values:

<?xml version="1.0" encoding="UTF-8" ?>

<!-- ############################################################ -->
<!-- 2003-04-15T10:55:43 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 9.1.10.20030413.1400 -->
<!-- ############################################################ -->

<SXLEMAP version="1.2" name="SXLEMAP">

  <!-- ############################################################ -->
  <TABLE name="PERSON">
    <TABLE-PATH syntax="XPATH">/PHARMACY/PERSON</TABLE-PATH> 1 

    <COLUMN name="KEY" retain="YES" class="ORDINAL"> 2 
      <INCREMENT-PATH syntax="XPATH">/PHARMACY/PERSON</INCREMENT-PATH>
      <TYPE>numeric</TYPE>
      <DATATYPE>integer</DATATYPE>
      <FORMAT width="3">Z</FORMAT>
    </COLUMN>

    <COLUMN name="NAME">
      <PATH  syntax="XPATH">/PHARMACY/PERSON/NAME</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>11</LENGTH>
    </COLUMN>

    <COLUMN name="STREET">
      <PATH  syntax="XPATH">/PHARMACY/PERSON/STREET</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>18</LENGTH>
    </COLUMN>

    <COLUMN name="CITY">
      <PATH  syntax="XPATH">/PHARMACY/PERSON/CITY</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>6</LENGTH>
    </COLUMN>

  </TABLE>

  <!-- ############################################################ -->
  <TABLE name="PRESCRIPTION">
    <TABLE-PATH syntax="XPATH">/PHARMACY/PERSON/PRESCRIPTION</TABLE-PATH> 3 

    <COLUMN name="KEY" retain="YES" class="ORDINAL"> 4 
      <INCREMENT-PATH syntax="XPATH">/PHARMACY/PERSON</INCREMENT-PATH>
      <TYPE>numeric</TYPE>
      <DATATYPE>integer</DATATYPE>
      <FORMAT width="3">Z</FORMAT>    
    </COLUMN>

    <COLUMN name="NUMBER">
      <PATH  syntax="XPATH">/PHARMACY/PERSON/PRESCRIPTION/NUMBER</PATH>
      <TYPE>numeric</TYPE>
      <DATATYPE>integer</DATATYPE>
    </COLUMN>

    <COLUMN name="DRUG">
      <PATH  syntax="XPATH">/PHARMACY/PERSON/PRESCRIPTION/DRUG</PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>12</LENGTH>
    </COLUMN>

  </TABLE>

</SXLEMAP>

The following explains the XMLMap syntax that generates the key fields:

  1. In the TABLE element that defines the Person data set, the TABLE-PATH element identifies the observation boundary for the data set. The location path generates a new observation each time a PERSON element is read.

  2. For the Person data set, the COLUMN element for the Key variable contains the class="ORDINAL" attribute as well as the INCREMENT-PATH element. The XML engine follows this process in order to generate the key field values for the Person data set:

    1. When the XML engine encounters the <PERSON> start tag, it reads the value into the input buffer, and then increments the value for the Key variable by 1.

    2. The XML engine continues reading values into the input buffer until it encounters the </PERSON> end tag, at which time it writes the completed input buffer to the SAS data set as one observation.

    3. The process is repeated for each <PERSON> start tag (from INCREMENT-PATH) and </PERSON> end tag (from TABLE-PATH) sequence.

    4. The result is four variables and two observations.

  3. In the TABLE element that defines the Prescription data set, the TABLE-PATH element identifies the observation boundary for the data set. The location path generates a new observation each time a PRESCRIPTION element is read.

  4. For the Prescription data set, the COLUMN element for the Key variable contains the class="ORDINAL" attribute as well as the INCREMENT-PATH element.

    The XML engine follows this process in order to generate the key field values for the Prescription data set:

    1. When the XML engine encounters the <PERSON> start tag, it reads the value into the input buffer, and then increments the value for the Key variable by 1.

    2. The XML engine continues reading values into the input buffer until it encounters the </PRESCRIPTION> end tag, at which time it writes the completed input buffer to the SAS data set as one observation.

      Note:   Because the location paths for the counter variables must be the same for both TABLE elements, the behavior of the XML engine for the Prescription table Key variable is the same as the Person table Key variable. While the XML engine tracks the occurrence of a PERSON tag as a key for both counter variables, the observations are derived from different TABLE-PATH locations.  [cautionend]

    3. The process is repeated for each <PERSON> start tag (from INCREMENT-PATH) and </PRESCRIPTION> end tag (from TABLE-PATH) sequence.

    4. The result is three variables and three observations.

The following SAS statements import the XML document:

filename pharm 'c:\My Documents\XML\Pharmacy.xml';
filename map 'c:\My Documents\XML\PharmacyOrdinal.map';
libname pharm xml92 xmlmap=map;

Here is PROC PRINT output for both of the imported SAS data sets with a numeric key:

PROC PRINT Output for PHARM.PERSON

                     The SAS System                                         1

     Obs         KEY    NAME           STREET                CITY

       1         001    Brad Martin    11900 Glenda Court    Austin
       2         002    Jim Spano      1611 Glengreen        Austin

PROC PRINT Output for PHARM.PRESCRIPTION

                    The SAS System                                          2

              Obs         KEY      NUMBER    DRUG

                1         001        1234    Tetracycline
                2         001        1245    Lomotil
                3         002        1268    Nexium

Previous Page | Next Page | Top of Page