Importing Hierarchical Data as Related Data Sets

XML documents often contain hierarchical data in that the data is structured into different levels like a company organization chart. Hierarchical structures are one-to-many relationships. Top items have one or more items below it (for example, customer to orders).
This example explains how to define an XMLMap in order to import an XML document as two data sets that have related information.
Here is the XML document Pharmacy.XML. The file contains hierarchical data with related entities in the form of individual customers and their prescriptions. Each customer can have one or multiple prescriptions. Notice that PRESCRIPTION elements are nested within each <PERSON> start tag and </PERSON> end tag:
<?xml version="1.0" ?>
<PHARMACY>
  <PERSON>
    <NAME>Brad Martin</NAME>
    <STREET>11900 Glenda Court</STREET>
    <CITY>Austin</CITY>
   <PRESCRIPTION>
    <NUMBER>1234</NUMBER>
    <DRUG>Tetracycline</DRUG>
  </PRESCRIPTION>
  <PRESCRIPTION>
    <NUMBER>1245</NUMBER>
    <DRUG>Lomotil</DRUG>
  </PRESCRIPTION>
  </PERSON>
 <PERSON>
    <NAME>Jim Spano</NAME>
    <STREET>1611 Glengreen</STREET>
    <CITY>Austin</CITY>
  <PRESCRIPTION>
    <NUMBER>1268</NUMBER>
    <DRUG>Nexium</DRUG>
  </PRESCRIPTION>
 </PERSON>
</PHARMACY>
To import separate data sets, one describing the customers and the other containing prescription information, a relation between each customer and associated prescriptions must be designated in order to know which prescriptions belong to each customer.
An XMLMap defines how to translate the XML markup into two SAS data sets. The Person data set imports the name and address of each customer, and the Prescription data set imports the customer's name, prescription number, and drug. Notations in the XMLMap syntax are explained below.
Note: The XMLMap was generated by using SAS XML Mapper.
<?xml version="1.0" encoding="UTF-8"?>
<!-- ############################################################ -->
<!-- 2011-01-10T14:39:38 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 903000.1.0.20101208190000_v930 -->
<!-- ############################################################ -->
<!-- ###  Validation report                                   ### -->
<!-- ############################################################ -->
<!-- XMLMap validation completed successfully. -->
<!-- ############################################################ -->
<SXLEMAP name="AUTO_GEN" version="2.1"> 1

    <NAMESPACES count="0"/>

    <!-- ############################################################ -->
    <TABLE description="PERSON" name="PERSON"> 2
        <TABLE-PATH syntax="XPath">/PHARMACY/PERSON</TABLE-PATH>

        <COLUMN name="NAME"> 3
            <PATH syntax="XPath">/PHARMACY/PERSON/NAME</PATH>
            <TYPE>character</TYPE>
            <DATATYPE>string</DATATYPE>
            <LENGTH>11</LENGTH>
        </COLUMN>

        <COLUMN name="STREET"> 3
            <PATH syntax="XPath">/PHARMACY/PERSON/STREET</PATH>
            <TYPE>character</TYPE>
            <DATATYPE>string</DATATYPE>
            <LENGTH>18</LENGTH>
        </COLUMN>

        <COLUMN name="CITY"> 3
            <PATH syntax="XPath">/PHARMACY/PERSON/CITY</PATH>
            <TYPE>character</TYPE>
            <DATATYPE>string</DATATYPE>
            <LENGTH>6</LENGTH>
        </COLUMN>

    </TABLE>

    <!-- ############################################################ -->
    <TABLE description="PRESCRIPTION" name="PRESCRIPTION"> 4
        <TABLE-PATH syntax="XPath">/PHARMACY/PERSON/PRESCRIPTION</TABLE-PATH>

        <COLUMN name="NUMBER" retain="YES"> 6
            <PATH syntax="XPath">/PHARMACY/PERSON/PRESCRIPTION/NUMBER</PATH>
            <TYPE>numeric</TYPE>
            <DATATYPE>integer</DATATYPE>
        </COLUMN>

        <COLUMN name="DRUG"> 6
            <PATH syntax="XPath">/PHARMACY/PERSON/PRESCRIPTION/DRUG</PATH>
            <TYPE>character</TYPE>
            <DATATYPE>string</DATATYPE>
            <LENGTH>12</LENGTH>
        </COLUMN>

    </TABLE>

</SXLEMAP>
1 SXLEMAP is the root-enclosing element for the two SAS data set definitions.
2 First TABLE element defines the Person data set.
3 COLUMN elements contain the attributes for the Name, Street, and City variables in the Person data set.
4 Second TABLE element defines the Prescription data set.
5 COLUMN element contains the attributes for the Name variable in the Prescription data set. Specifying the retain="YES" attribute causes the name to be held for each observation until it is replaced by a different value. (The retain= attribute is like the SAS DATA step RETAIN statement, which causes a variable to retain its value from one iteration of the DATA step to the next.)
6 COLUMN elements contain the attributes for the Number and Drug variables in the Prescription data set.
The following SAS statements import the XML document and specify the XMLMap:
filename pharm 'c:\My Documents\Pharmacy.xml';
filename map 'c:\My Documents\Pharmacy.map';

libname pharm xmlv2 xmlmap=map;
quit;
The DATASETS procedure verifies that SAS interprets the XML document Pharmacy.XML as two SAS data sets: PHARM.PERSON and PHARM.PRESCRIPTION.
proc datasets library=pharm;
quit;
DATASETS Procedure Output for PHARM Library
DATASETS Procedure Output for PHARM Library
Here is the PRINT procedure output for both of the imported SAS data sets.
PRINT Procedure Output for PHARM.PERSON
PRINT Procedure Output for PHARM.PERSON
PRINT Procedure Output for PHARM.PRESCRIPTION
PRINT Procedure Output for PHARM.PRESCRIPTION