Importing XML Documents Using an XMLMap

Specifying a Location Path on the PATH Element

The XMLMap PATH element supports several XPath forms to specify a location path. The location path tells the XML engine where in the XML document to locate and access a specific tag for the current variable. In addition, the location path tells the XML engine to perform a function, which is determined by the XPath form, to retrieve the value for the variable.

This example imports an XML document and illustrates each of the supported XPath forms, which include three element forms and two attribute forms.

First, here is the XML document NHL.XML to be imported:

<?xml version="1.0" encoding="iso-8859-1" ?>
<NHL>
  <CONFERENCE> Eastern 
    <DIVISION> Southeast  
      <TEAM founded="1999" abbrev="ATL"> Thrashers </TEAM>  
      <TEAM founded="1997" abbrev="CAR"> Hurricanes </TEAM>  
      <TEAM founded="1993" abbrev="FLA"> Panthers </TEAM>  
      <TEAM founded="1992" abbrev="TB" > Lightning </TEAM>  
      <TEAM founded="1974" abbrev="WSH"> Capitals </TEAM>  
   </DIVISION>
 </CONFERENCE> 
</NHL>

Here is the XMLMap used to import the XML document, with notations for each XPath form on the PATH element:

<?xml version="1.0" ?>
<SXLEMAP version="1.9">
  <TABLE name="TEAMS">
    <TABLE-PATH>
      /NHL/CONFERENCE/DIVISION/TEAM
    </TABLE-PATH>

    <COLUMN name="ABBREV">
      <!-- 1 -->
      <PATH>
        /NHL/CONFERENCE/DIVISION/TEAM/@abbrev
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>3</LENGTH>
    </COLUMN>

    <COLUMN name="FOUNDED">
      <!-- 2 -->
      <PATH>
        /NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>

    <COLUMN name="CONFERENCE" retain="YES">
      <!-- 3 -->
      <PATH>
        /NHL/CONFERENCE
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>

    <COLUMN name="TEAM">
      <!-- 4 -->
      <PATH>
        /NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>

    <COLUMN name="TEAM5">
      <!-- 5 -->
      <PATH>
        /NHL/CONFERENCE/DIVISION/TEAM[position()=5]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>
  </TABLE>
</SXLEMAP>

  1. The Abbrev variable uses the attribute form that selects values from a specific attribute. The XML engine scans the XML markup until it finds the TEAM element. The XML engine retrieves the value from the abbrev= attribute, which results in each team abbreviation.

  2. The Founded variable uses the attribute form that conditionally selects from a specific attribute based on the value of another attribute. The XML engine scans the XML markup until it finds the TEAM element. The XML engine retrieves the value from the founded= attribute where the value of the abbrev= attribute is ATL, which results in the value 1999. The two attributes must be for the same element.

  3. The Conference variable uses the element form that selects PCDATA from a named element. The XML engine scans the XML markup until it finds the CONFERENCE element. The XML engine retrieves the value between the <CONFERENCE> start tag and the </CONFERENCE> end tag, which results in the value Eastern.

  4. The Team variable uses the element form that conditionally selects PCDATA from a named element. The XML engine scans the XML markup until it finds the TEAM element where the value of the founded= attribute is 1993. The XML engine retrieves the value between the <TEAM> start tag and the </TEAM> end tag, which results in the value Panthers.

  5. The Team5 variable uses the element form that conditionally selects PCDATA from a named element based on a specific occurrence of the element. The position function tells the XML engine to scan the XML markup until it finds the fifth occurrence of the TEAM element. The XML engine retrieves the value between the <TEAM> start tag and the </TEAM> end tag, which results in the value Capitals.
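The five selections can also be checked outside SAS. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree. That module implements only a subset of XPath: it cannot return attribute values from a path and does not support position(), so the sketch selects the TEAM elements and reads attributes with get(), and writes the positional form as [5]. The variable names are hypothetical, and the sketch illustrates only what each location path selects from NHL.XML, not how the SAS XML engine works internally.

```python
import xml.etree.ElementTree as ET

# NHL.XML from above (XML declaration omitted for brevity).
NHL_XML = """<NHL>
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <TEAM founded="1999" abbrev="ATL"> Thrashers </TEAM>
      <TEAM founded="1997" abbrev="CAR"> Hurricanes </TEAM>
      <TEAM founded="1993" abbrev="FLA"> Panthers </TEAM>
      <TEAM founded="1992" abbrev="TB" > Lightning </TEAM>
      <TEAM founded="1974" abbrev="WSH"> Capitals </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>"""

root = ET.fromstring(NHL_XML)          # root is the NHL element
TEAMS = "CONFERENCE/DIVISION/TEAM"     # path relative to NHL

# 1. Attribute form: the abbrev= value of every TEAM element.
abbrevs = [t.get("abbrev") for t in root.findall(TEAMS)]

# 2. Conditional attribute form: founded= where abbrev= is ATL.
founded_atl = [t.get("founded") for t in root.findall(TEAMS + "[@abbrev='ATL']")]

# 3. Element form: text that directly follows the CONFERENCE start tag.
conference = root.find("CONFERENCE").text.strip()

# 4. Conditional element form: TEAM text where founded= is 1993.
team_1993 = [t.text.strip() for t in root.findall(TEAMS + "[@founded='1993']")]

# 5. Positional element form: the fifth TEAM within its parent.
team5 = [t.text.strip() for t in root.findall(TEAMS + "[5]")]

print(abbrevs, founded_atl, conference, team_1993, team5)
```

The strip() calls remove the blanks that surround each value in the sample document.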

The following SAS statements import the XML document NHL.XML and specify the XMLMap named NHL1.MAP. The PRINT procedure shows the resulting variables with selected values:

filename NHL 'C:\My Documents\XML\NHL.xml';
filename MAP 'C:\My Documents\XML\NHL1.map';
libname NHL xml xmlmap=MAP;

proc print data=NHL.TEAMS noobs; 
run;

PROC PRINT of Data Set NHL.TEAMS

                          The SAS System                                         1

  ABBREV    FOUNDED       CONFERENCE    TEAM          TEAM5

   ATL      1999          Eastern
   CAR                    Eastern
   FLA                    Eastern       Panthers
   TB                     Eastern
   WSH                    Eastern                     Capitals
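
The listing has one row per TEAM element because TABLE-PATH names TEAM, and each conditional column is filled only in the row where its path matches. CONFERENCE appears in every row because its column specifies retain="YES", which keeps the current value until a new nonmissing value replaces it. As a rough sketch of that row assembly, again in Python with xml.etree.ElementTree and hypothetical variable names (an illustration of the result, not of the engine's implementation):

```python
import xml.etree.ElementTree as ET

# NHL.XML from above (XML declaration omitted for brevity).
NHL_XML = """<NHL>
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <TEAM founded="1999" abbrev="ATL"> Thrashers </TEAM>
      <TEAM founded="1997" abbrev="CAR"> Hurricanes </TEAM>
      <TEAM founded="1993" abbrev="FLA"> Panthers </TEAM>
      <TEAM founded="1992" abbrev="TB" > Lightning </TEAM>
      <TEAM founded="1974" abbrev="WSH"> Capitals </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>"""

root = ET.fromstring(NHL_XML)
rows = []
retained_conference = ""  # models retain="YES" on the CONFERENCE column

for conference in root.findall("CONFERENCE"):
    # A new CONFERENCE value replaces the retained one.
    retained_conference = (conference.text or "").strip()
    for division in conference.findall("DIVISION"):
        # One row per TEAM element, as named by TABLE-PATH.
        for position, team in enumerate(division.findall("TEAM"), start=1):
            name = (team.text or "").strip()
            rows.append({
                "ABBREV": team.get("abbrev", ""),
                # Filled only where abbrev= is ATL.
                "FOUNDED": team.get("founded", "") if team.get("abbrev") == "ATL" else "",
                "CONFERENCE": retained_conference,
                # Filled only where founded= is 1993.
                "TEAM": name if team.get("founded") == "1993" else "",
                # Filled only for the fifth TEAM in its DIVISION.
                "TEAM5": name if position == 5 else "",
            })

for row in rows:
    print("{ABBREV:<8}{FOUNDED:<10}{CONFERENCE:<12}{TEAM:<12}{TEAM5}".format(**row))
```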
