The XMLMap PATH element supports several XPath forms to specify a
location path. The location path tells the XML engine where in the
XML document to locate and access a specific tag for the current
variable. The XPath form also determines the function that the engine
performs to retrieve the value for the variable.
This example imports an XML document and illustrates each of the
supported XPath forms, which include three element forms and two
attribute forms.
Here is the XML document NHLShort.XML to be imported:
<?xml version="1.0" encoding="iso-8859-1" ?>
<NHL>
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <TEAM founded="1999" abbrev="ATL"> Thrashers </TEAM>
      <TEAM founded="1997" abbrev="CAR"> Hurricanes </TEAM>
      <TEAM founded="1993" abbrev="FLA"> Panthers </TEAM>
      <TEAM founded="1992" abbrev="TB" > Lightning </TEAM>
      <TEAM founded="1974" abbrev="WSH"> Capitals </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>
Here is the XMLMap used
to import the XML document, with notations for each XPath form on
the PATH element:
<?xml version="1.0" ?>
<SXLEMAP version="1.2">
  <TABLE name="TEAMS">
    <TABLE-PATH syntax="XPath">
      /NHL/CONFERENCE/DIVISION/TEAM
    </TABLE-PATH>
    <COLUMN name="ABBREV">
      <!-- 1 -->
      <PATH syntax="XPath">
        /NHL/CONFERENCE/DIVISION/TEAM/@abbrev
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>3</LENGTH>
    </COLUMN>
    <COLUMN name="FOUNDED">
      <!-- 2 -->
      <PATH syntax="XPath">
        /NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>
    <COLUMN name="CONFERENCE" retain="YES">
      <!-- 3 -->
      <PATH syntax="XPath">
        /NHL/CONFERENCE
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>
    <COLUMN name="TEAM">
      <!-- 4 -->
      <PATH syntax="XPath">
        /NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>
    <COLUMN name="TEAM5">
      <!-- 5 -->
      <PATH syntax="XPath">
        /NHL/CONFERENCE/DIVISION/TEAM[position()=5]
      </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>STRING</DATATYPE>
      <LENGTH>10</LENGTH>
    </COLUMN>
  </TABLE>
</SXLEMAP>
1. The Abbrev variable uses the attribute form that selects values from a
   specific attribute. The engine scans the XML markup until it finds
   the TEAM element. The engine retrieves the value from the abbrev=
   attribute, which results in each team abbreviation.
2. The Founded variable uses the attribute form that conditionally selects
   from a specific attribute based on the value of another attribute.
   The engine scans the XML markup until it finds the TEAM element. The
   engine retrieves the value from the founded= attribute where the value
   of the abbrev= attribute is ATL, which results in the value 1999.
   The two attributes must be for the same element.
3. The Conference variable uses the element form that selects PCDATA from
   a named element. The engine scans the XML markup until it finds the
   CONFERENCE element. The engine retrieves the value between the
   <CONFERENCE> start tag and the </CONFERENCE> end tag, which results in
   the value Eastern.
4. The Team variable uses the element form that conditionally selects PCDATA
   from a named element. The engine scans the XML markup until it finds
   the TEAM element where the value of the founded= attribute is 1993.
   The engine retrieves the value between the <TEAM> start tag
   and the </TEAM> end tag, which results in the value Panthers.
5. The Team5 variable uses the element form that conditionally selects PCDATA
   from a named element based on a specific occurrence of the element.
   The position function tells the engine to scan the XML markup until
   it finds the fifth occurrence of the TEAM element. The engine retrieves
   the value between the <TEAM> start tag and the </TEAM>
   end tag, which results in the value Capitals.
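The five selections described above can be cross-checked outside SAS with a short Python sketch. This is only an analogy, assuming Python's standard xml.etree.ElementTree module; it does not reproduce how the SAS XML engine builds observations. ElementTree's XPath subset cannot select an attribute directly, so forms 1 and 2 are approximated by selecting the TEAM element and then reading the attribute. The names NHL_XML and pcdata are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from this section, embedded so the sketch is
# self-contained (hypothetical name: NHL_XML).
NHL_XML = b"""<?xml version="1.0" encoding="iso-8859-1" ?>
<NHL>
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <TEAM founded="1999" abbrev="ATL"> Thrashers </TEAM>
      <TEAM founded="1997" abbrev="CAR"> Hurricanes </TEAM>
      <TEAM founded="1993" abbrev="FLA"> Panthers </TEAM>
      <TEAM founded="1992" abbrev="TB" > Lightning </TEAM>
      <TEAM founded="1974" abbrev="WSH"> Capitals </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>
"""

root = ET.fromstring(NHL_XML)  # root is the <NHL> element
teams = root.findall("./CONFERENCE/DIVISION/TEAM")


def pcdata(elem):
    """Return the text that follows the start tag, with whitespace trimmed.

    Hypothetical helper; trimming is an assumption made for readability.
    """
    return (elem.text or "").strip()


# 1: attribute form, the abbrev= value of every TEAM element.
abbrevs = [t.get("abbrev") for t in teams]

# 2: conditional attribute form, founded= where abbrev= is ATL
#    on the same element.
founded = [t.get("founded") for t in teams if t.get("abbrev") == "ATL"]

# 3: element form, the PCDATA of the CONFERENCE element.
conference = pcdata(root.find("./CONFERENCE"))

# 4: conditional element form, the TEAM whose founded= value is 1993.
team = pcdata(root.find("./CONFERENCE/DIVISION/TEAM[@founded='1993']"))

# 5: positional element form, the fifth TEAM under its parent DIVISION.
team5 = pcdata(root.find("./CONFERENCE/DIVISION/TEAM[5]"))

print(abbrevs, founded, conference, team, team5)
```

The values printed match the ones named in the numbered notes: every team abbreviation, 1999, Eastern, Panthers, and Capitals.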
The following SAS statements
import the XML document NHLShort.XML and specify the XMLMap named
NHL1.MAP. The PRINT procedure shows the resulting variables with selected
values:
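/* Fileref for the XML document to import */
filename NHL 'C:\My Documents\XML\NHLShort.xml';
/* Fileref for the XMLMap */
filename MAP 'C:\My Documents\XML\NHL1.map';
/* Assign the XML engine; XMLMAP= names the MAP fileref */
libname NHL xml xmlmap=MAP;
/* Print the TEAMS table that the XMLMap defines */
proc print data=NHL.TEAMS noobs;
run;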
PRINT Procedure Output Showing Resulting Variables with Selected
Values