Using the SAS 9.0 XML LIBNAME Engine

This article summarizes the SAS 9.0 enhancements for the XML LIBNAME engine and introduces the new XML Atlas application.



So What's New?

Since the introduction of the SAS XML LIBNAME engine (SXLE) in SAS Release 8.1, each subsequent SAS release has improved the importing and exporting capabilities, providing enhancements and new functionality. SAS Version 9 continues the trend!

In SAS Version 9, SXLE offers enhancements for both importing and exporting XML documents. This article covers XMLMap syntax Version 1.1 and the new XML Atlas application.


XMLMap Syntax Version 1.1

To import an XML document, SXLE requires a specific physical XML structure so that the engine can identify columns of data from collections of rows. If your XML document does not import successfully, you can tell SXLE how to interpret its markup. To do so, you create a separate XML document, called an XMLMap file, that contains XMLMap syntax, which is itself XML markup. The XMLMap syntax tells SXLE how to translate the XML markup into SAS data set(s), variables (columns), and observations (rows).

For SAS Version 9, the XMLMap syntax is Version 1.1, with several enhancements that are summarized below.

Valid XML Markup

The XMLMap elements now comply with the World Wide Web Consortium (W3C) recommended usage guidelines.

SXLEMAP Element

The SXLEMAP element, which is the primary (root) enclosing element to contain the definition of the data set(s), accepts attributes for the syntax version number, the name of the XMLMap file, and a description. For example:

<SXLEMAP version="1.1" name="Myxmlmap" description="sample XMLMap">

The version attribute specifies the version of the XMLMap syntax. SAS Version 9 upgrades XMLMap syntax to Version 1.1. However, to use the Version 1.1 syntax, you must specify the attribute version="1.1". The default is 1.0 and is retained for compatibility with the prior release of XMLMap.

It is recommended that you update existing XMLMap files to Version 1.1.

Tip: To automatically update an XMLMap file to Version 1.1, load the Version 1.0 XMLMap file into XML Atlas, then save the file.

TABLE-PATH Element

The TABLE-PATH element is renamed from TABLE_XPATH. The element specifies a location path that tells SXLE where in the XML document to locate and access specific elements in order to collect variables for the SAS data set. The location path defines the repeating element instances in the XML document, which is the SAS data set observation boundary. The observation boundary is translated into a collection of rows with a constant set of columns.

TABLE-PATH accepts a syntax type attribute. For Version 1.1, the supported syntax is a valid XPath construction in compliance with the W3C. For example:

<TABLE-PATH syntax="xpath"> /rss/channel </TABLE-PATH>

CAUTION: Specifying the table location path, which is the observation boundary, can be tricky due to start-tag and end-tag pairing. The table location path determines which end tag causes SXLE to write the completed input buffer to the SAS data set. If you do not identify the appropriate end tag, the result could be concatenated data instead of separate observations, or an unexpected set of columns. See "Why is specifying the observation boundary tricky?"

TABLE-END-PATH Element

The TABLE-END-PATH element is renamed from TABLE_END_XPATH. It is an optional optimization element that saves resources by stopping the processing of the XML document before the end of the file. By default, processing continues until the last end tag in the XML document. If you specify TABLE-END-PATH, the location path tells SXLE which element in the XML document marks the point at which to stop processing. Specifying a location to stop processing is useful for XML documents that are hierarchical, but it is generally not appropriate for repeating instance data. Note that the TABLE-END-PATH element does not affect the observation boundary; that is determined by the TABLE-PATH element.

TABLE-END-PATH accepts a syntax type attribute and a beginend attribute that specifies whether processing stops when the element start tag or the element end tag is encountered. For Version 1.1, the supported syntax is a valid XPath construction in compliance with the W3C. For example:

<TABLE-END-PATH syntax="xpath" beginend="Begin"> /rss/channel/item </TABLE-END-PATH>
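
As a minimal sketch of how the two location paths might work together, the fragment below defines a table that collects one channel-level value and stops reading at the first item. It reuses the RSS paths from the examples above. The TABLE and COLUMN wrappers, the TYPE, DATATYPE, and LENGTH child elements, and the names and length chosen here are assumptions made for illustration; they are not output from SXLE or XML Atlas.

```xml
<?xml version="1.0" ?>
<SXLEMAP version="1.1" name="ChannelInfo" description="illustrative sketch">
  <!-- Hypothetical table: one observation per channel -->
  <TABLE name="CHANNEL">
    <!-- Observation boundary: each /rss/channel instance is one row -->
    <TABLE-PATH syntax="xpath"> /rss/channel </TABLE-PATH>
    <!-- Optimization only: stop reading at the first item start tag -->
    <TABLE-END-PATH syntax="xpath" beginend="Begin"> /rss/channel/item </TABLE-END-PATH>
    <COLUMN name="TITLE">
      <PATH syntax="xpath"> /rss/channel/title </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <!-- Assumed length; adjust to the data -->
      <LENGTH>200</LENGTH>
    </COLUMN>
  </TABLE>
</SXLEMAP>
```

The stop location changes only how much of the document is read. The row boundary still comes from TABLE-PATH.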

COLUMN Element

The COLUMN element has a new attribute ordinal="NO|YES". The attribute determines whether the variable is a counter variable (similar to the _N_ automatic variable in SAS DATA step processing) that keeps track of the number of times the location path specified by the INCREMENT-PATH element is encountered. The counter variable increments its count by 1 each time the path is matched. Counters can be useful for identifying individual occurrences of like-named data elements or for counting observations. The value for the ordinal= attribute also determines which column location path to use for collecting the column's values. The default is NO.

NO
determines that the variable is not a counter variable, requires the PATH element, and does not allow the INCREMENT-PATH and RESET-PATH elements.
YES
determines that the variable is a counter variable, requires the INCREMENT-PATH with the RESET-PATH element optional, and does not allow the PATH element.

See Creating a Counter Variable for an example.

PATH Element

The PATH element is renamed from XPATH. The element specifies a location path that tells SXLE where in the XML document to locate and access a specific tag for the current variable, then perform a function as determined by the location path form (three forms are supported) in order to retrieve the value for the variable.

PATH accepts a syntax type attribute. For Version 1.1, the supported syntax is a valid XPath construction in compliance with the W3C. For example:

<PATH syntax="xpath"> /rss/channel/title </PATH>

For Version 9, whether PATH is required or not allowed is determined by the ordinal= attribute for the COLUMN element: if ordinal="NO", which is the default, PATH is required; if ordinal="YES", PATH is not allowed and the INCREMENT-PATH element is required.

For more information on the PATH element, see "For the XMLMap PATH element, what XPath forms are supported?"

INCREMENT-PATH Element

The INCREMENT-PATH element is a new element that specifies a location path for a counter variable. The location path tells SXLE where in the XML document to increment the accumulated value for the counter variable by 1.

The element accepts a syntax type attribute and a beginend attribute that specifies whether the counter is incremented when the element start tag or the element end tag is encountered. For Version 1.1, the supported syntax for the location path is a valid XPath construction in compliance with the W3C. For example:

<INCREMENT-PATH syntax="xpath" beginend="Begin">

You establish the counter variable by specifying the COLUMN element attribute ordinal="YES". See COLUMN Element.

RESET-PATH Element

The RESET-PATH element is a new element that specifies a location path for a counter variable. The location path tells SXLE where in the XML document to reset the accumulated value for the counter variable to 0. The RESET-PATH element is optional.

The element accepts a syntax type attribute and a beginend attribute that specifies whether the counter is reset when the element start tag or the element end tag is encountered. For Version 1.1, the supported syntax for the location path is a valid XPath construction in compliance with the W3C. For example:

<RESET-PATH syntax="xpath" beginend="Begin">

You establish a counter variable by specifying the COLUMN element attribute ordinal="YES". See COLUMN Element.
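
To show how the counter-related pieces fit together, here is a hedged sketch of a single counter column. It numbers the items within each channel, using the RSS paths from the earlier examples. The column name ITEMNO, the TYPE and DATATYPE child elements, and their values are assumptions for illustration only.

```xml
<!-- Fragment of a TABLE definition; the enclosing SXLEMAP and TABLE elements are omitted -->
<COLUMN name="ITEMNO" ordinal="YES">
  <!-- Add 1 each time an item start tag is encountered -->
  <INCREMENT-PATH syntax="xpath" beginend="Begin"> /rss/channel/item </INCREMENT-PATH>
  <!-- Optional: return the count to 0 each time a channel start tag is encountered -->
  <RESET-PATH syntax="xpath" beginend="Begin"> /rss/channel </RESET-PATH>
  <!-- Assumed type information for a counter -->
  <TYPE>numeric</TYPE>
  <DATATYPE>integer</DATATYPE>
</COLUMN>
```

Because ordinal="YES" is specified, the column has no PATH element. Omitting RESET-PATH would let the count keep accumulating for the whole document.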


XML Atlas

What if you don't want to type the XML markup for an XMLMap file by hand? With Version 9, you can use the new XML Atlas Java application to generate the XMLMap file.

Note: XML Atlas is preproduction for SAS Version 9.

What is XML Atlas?

SAS developers really like acronyms, so the name Atlas actually stands for Assistive Technology for Leveraging Acquisition via SXLE!

XML Atlas assists you in creating and modifying XMLMap files for use by SXLE. XML Atlas provides a graphical interface that you use to generate the appropriate XML markup. XML Atlas analyzes the structure of an XML document and generates basic XML markup for the XMLMap file.

The interface consists of windows, a menu bar, and a tool bar. Using XML Atlas, you can display an XML document, create and modify an XMLMap file, and generate example SAS programs.

Default image of XML Atlas interface

Using the Windows

The XML window and the Map window are the two primary windows. The XML window, which is on the left, displays an XML document in a tree structure. The Map window, which is on the right, displays an XMLMap file in a tree structure. The map tree displays three layers: the top level is the map itself, the second tier contains the tables, and the leaf nodes are the columns. The detail area at the top displays information about the currently selected item, such as attributes for the table or column. The information is subdivided into tabs.

The small windows on the bottom display generated SAS code, the XMLMap file, and an XML document.

Using the Menu Bar

The menu bar provides pull-down menus for requesting functionality. For example, select the File menu, then Open XML to display a browser so that you can select an XML document to open.

XML Atlas menu bar

File menu
provides options for opening, saving, and closing an XML document, an XMLMap file, and a SAS code file, and an option to close XML Atlas.
Edit menu
provides options for deleting, copying, and pasting items in an XMLMap file.
View menu
provides options for opening and closing the three source windows (at the bottom of the interface) and an option to generate code.
Window menu
provides options for controlling the arrangement of the three source windows (at the bottom of the interface), that is, arranging horizontally, vertically, or cascading.
Help menu
provides options for displaying the online help and XML Atlas version information.

Using the Tool Bar

The tool bar contains shortcuts for several items on the menu bar. For example, the first icon from the left is the Open XML icon. Select it to display a browser so that you can select an XML document to open.

XML Atlas tool bar

An Example of Creating an XMLMap File

Here's a simple example that walks you through using XML Atlas in order to create an XMLMap file. For the following XML document, an XMLMap file is necessary because the XML does not adhere to the physical structure that SXLE requires. Without an XMLMap file, SXLE would import a data set named FORD with columns ROW0, MODEL0, YEAR0, ROW1, MODEL1, YEAR1, and so on. (For an explanation as to why an XMLMap file is needed and more information on creating the XMLMap file, see Determining the Observation Boundary in Order to Avoid Concatenated Data.)
<?xml version="1.0" encoding="windows-1252" ?>
<VEHICLES>
  <FORD>
    <ROW>
      <Model>Mustang</Model>
      <Year>1965</Year>
    </ROW>
    <ROW>
      <Model>Explorer</Model>
      <Year>1982</Year>
    </ROW>
    <ROW>
      <Model>Taurus</Model>
      <Year>1998</Year>
    </ROW>
    <ROW>
      <Model>F150</Model>
      <Year>2000</Year>
    </ROW>
  </FORD>
</VEHICLES>

Display the XML Document in XML Atlas

  1. From the menu bar, select File, then Open XML. A browser displays.

  2. From the displayed browser, select the XML document.
The XML document displays in the primary XML window as well as in the XML source window at the bottom. In addition, XML Atlas begins generating the SAS code and the XMLMap file.

Tip: The primary XML window displays the document in a tree structure. Click the + signs in order to open the elements.

XML Atlas displaying XML document

Create the XMLMap File

These steps will generate an XMLMap file for the displayed XML document:
  1. In the primary Map window, XML Atlas automatically provides the SXLEMAP item, which corresponds to the XMLMap syntax SXLEMAP element. To specify attributes for SXLEMAP, select the item, then use the tabs at the top of the Map window. For example, from the Properties tab, enter a description for the XMLMap file, and set the XMLMap syntax version to 1.1 with the Validation tab.

  2. Create a TABLE element by dragging an item from the XML window and dropping it on the SXLEMAP element in the Map window. For example, drag and drop ROW on SXLEMAP.

  3. To specify attributes for the TABLE element, select the item, then use the tabs at the top of the Map window. For example, from the Properties tab, enter a description. XML Atlas fills in the XPath location, which corresponds to the TABLE-PATH element.

    XML Atlas displaying XMLMap TABLE element

  4. Create a COLUMN element by dragging an item from the XML window and dropping it on the desired TABLE element in the Map window. For example, drag and drop Model on ROW.

    Tip: To create the columns, use the condensed display in the XML window by selecting the Condensed tab. The condensed display contains metadata that is not available in the full display. For example, in the full display, the length property, which is an estimate based on the length of a single instance of data, is useful for structured fields such as ID numbers, codes, and so on, but it is not helpful for free-form text. Using the full display tends to result in clipped text, whereas the condensed display will calculate the maximum length.

  5. To specify attributes for the column, select Model, then use the tabs. For example, from the Properties tab, enter a description. The other attributes are fine.

    XML Atlas displaying first XMLMap COLUMN element

  6. Create another COLUMN element by dragging and dropping Year on ROW. Then specify attributes for the column.

    XML Atlas displaying second XMLMap COLUMN element

  7. Save the XMLMap file by selecting File from the menu bar, Save Map As, and specifying a name for the XMLMap file.
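
The steps above would be expected to yield an XMLMap file along the following lines. This is a hand-written sketch, not actual XML Atlas output. The map name, the description, the TYPE, DATATYPE, and LENGTH child elements, and the length value are assumptions. The table name ROW and the paths follow from the sample document and the drag-and-drop steps.

```xml
<?xml version="1.0" ?>
<SXLEMAP version="1.1" name="Vehicles" description="sample XMLMap for the FORD rows">
  <!-- Step 2: dropping ROW on SXLEMAP creates the table -->
  <TABLE name="ROW">
    <!-- Step 3: observation boundary, one row per ROW instance -->
    <TABLE-PATH syntax="xpath"> /VEHICLES/FORD/ROW </TABLE-PATH>
    <!-- Steps 4 and 5: first column -->
    <COLUMN name="Model">
      <PATH syntax="xpath"> /VEHICLES/FORD/ROW/Model </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <!-- Assumed length; a value shorter than the longest Model would clip it -->
      <LENGTH>8</LENGTH>
    </COLUMN>
    <!-- Step 6: second column -->
    <COLUMN name="Year">
      <PATH syntax="xpath"> /VEHICLES/FORD/ROW/Year </PATH>
      <TYPE>numeric</TYPE>
      <DATATYPE>integer</DATATYPE>
    </COLUMN>
  </TABLE>
</SXLEMAP>
```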

Generate XMLMap File and SAS Code

In addition to generating the XMLMap file, XML Atlas generates basic FILENAME and LIBNAME statements in order to use an XMLMap file. XML Atlas also generates some sample usage statements for the DATASETS procedure and the CONTENTS procedure. The generated SAS code displays in the SAS code window at the bottom.

Tip: So that the generated SAS code includes the location of the XMLMap file, be sure to save the file first.

  1. Generate the XMLMap file and the associated SAS code by selecting View from the menu bar, then Update text files. The XMLMap file displays in the Map text window, and XML Atlas generates SAS code in the SAS code window.

  2. Save the generated SAS code by selecting File from the menu bar, Save SAS As, then specify a name for the SAS code.
Here is the generated SAS code:
/************************************************************
 *  Generated by XMLAtlas, v. 9.0.1
 ************************************************************/

/*
 *  ENVIRONMENT
 */
filename  path 'C:\Documents and Settings\sasdxw\My Documents\XML\path.xml';
filename  SXLEMAP 'c:\documents and settings\sasdxw\my documents\xml\test.map';
libname   path xml xmlmap=SXLEMAP access=READONLY;

/*
 *  CATALOG
 */

proc datasets lib=path; run;

/*
 *  SAMPLE USAGE
 */

proc contents data=path.ROW varnum; run;
proc print data=path.ROW; run;

Submit SAS Code

Open the generated SAS code in a SAS session and submit it.

Here are the results of the PRINT procedure:


                  The SAS System                                         2

                 Obs    Model      Year

                   1    Mustang    1965
                   2    Explore    1982
                   3    Taurus     1998
                   4    F150       2000

Note: The value Explorer prints as Explore. This is consistent with a Model column length that was estimated from a single instance of data, as described in the tip about the condensed display.


How Do I Get XML Atlas?

XML Atlas is available for installation from your SAS Installation Kit. XML Atlas is on the SAS Client-Side Components CD.

After XML Atlas is installed, an icon for the application should be available on your desktop. Double-click the XML Atlas icon to start the application.

XML Atlas has online help attached. From the menu bar, select Help, then Help Topics.


Where's the Version 9 Documentation for SXLE?

When SXLE was introduced in SAS Release 8.1, the documentation for using the engine was available only through the SAS System Help. Subsequently, two topics about using SXLE were provided from the Base Communities web site.

For Version 9, all information about invoking and using SXLE is available in one document. There are several ways that you can locate the latest version of the documentation for SXLE:


Frequently Asked Questions?

Here's a list of some frequently asked questions...

Is SXLE available on all hosts supported by SAS?

Yes, SXLE is available on Windows, UNIX, OpenVMS Alpha, and recently OS/390.

Note, however, that for Release 9.0, the preproduction XML Atlas is available only for Windows.

Is SXLE production?

Yes. For Release 9.0, the XMLMap facility, which was previously preproduction, is production as well.

XML Atlas, though, is preproduction for 9.0.

Is SXLE a DOM or SAX application?

Currently, SXLE uses a SAX (Simple API for XML) model. SAX does not provide a random access lookup to the document's contents; it scans the document sequentially and presents each item to the application only one time. In contrast, the Document Object Model (DOM) converts the document's contents into a node tree that can be traversed back and forth via the programming interface (API).

Why is specifying the observation boundary tricky?

The observation boundary determines the repeating element instances in the XML document, which translate into a collection of rows with a constant set of columns.

You determine the observation boundary by specifying a table location path that tells SXLE where in the XML document to locate and access specific elements in order to collect variables for the SAS data set.

Specifying the table location path can be tricky due to start-tag and end-tag pairing. The table location path determines which end tag causes SXLE to write the completed input buffer to the SAS data set. If you do not identify the appropriate observation boundary, the result could be concatenated data instead of separate observations, or an unexpected set of columns.

For examples, see Determining the Observation Boundary in Order to Avoid Concatenated Data and Determining the Observation Boundary in Order to Select the Best Columns.
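
As an illustration of the end-tag pairing described above, the sketch below defines two tables over the VEHICLES document shown earlier in this article. The two tables differ only in their table location path. The map name, table names, TYPE, DATATYPE, and LENGTH child elements, and length values are assumptions, and the outcomes in the comments are what the description above leads one to expect, not verified output.

```xml
<?xml version="1.0" ?>
<SXLEMAP version="1.1" name="Boundary" description="two candidate observation boundaries">
  <!-- Boundary at FORD: only the single FORD end tag writes the buffer,
       so the Model values from all four ROW instances would be expected
       to land in one observation. -->
  <TABLE name="FORD">
    <TABLE-PATH syntax="xpath"> /VEHICLES/FORD </TABLE-PATH>
    <COLUMN name="Model">
      <PATH syntax="xpath"> /VEHICLES/FORD/ROW/Model </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>40</LENGTH>
    </COLUMN>
  </TABLE>
  <!-- Boundary at ROW: each ROW end tag writes the buffer,
       giving one observation per ROW instance. -->
  <TABLE name="ROW">
    <TABLE-PATH syntax="xpath"> /VEHICLES/FORD/ROW </TABLE-PATH>
    <COLUMN name="Model">
      <PATH syntax="xpath"> /VEHICLES/FORD/ROW/Model </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>8</LENGTH>
    </COLUMN>
  </TABLE>
</SXLEMAP>
```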

For the XMLMap PATH element, what XPath forms are supported?

The PATH element specifies a location path that tells SXLE where in the XML document to locate and access a specific tag for the current variable, then perform a function as determined by the location path form (three forms are supported) in order to retrieve the value for the variable. The XPath forms that are supported allow elements and attributes to be individually selected for inclusion in the generated (rectangular) SAS data set.

To specify the PATH location path, use one of the following forms. These forms are the only XPath forms that SXLE supports. If you use any other valid W3C form, the results will be unpredictable.

element-form
accesses PCDATA (parsable character data) from the named element.
<PATH syntax="xpath"> /rss/channel/title </PATH>

The above example tells SXLE to scan the XML markup until it finds the specific title element. SXLE retrieves the value between the <title> start tag and the </title> end tag.

attribute-form
accesses data from the named attribute (of the form NAME="value").
<PATH syntax="xpath"> /rss@version </PATH>

The above example tells SXLE to scan the XML markup until it finds the specific RSS element. SXLE retrieves the value from the version= attribute in the RSS element.

value-form
accesses PCDATA from the named element with a specific attribute value.
<PATH syntax="xpath"> /constant[@name="PI"] </PATH>

If the XML contains the following, the above example tells SXLE to scan the XML markup until it finds the specific CONSTANT element where the value of the name= attribute is PI. SXLE would retrieve the value 3.14159.

<constant name="PI"> 3.14159 </constant>
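
For the element-form and attribute-form examples above, a small input document makes it easier to see what each path selects. The document below is hypothetical and its values are made up. The comments state what each form would be expected to retrieve according to the descriptions above.

```xml
<?xml version="1.0" ?>
<!-- Hypothetical input document for the RSS paths used in this article -->
<!-- attribute-form /rss@version would be expected to retrieve 0.91 -->
<rss version="0.91">
  <channel>
    <!-- element-form /rss/channel/title would be expected to retrieve Example News -->
    <title>Example News</title>
  </channel>
</rss>
```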

Your Turn

The developers, testers, and documenters that bring you SXLE are very interested in your feedback. You can send electronic mail to XMLEngine@sas.com with your comments.


Last Updated: 14 May 2004