Elements for Tables

Define the SAS data set.

Syntax

TABLE name="data-set-name"
TABLE-PATH syntax="type"
TABLE-END-PATH syntax="type" beginend="BEGIN | END"
TABLE-DESCRIPTION

Elements

TABLE name="data-set-name"
is an element that contains a data set definition. For example, <TABLE name="channel">.
name="data-set-name"
specifies the name for the SAS data set. The name must be unique in the XMLMap, and the name must be a valid SAS name, which can be up to 32 characters.
Requirement: The name= attribute is required.
Requirement: The TABLE element is required.
Interaction: The TABLE element can contain one or more of the following elements: TABLE-PATH, TABLE-END-PATH, TABLE-DESCRIPTION, and COLUMN.
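As a minimal sketch of how these child elements nest inside TABLE, the following XMLMap defines a CHANNEL data set for the RSS.XML document. The enclosing SXLEMAP element and its version value, the child elements of COLUMN (PATH, TYPE, DATATYPE, LENGTH), and the column name and length are assumptions for illustration; they are not defined in this section. The order of the child elements simply follows the Syntax listing.

```xml
<?xml version="1.0" ?>
<!-- Assumption: SXLEMAP is the enclosing root element; the version value is illustrative. -->
<SXLEMAP version="1.2">
   <TABLE name="channel">
      <!-- Required: location path that sets the observation boundary -->
      <TABLE-PATH syntax="XPath"> /rss/channel </TABLE-PATH>
      <!-- Optional: stop processing at the first item start tag -->
      <TABLE-END-PATH syntax="XPath" beginend="BEGIN"> /rss/channel/item </TABLE-END-PATH>
      <!-- Optional: data set description, up to 256 characters -->
      <TABLE-DESCRIPTION> Data Set contains TV channel information </TABLE-DESCRIPTION>
      <!-- Assumption: hypothetical column, shown only to indicate where COLUMN goes -->
      <COLUMN name="title">
         <PATH> /rss/channel/title </PATH>
         <TYPE>character</TYPE>
         <DATATYPE>string</DATATYPE>
         <LENGTH>200</LENGTH>
      </COLUMN>
   </TABLE>
</SXLEMAP>
```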
TABLE-PATH syntax="type"
specifies a location path that tells the XML engine where in the XML document to locate and access specific elements in order to collect variables for the SAS data set. The location path defines the repeating element instances in the XML document, which is the SAS data set observation boundary. The observation boundary is translated into a collection of rows with a constant set of columns.
For example, using the XML document RSS.XML, which is used in the example Using an XMLMap to Import an XML Document as Multiple SAS Data Sets, this TABLE-PATH element causes the following to occur:
<TABLE-PATH syntax="XPath"> /rss/channel/item </TABLE-PATH>
  1. The XML engine reads the XML markup until it encounters the <item> start tag.
  2. The XML engine clears the input buffer, sets the contents to MISSING (by default), and scans elements for variable names based on the COLUMN element definitions. As values are encountered, they are read into the input buffer. (Whether the XML engine resets values to MISSING is determined by the DEFAULT element as well as the COLUMN element retain= attribute.)
  3. When the </item> end tag is encountered, the XML engine writes the completed input buffer to the SAS data set as a SAS observation.
  4. The process is repeated for each <item> start-tag and </item> end-tag sequence until the end-of-file is encountered in the input stream or until the TABLE-END-PATH location (if specified) is reached. For RSS.XML, this results in six observations.
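The steps above can be pictured with a TABLE definition along the following lines. This is a sketch rather than the map from the referenced example: the data set name, the column names and lengths, and the child elements of COLUMN are hypothetical, and the fragment assumes that it sits inside the XMLMap root element. The retain="YES" column is included only to illustrate the retain= attribute mentioned in step 2; a column without it is reset to MISSING for each new observation.

```xml
<!-- Fragment only: belongs inside the XMLMap root element, which is omitted here. -->
<TABLE name="items">
   <!-- Each item start-tag and end-tag pair produces one observation -->
   <TABLE-PATH syntax="XPath"> /rss/channel/item </TABLE-PATH>
   <!-- Assumption: retain="YES" carries this value forward instead of resetting it to MISSING -->
   <COLUMN name="channel_title" retain="YES">
      <PATH> /rss/channel/title </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>200</LENGTH>
   </COLUMN>
   <!-- Read into the input buffer between the item start tag and the item end tag -->
   <COLUMN name="title">
      <PATH> /rss/channel/item/title </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>200</LENGTH>
   </COLUMN>
   <COLUMN name="link">
      <PATH> /rss/channel/item/link </PATH>
      <TYPE>character</TYPE>
      <DATATYPE>string</DATATYPE>
      <LENGTH>200</LENGTH>
   </COLUMN>
</TABLE>
```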
syntax="type"
is an optional attribute that specifies the type of syntax in the location path. The syntax is valid XPath construction in compliance with the W3C specifications. For example, syntax="XPath".
Default: XPath
Requirements: The value must be XPath or XPathENR.

If an XML namespace is defined with the NAMESPACES element, you must specify the type of syntax as XPathENR (XPath with Embedded Namespace Reference). This is because the syntax is different from the XPath specification. For example, syntax="XPathENR".

CAUTION:
Specifying the table location path, which is the observation boundary, can be tricky due to start-tag and end-tag pairing.
The table location path determines which end tag causes the XML engine to write the completed input buffer to the SAS data set. If you do not identify the appropriate end tag, the result could be concatenated data instead of separate observations, or an unexpected set of columns. For examples, see Determining the Observation Boundary to Avoid Concatenated Data and Determining the Observation Boundary to Select the Best Columns.
Requirements: The TABLE-PATH element is required.

If an XML namespace is defined with the NAMESPACES element, you must include the identification number in the location path preceding the element that is being defined. The identification number is enclosed in braces. For example, <TABLE-PATH syntax="XPathENR">/Table/{1}Hurricane</TABLE-PATH>.

The XPath construction is a formal specification that describes the location of each element of the XML structure with a path, similar to a UNIX file path. XPath syntax is case sensitive. For example, if an element tag name is uppercase, it must be uppercase in the location path. If it is lowercase, it must be lowercase. All location paths must begin with the root-enclosing element (denoted by a slash '/') or with the "any parent" variant (denoted by double slashes '//'). Other W3C documented forms are not currently supported.
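To show where the namespace identification number comes from, the following sketch pairs a NAMESPACES definition with the XPathENR location path used above. The attributes of NAMESPACES and NS (count, id, prefix), the prefix, the URI, and the SXLEMAP version value are assumptions for illustration and are not defined in this section; see the NAMESPACES element documentation for the authoritative syntax.

```xml
<?xml version="1.0" ?>
<!-- Assumptions: the version value, the NAMESPACES and NS attributes, the prefix, and the URI are illustrative. -->
<SXLEMAP version="2.1">
   <NAMESPACES count="1">
      <NS id="1" prefix="freq">http://www.example.com/hurricanefrequency</NS>
   </NAMESPACES>
   <TABLE name="Hurricane">
      <!-- {1} is the identification number of the namespace that qualifies Hurricane. -->
      <!-- The path is case sensitive: Table and Hurricane must match the tags in the XML document. -->
      <TABLE-PATH syntax="XPathENR">/Table/{1}Hurricane</TABLE-PATH>
      <!-- COLUMN definitions omitted -->
   </TABLE>
</SXLEMAP>
```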

TABLE-END-PATH syntax="type" beginend="BEGIN | END"
is an optional optimization element that saves resources by stopping the processing of the XML document before the end of the file. The location path tells the XML engine where in the XML document to locate and access a specific element in order to stop processing the XML document.
For example, using the XML document RSS.XML, which is used in the example Using an XMLMap to Import an XML Document as Multiple SAS Data Sets, there is only one <channel> start tag and one </channel> end tag. With the TABLE-PATH location path <TABLE-PATH syntax="XPath"> /rss/channel </TABLE-PATH>, the XML engine would process the entire XML document, even though it does not store new data in the input buffer after it encounters the first <item> start tag, because the remaining elements no longer qualify. The TABLE-END-PATH location path <TABLE-END-PATH syntax="XPath" beginend="BEGIN"> /rss/channel/item </TABLE-END-PATH> tells the XML engine to stop processing when the <item> start tag is encountered.
Therefore, with the two location path specifications, the XML engine processes only the portion of the RSS.XML document from the beginning through the first <item> start tag for the CHANNEL data set, rather than the entire XML document:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="0.91">
   <channel>
      <title>WriteTheWeb</title>
      <link>http://writetheweb.com</link>
      <description>News for web users that write back
         </description>
      <language>en-us</language>
      <copyright>Copyright 2000, WriteTheWeb team.
         </copyright>
      <managingEditor>editor@writetheweb.com
         </managingEditor>
      <webMaster>webmaster@writetheweb.com</webMaster>
      <image>
         <title>WriteTheWeb</title>
         <url>http://writetheweb.com/images/mynetscape88.gif
            </url>
         <link>http://writetheweb.com</link>
         <width>88</width>
         <height>31</height>
         <description>News for web users that write back
            </description>
         </image>
      <item>
         <title>Giving the world a pluggable Gnutella</title>
         <link>http://writetheweb.com/read.php?item=24</link>
         <description>WorldOS is a framework on which to build programs
            that work like Freenet or Gnutella-allowing distributed
            applications using peer-to-peer routing.</description>
      </item>
      <item>
         .
         .
         .
   </channel>
</rss>
syntax="type"
is an optional attribute that specifies the type of syntax in the location path. The syntax is valid XPath construction in compliance with the W3C specifications. The XPath form supported by the XML engine allows elements and attributes to be individually selected for exclusion in the generated SAS data set. For example, syntax="XPath".
Default: XPath
Requirements: The value must be XPath or XPathENR.

If an XML namespace is defined with the NAMESPACES element, you must specify the type of syntax as XPathENR (XPath with Embedded Namespace Reference). This is because the syntax is different from the XPath specification. For example, syntax="XPathENR".

beginend="BEGIN | END"
is an optional attribute that specifies to stop processing when either the element start tag is encountered or the element end tag is encountered.
Default: BEGIN
Default: If the TABLE-END-PATH element is not specified, processing continues until the last end tag in the XML document.
Requirements: If an XML namespace is defined with the NAMESPACES element, you must include the identification number in the location path preceding the element that is being defined. The identification number is enclosed in braces. For example, <TABLE-END-PATH syntax="XPathENR">/Table/{1}Hurricane</TABLE-END-PATH>.

The XPath construction is a formal specification that describes the location of each element of the XML structure with a path, similar to a UNIX file path. XPath syntax is case sensitive. For example, if an element tag name is uppercase, it must be uppercase in the location path. If it is lowercase, it must be lowercase. All location paths must begin with the root-enclosing element (denoted by a slash '/') or with the "any parent" variant (denoted by double slashes '//'). Other W3C documented forms are not currently supported.

Interaction: The TABLE-END-PATH element does not affect the observation boundary; that is determined with the TABLE-PATH element.
Tip: Specifying a location to stop processing is useful for an XML document that is hierarchical, but generally not appropriate for repeating instance data.
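As a contrast to the beginend="BEGIN" example, the following sketch stops processing at an end tag instead. It assumes that, as in the RSS.XML excerpt shown above, every channel-level element of interest appears before the </image> end tag, so that stopping there would still collect the same CHANNEL values. The data set name is hypothetical, and the XMLMap root element and the COLUMN definitions are left out.

```xml
<!-- Fragment only: the XMLMap root element and COLUMN definitions are omitted. -->
<TABLE name="channel">
   <!-- The observation boundary is still set by TABLE-PATH alone -->
   <TABLE-PATH syntax="XPath"> /rss/channel </TABLE-PATH>
   <!-- beginend="END": stop when the image end tag is encountered, not its start tag -->
   <TABLE-END-PATH syntax="XPath" beginend="END"> /rss/channel/image </TABLE-END-PATH>
</TABLE>
```

With beginend="BEGIN" on the same path, processing would stop at the <image> start tag instead.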
TABLE-DESCRIPTION
is an optional element that specifies a description for the SAS data set, which can be up to 256 characters. For example, <TABLE-DESCRIPTION> Data Set contains TV channel information </TABLE-DESCRIPTION>.