| Getting Started with the XML Engine |
| Is the XML Engine a DOM or SAX Application? |
Currently, the XML engine can be either a DOM application or a SAX application, depending on what you are doing:
If the format type is either GENERIC (the default) or ORACLE, the XML engine uses a modified Document Object Model (DOM), which converts the document's contents into a node tree. However, for the XML engine, the node tree cannot be queried (traversed).
If you are using an XMLMap to import an XML document, the XML engine uses a Simple API for XML (SAX) model. SAX does not provide a random access lookup to the document's contents; it scans the document sequentially and presents each item to the application only one time.
Note: If you run into resource problems when importing a large XML document with the format type GENERIC or ORACLE, switch to an XMLMap, which uses the SAX model.
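The difference between the two models can be sketched with Python's standard library. This is only a conceptual illustration, not the XML engine's implementation: the sample document, the function names, and the handler class are assumptions made for the example. Note also that `minidom` lets you traverse its tree, which the XML engine's modified DOM does not.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, modeled on the CLASS data used later on this page.
XML_TEXT = (
    "<TABLE>"
    "<CLASS><Name>Alfred</Name><Age>14</Age></CLASS>"
    "<CLASS><Name>Alice</Name><Age>13</Age></CLASS>"
    "</TABLE>"
)


def names_via_dom(text):
    """DOM style: build the whole node tree in memory, then look items up in it."""
    doc = minidom.parseString(text)
    return [node.firstChild.data for node in doc.getElementsByTagName("Name")]


class NameHandler(xml.sax.ContentHandler):
    """SAX style: each item is presented once, in document order."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, tag, attrs):
        if tag == "Name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text can arrive in several chunks, so collect it until the end tag.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "Name":
            self.names.append("".join(self._buffer))
            self._in_name = False


def names_via_sax(text):
    """Scan the document sequentially; only the collected names are kept in memory."""
    handler = NameHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names
```

Both functions return the same names, but the SAX version never holds the full document tree, which is why a sequential model tends to need fewer resources for large documents.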
| Does the XML Engine Validate an XML Document? |
The XML engine does not validate an input XML document. The engine assumes that the data passed to it is in valid, well-formed XML format. Because the engine does not use a DTD (Document Type Definition) or SCHEMA, there is nothing to validate against.
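To illustrate the distinction between being well-formed and being validated, here is a minimal Python sketch using a non-validating parser from the standard library. It is not how the XML engine works internally; the sample strings and the `is_well_formed` helper are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML; no DTD or schema is consulted."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Well-formed, but nothing checks whether <Unexpected> belongs in the document.
WELL_FORMED_BUT_UNCHECKED = "<TABLE><CLASS><Unexpected>1</Unexpected></CLASS></TABLE>"

# Not well-formed: <Name> is never closed before </CLASS>.
NOT_WELL_FORMED = "<TABLE><CLASS><Name>Alfred</CLASS></TABLE>"
```

A non-validating parser accepts the first string even though no DTD or schema says what the document should contain, and rejects only the second, which breaks the basic XML syntax rules.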
| What Is the Difference between Using the XML Engine and the ODS MARKUP Destination? |
Typically, you use the XML engine to transport data, while the ODS MARKUP destination is used to create XML from SAS output. The XML engine creates and reads XML documents; ODS MARKUP creates but does not read XML documents.
| Why Do I Get Errors When Importing XML Documents Not Created with SAS? |
The XML engine reads only files that conform to the format types supported in the XMLTYPE= engine option. Attempting to import free-form XML documents that do not conform to the specifications required by the supported format types will generate errors. To successfully import files that do not conform to the XMLTYPE= format types, you can create a separate XML document, called an XMLMap. The XMLMap syntax tells the XML engine how to interpret the XML markup into SAS data set(s), variables (columns), and observations (rows).
An exception is the HTML format type, which is supported only for export.
See Importing XML Documents, Importing XML Documents Using an XMLMap, LIBNAME Statement Syntax, and Creating an XMLMap.
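The idea behind an XMLMap, telling the reader which markup becomes rows and which becomes columns, can be sketched in Python. This is a conceptual analogy only and does not use XMLMap syntax: the sample document, the `ROW_PATH` and `COLUMN_PATHS` names, and the `rows_from_map` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical free-form document that matches no predefined format type.
FREE_FORM = """<library>
  <shelf id="A">
    <book><title>Dune</title><year>1965</year></book>
    <book><title>Emma</title><year>1815</year></book>
  </shelf>
</library>"""

# Hypothetical mapping: which element is one observation (row),
# and where each variable (column) is found relative to that element.
ROW_PATH = "./shelf/book"
COLUMN_PATHS = {"Title": "title", "Year": "year"}


def rows_from_map(text, row_path, column_paths):
    """Interpret the markup as a list of rows, one dict of columns per row."""
    root = ET.fromstring(text)
    rows = []
    for element in root.findall(row_path):
        rows.append(
            {column: element.findtext(path) for column, path in column_paths.items()}
        )
    return rows
```

Calling `rows_from_map(FREE_FORM, ROW_PATH, COLUMN_PATHS)` yields one dictionary per `book` element. Without such a mapping, a reader has no way of knowing which elements in a free-form document correspond to rows and columns.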
| Can I Use SAS Data Set Options with the XML Engine? |
Use SAS data set options with caution.
Note that although the LABEL= data set option no longer produces a warning message in the SAS log, the XML engine does not persist the label information.
| Why Does an Exported XML Document Include White Space? |
The XML engine follows the World Wide Web Consortium (W3C) specifications for handling white space, which state that it is often convenient to use white space (spaces, tabs, and blank lines) to set apart the markup for greater readability. An XML processor must always pass all characters in a document that are not markup through to the application. A validating XML processor must also inform the application which of these characters constitute white space appearing in element content.
When exporting an XML document, the XML engine adds a space (padding) before and after the content of each output XML element. Here is an example of an exported XML document that shows the white space.
<?xml version="1.0" encoding="windows-1252" ?>
<TABLE>
  <CLASS>
    <Name> Alfred </Name>
    <Sex> M </Sex>
    <Age> 14 </Age>
    <Height> 69 </Height>
    <Weight> 112.5 </Weight>
  </CLASS>
The XML engine does not produce the special attribute xml:space for data elements but assumes default processing, which is to ignore leading and trailing white space.
You can remove the white space by specifying the SAS tagset TAGSETS.SASXMNSP. See Using a SAS Tagset to Remove White Spaces in Output XML Markup for an example.
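For a consumer outside SAS, the padding described above means that element content arrives with leading and trailing spaces unless the application strips them. The following Python sketch shows one way to do that; the `PADDED` sample and the `read_values` function are assumptions for the example, and this is not a description of how SAS itself reads the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with the same padding as the exported example above.
PADDED = "<TABLE><CLASS><Name> Alfred </Name><Age> 14 </Age></CLASS></TABLE>"


def read_values(text, strip=True):
    """Read the first CLASS record, optionally ignoring leading and trailing white space."""
    record = ET.fromstring(text).find("CLASS")
    return {
        child.tag: (child.text.strip() if strip else child.text)
        for child in record
    }
```

With `strip=False` the values keep their padding (for example `" Alfred "`), because the parser passes all non-markup characters through to the application. With `strip=True` the application applies the default treatment and ignores the leading and trailing white space.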
Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.