Whether XML is well formed is determined
by structure, not content. Therefore, although the XML engine can
assume that the XML document is valid, well-formed XML, the engine
cannot assume that the root element encloses only instances of a single
node element (that is, only a single data set). The XML engine
must instead account for the possibility of multiple nodes (that
is, multiple SAS data sets).
For example, when the
following correctly structured XML document is imported, it is recognized
as containing two SAS data sets: HIGHTEMP and LOWTEMP.
<?xml version="1.0" encoding="windows-1252" ?>
<CLIMATE>                             <!-- 1 -->
   <HIGHTEMP>                         <!-- 2 -->
      <PLACE> Libya </PLACE>
      <DATE> 1922-09-13 </DATE>
      <DEGREE-F> 136 </DEGREE-F>
      <DEGREE-C> 58 </DEGREE-C>
   </HIGHTEMP>
   .
   . more instances of <HIGHTEMP>
   .
   <LOWTEMP>                          <!-- 3 -->
      <PLACE> Antarctica </PLACE>
      <DATE> 1983-07-21 </DATE>
      <DEGREE-F> -129 </DEGREE-F>
      <DEGREE-C> -89 </DEGREE-C>
   </LOWTEMP>
   .
   . more instances of <LOWTEMP>
   .
</CLIMATE>
When the previous XML document is imported, the following
happens:
1  The XML engine recognizes the first instance tag <CLIMATE> as the
   root-enclosing element, which is the container for the document.
2  Starting with the second-level instance tag, which is <HIGHTEMP>, the
   XML engine uses the repeating element instances as a collection of
   rows with a constant set of columns.
3  When the second-level instance tag changes, the XML engine interprets
   that change as a different SAS data set.
The result is two SAS
data sets: HIGHTEMP and LOWTEMP. Both happen to have the same variables
but different data.
To verify that an import
result is what you expect, use the DATASETS procedure to list the data
sets in the library. For example, these SAS statements produce the
following output:
libname climate xml 'C:\My Documents\climate.xml';
proc datasets library=climate;
quit;
DATASETS Procedure Output for CLIMATE Library
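
Beyond listing the library members, you can inspect each imported data set individually. The following is a minimal sketch, not part of the original example. It assumes the same CLIMATE libref and file path shown above, and it assumes that the engine exposes the two data sets under the names HIGHTEMP and LOWTEMP, as described earlier. It refers to no variable names, because the names that the engine assigns to elements such as <DEGREE-F> are not shown here.

```sas
/* Assumption: same libref and file path as in the example above */
libname climate xml 'C:\My Documents\climate.xml';

/* Show the variables that the engine derived for one data set */
proc contents data=climate.hightemp;
run;

/* Print the rows of each data set to confirm the imported values */
proc print data=climate.hightemp;
run;

proc print data=climate.lowtemp;
run;
```

If the PROC CONTENTS output lists variables that you did not expect, the second-level elements in the XML document probably do not all share the same set of nested elements.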