Importing XML Documents |
This example imports the following XML document, which was created from a Microsoft Access database. Because the XML document contains an embedded XML schema, you must specify the MSACCESS markup type rather than the default GENERIC type. MSACCESS obtains a variable's attributes from the embedded schema.
<?xml version="1.0" encoding="windows-1252" ?> <root xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:od="urn:schemas-microsoft-com:officedata"> <xs:schema> <xs:element name="dataroot"> <xs:complexType> <xs:sequence> <xs:element ref="SUPPLIERS" minOccurs="0" maxOccurs="unbounded" /> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="SUPPLIERS"> <xs:complexType> <xs:sequence> <xs:element name="HOMEPAGE" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="94" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="FAX" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="15" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="PHONE" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="15" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="COUNTRY" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="11" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="POSTALCODE" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="8" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="REGION" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="8" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="CITY" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="13" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="ADDRESS" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="45" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="CONTACTTITLE" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="28" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="CONTACTNAME" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="26" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="COMPANYNAME" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="38" /> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="SUPPLIERID" minOccurs="0" od:jetType="double" od:sqlSType="double" type="xs:double" /> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> <dataroot> <SUPPLIERS> <HOMEPAGE/> <FAX/> <PHONE>(272) 444-2222</PHONE> <COUNTRY>UK</COUNTRY> <POSTALCODE>EC1 4SD</POSTALCODE> <REGION/> <CITY>London</CITY> <ADDRESS>49 Franklin St.</ADDRESS> <CONTACTTITLE>Purchasing Manager</CONTACTTITLE> <CONTACTNAME>Charlotte Smith</CONTACTNAME> <COMPANYNAME>Exotic Flowers</COMPANYNAME> <SUPPLIERID>1</SUPPLIERID> </SUPPLIERS> <SUPPLIERS> <HOMEPAGE>#MYCAJUN.HTM#</HOMEPAGE> <FAX/> <PHONE>(512) 284-3677</PHONE> <COUNTRY>USA</COUNTRY> <POSTALCODE>70117</POSTALCODE> <REGION>LA</REGION> <CITY>New Orleans</CITY> <ADDRESS>P.O. Box 78934</ADDRESS> <CONTACTTITLE>Order Administrator</CONTACTTITLE> <CONTACTNAME>Shelley Martin</CONTACTNAME> <COMPANYNAME>New Orleans Cajun Foods</COMPANYNAME> <SUPPLIERID>2</SUPPLIERID> </SUPPLIERS> . . . </dataroot> </root>
The following SAS program interprets the XML document as a SAS data set:
libname access xml '/u/myid/myfiles/suppliers.xml' xmltype=msaccess 1 xmlmeta=schemadata; 1 proc print data=access.suppliers (obs=2); 2 var contactname companyname; run;
The LIBNAME statement assigns the libref ACCESS to the physical location of the XML document (complete pathname, filename, and file extension) and specifies the XML engine. By default, the XML engine expects GENERIC markup, so you must include the XMLTYPE= option in order to read the XML document in MSACCESS markup and to obtain a variable's attributes from the embedded schema. The option XMLMETA=SCHEMADATA specifies to import both data and metadata-related information from the input XML document.
The PRINT procedure produces the output. The procedure uses the OBS= data set option to print only the first two observations and the VAR statement to print only specific variables (columns).
PROC PRINT Output for ACCESS.SUPPLIERS
The SAS System 1 Obs CONTACTNAME COMPANYNAME 1 Charlotte Smith Exotic Flowers 2 Shelley Martin New Orleans Cajun Foods
Using PROC CONTENTS, the output displays the file's attributes as well as the attributes of each interpreted column (variable), such as the variable's type and length, which are obtained from the embedded XML schema. Without the embedded XML schema, the results for the attributes would be default values.
proc contents data=access.suppliers; run;
PROC CONTENTS Output for ACCESS.SUPPLIERS
The SAS System 2 The CONTENTS Procedure Data Set Name ACCESS.SUPPLIERS Observations . Member Type DATA Variables 12 Engine XML Indexes 0 Created . Observation Length 0 Last Modified . Deleted Observations 0 Protection Compressed NO Data Set Type Sorted NO Label Data Representation Default Encoding Default Alphabetic List of Variables and Attributes # Variable Type Len Format Informat Label 8 ADDRESS Char 45 $45. $45. ADDRESS 7 CITY Char 13 $13. $13. CITY 11 COMPANYNAME Char 38 $38. $38. COMPANYNAME 10 CONTACTNAME Char 26 $26. $26. CONTACTNAME 9 CONTACTTITLE Char 28 $28. $28. CONTACTTITLE 4 COUNTRY Char 11 $11. $11. COUNTRY 2 FAX Char 15 $15. $15. FAX 1 HOMEPAGE Char 94 $94. $94. HOMEPAGE 3 PHONE Char 15 $15. $15. PHONE 5 POSTALCODE Char 8 $8. $8. POSTALCODE 6 REGION Char 8 $8. $8. REGION 12 SUPPLIERID Num 8 F8. F8. SUPPLIERID
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.