SAS Institute. The Power to Know

SAS(R) 9.2 XML LIBNAME Engine: User's Guide

space
Previous Page | Next Page

Importing XML Documents

Importing an XML Document Created by Microsoft Access

This example imports the following XML document, which was created from a Microsoft Access database. Because the XML document contains an embedded XML schema, you must specify the MSACCESS markup type rather than the default GENERIC type. MSACCESS obtains a variable's attributes from the embedded schema.

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" 
  xmlns:od="urn:schemas-microsoft-com:officedata">
<xsd:schema>
<xsd:element name="dataroot">
<xsd:complexType>
<xsd:choice maxOccurs="unbounded">
<xsd:element ref="Suppliers"/>
</xsd:choice>
</xsd:complexType>
</xsd:element>
<xsd:element name="Suppliers">
<xsd:annotation>
<xsd:appinfo>
<od:index index-name="PrimaryKey" index-key="SupplierID " primary="yes" 
   unique="yes" clustered="no"/>
<od:index index-name="CompanyName" index-key="CompanyName " primary="no" 
   unique="no" clustered="no"/>
<od:index index-name="PostalCode" index-key="PostalCode " primary="no" 
   unique="no" clustered="no"/>
</xsd:appinfo>
</xsd:annotation>
<xsd:complexType>
<xsd:sequence>
<xsd:element name="SupplierID" od:jetType="autonumber" od:sqlSType="int" 
   od:autoUnique="yes" od:nonNullable="yes">
<xsd:simpleType>
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
</xsd:element>
<xsd:element name="CompanyName" minOccurs="0" od:jetType="text" 
   od:sqlSType="nvarchar">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:maxLength value="40"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="ContactName" minOccurs="0" od:jetType="text" 
   od:sqlSType="nvarchar">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:maxLength value="30"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
   .
   .
   .
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
<dataroot xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<Suppliers>
<SupplierID>1</SupplierID>
<CompanyName>Exotic Flowers</CompanyName>
<ContactName>Charlotte Smith</ContactName>
<ContactTitle>Purchasing Manager</ContactTitle>
<Address>49 Franklin St.</Address>
<City>London</City>
<PostalCode>EC1 4SD</PostalCode>
<Country>UK</Country>
<Phone>(272) 444-2222</Phone>
</Suppliers>
<Suppliers>
<SupplierID>2</SupplierID>
<CompanyName>New Orleans Cajun Foods</CompanyName>
<ContactName>Shelley Martin</ContactName>
<ContactTitle>Order Administrator</ContactTitle>
<Address>P.O. Box 78934</Address>
<City>New Orleans</City>
<Region>LA</Region>
<PostalCode>70117</PostalCode>
<Country>USA</Country>
<Phone>(512) 284-3677</Phone>
<HomePage>#MYCAJUN.HTM#</HomePage>
</Suppliers>
   .
   .
   .
</dataroot>
</root>

The following SAS program interprets the XML document as a SAS data set:

libname access xml '/u/myid/myfiles/suppliers.xml' xmltype=msaccess 1 
   xmlmeta=schemadata; 1 

proc print data=access.suppliers (obs=2); 2 
   var companyname contactname;
run;

  1. The LIBNAME statement assigns the libref ACCESS to the physical location of the XML document (complete pathname, filename, and file extension), and specifies the XML engine. By default, the XML engine expects GENERIC markup, so you must include the XMLTYPE= option in order to read the XML document in MSACCESS markup and to obtain a variable's attributes from the embedded schema. The option XMLMETA=SCHEMADATA specifies to import both data and metadata-related information from the input XML document.

  2. The PRINT procedure produces the output. The procedures uses the OBS= data set option to print only the first two observations and the VAR statement to print only specific variables (columns).

PROC PRINT Output for ACCESS.SUPPLIERS

                            The SAS System                                    1

Obs    COMPANYNAME                               CONTACTNAME

  1    Exotic Flowers                            Charlotte Smith
  2    New Orleans Cajun Foods                   Shelley Martin

Using PROC CONTENTS, the output displays the file's attributes as well as the attributes of each interpreted column (variable), such as the variable's type and length, which are obtained from the embedded XML schema. Without the embedded XML schema, the results for the attributes would be default values.

proc contents data=access.suppliers;
run;

PROC CONTENTS Output for ACCESS.SUPPLIERS

  
                               The SAS System                                       2

                           The CONTENTS Procedure

      Data Set Name        ACCESS.SUPPLIERS    Observations          .
      Member Type          DATA                Variables             12
      Engine               XML                 Indexes               0
      Created              .                   Observation Length    0
      Last Modified        .                   Deleted Observations  0
      Protection                               Compressed            NO
      Data Set Type                            Sorted                NO
      Label
      Data Representation  Default
      Encoding             Default


                 Alphabetic List of Variables and Attributes

    #    Variable        Type    Len    Format    Informat    Label

    5    Address         Char     60    $60.      $60.        Address
    6    City            Char     15    $15.      $15.        City
    2    CompanyName     Char     40    $40.      $40.        CompanyName
    3    ContactName     Char     30    $30.      $30.        ContactName
    4    ContactTitle    Char     30    $30.      $30.        ContactTitle
    9    Country         Char     15    $15.      $15.        Country
   11    Fax             Char     24    $24.      $24.        Fax
   12    HomePage        Char    256    $256.     $256.       HomePage
   10    Phone           Char     24    $24.      $24.        Phone
    8    PostalCode      Char     10    $10.      $10.        PostalCode
    7    Region          Char     15    $15.      $15.        Region
    1    SupplierID      Num       8    F8.       F8.         SupplierID

space
Previous Page | Next Page | Top of Page