XML, ODS, and the MARKUP Destination


Overview

This document provides a brief introduction to XML and discusses how XML can be integrated with SAS applications. XML is a tag-based language whose syntax is similar to that of HTML. Unlike HTML, however, XML is extensible: you can define your own tags and structural relationships. Reading or importing the underlying data from an HTML file is difficult because the data is interleaved with formatting tags. In contrast, XML tags describe content and thus help to organize the data for reading and writing. This makes XML a non-proprietary, machine-independent way to exchange data, like the Electronic Data Interchange (EDI) standard. The data can be easily moved from machine to machine, displayed as HTML, and transformed to other formats.

The future looks bright for XML. It is already heavily used by many software and database vendors and is a large part of Microsoft's new .NET strategy.

See also XML LIBNAME engine for detailed information about using XML with SAS.

Inside an XML File

The data in an XML document is stored in a treelike structure, which allows hierarchical relationships to be defined between data items. The document is composed of elements, which are sometimes referred to as nodes. Every XML document starts with a single, unique first element, which acts as the root node. By nesting elements inside others, you create relationships. The following example is an XML file created from the SASHELP.CLASS data set. The root node is <TABLE>, which occurs only once in the XML file. Here are two records from the file:

   <?xml version="1.0" encoding="windows-1252" ?>
   <TABLE>
      <CLASS>
         <Name>Alfred</Name>
         <Sex>M</Sex>
         <Age>14</Age>
         <Height>69</Height>
         <Weight>112.5</Weight>
      </CLASS>
      <CLASS>
         <Name>Alice</Name>
         <Sex>F</Sex>
         <Age>13</Age>
         <Height>56.5</Height>
         <Weight>84</Weight>
      </CLASS>
   </TABLE>
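
As a rough illustration of the tree structure, the following Python sketch parses two records in the same layout and walks from the root node down to each value. Python and its standard xml.etree.ElementTree module are used here only as a convenient XML reader outside SAS; the names XML_TEXT and read_records are illustrative, and the encoding declaration is omitted because the text is supplied as a string.

```python
import xml.etree.ElementTree as ET

# Two records in the same layout as the SASHELP.CLASS excerpt.
XML_TEXT = """
<TABLE>
  <CLASS>
    <Name>Alfred</Name>
    <Sex>M</Sex>
    <Age>14</Age>
    <Height>69</Height>
    <Weight>112.5</Weight>
  </CLASS>
  <CLASS>
    <Name>Alice</Name>
    <Sex>F</Sex>
    <Age>13</Age>
    <Height>56.5</Height>
    <Weight>84</Weight>
  </CLASS>
</TABLE>
"""


def read_records(xml_text):
    """Return the root tag name and one dict per child record."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root:  # each <CLASS> element nested under the root
        # Each nested element contributes one tag-name/value pair.
        records.append({child.tag: child.text for child in record})
    return root.tag, records


root_tag, records = read_records(XML_TEXT)
print(root_tag)
for rec in records:
    print(rec)
```

All values come back as text; converting Age or Weight to numbers is left to the reader of the file.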
 

Each element's tag name describes the value inside the tag. In the above example, the <Name> tag encloses the person's name, <Height> encloses the person's height, and so on. You can also place attributes within a tag to further identify the data. Attributes are useful in part because you can set default values for them.

   <Age label="age of student"> 19 </Age>
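
A minimal Python sketch of how a program might read that attribute alongside the element's value is shown here. The "units" attribute is hypothetical and is included only to show a reader-side fallback, which is not the same thing as a default declared in a DTD or schema.

```python
import xml.etree.ElementTree as ET

age = ET.fromstring('<Age label="age of student"> 19 </Age>')

# Attribute values are looked up by name on the element.
label = age.get("label")

# "units" is a hypothetical attribute that this element does not carry,
# so the fallback value supplied by the reader is returned instead.
units = age.get("units", "years")

# The element's text keeps its surrounding blanks until it is stripped.
value = int(age.text.strip())

print(label, value, units)
```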

XML element names are case sensitive, and every XML document must be well formed; the syntax rules are much stricter than those of HTML. Many XML authoring tools provide a syntax checker.
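
If no authoring tool is at hand, a parser can serve as a simple check. The following Python sketch assumes that a failed parse is an acceptable test of well-formedness; the function name is_well_formed is illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = "<Name>Alfred</Name>"
bad_case = "<Name>Alfred</name>"        # closing tag differs in case
unclosed = "<CLASS><Name>Alfred</Name>"  # <CLASS> is never closed

for sample in (good, bad_case, unclosed):
    print(sample, is_well_formed(sample))
```

The second sample would be tolerated by most HTML browsers, but an XML parser rejects it because <Name> and </name> are different tag names.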

Schemas, DTDs, and XSL

A schema is a way of self-documenting an XML file; it plays much the same role for an XML file that PROC CONTENTS output plays for a SAS data set. A document type definition (DTD) serves much the same purpose, but schemas are newer and are expected to eventually replace DTDs. Creating a schema is much like planning a database.

XSL stands for Extensible Stylesheet Language. XSL is the formatting language for XML, just as cascading style sheets (CSS) are the formatting language for HTML. A CSS can be used with XML files, but XSL is much more powerful. XSL is mainly used to transform an XML document into a format that can be displayed, such as HTML. In essence, XSL lets you create templates that specify the presentation of the XML document. Compared to CSS, XSL has the ability to

The following XSL style sheet could be used to transform an XML file created from the SASHELP.CLASS data set into an HTML file. The xsl:for-each element loops over every node that matches its select attribute (here, each CLASS record), and the xsl:value-of element writes the value of the child node that is named in its select attribute. This is probably the most widely used XSL construct. See the full XSL style sheet and view the XML file with the XSL style sheet applied. For more information about XSL, see the w3.org Web page that is referenced in the xsl:stylesheet tag below.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
  <html>
  <body>
    <p align="center">Transforming XML file using XSL</p>
    <p></p>
    <table border="2" align="center" bgcolor="#E0E0E0">
      <tr>
        <th>Name</th>
        <th>Age</th>
        <th>Sex</th>
        <th>Height</th>
        <th>Weight</th>
      </tr>
      <xsl:for-each select="TABLE/CLASS">
      <tr>
        <td><xsl:value-of select="Name"/></td>
        <td><xsl:value-of select="Age"/></td>
        <td><xsl:value-of select="Sex"/></td>
        <td><xsl:value-of select="Height"/></td>
        <td><xsl:value-of select="Weight"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
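
To make the loop-and-select logic concrete, here is a rough Python sketch that produces a similar HTML table. It is not an XSL processor (the Python standard library does not include one); it only mirrors the for-each and value-of steps with xml.etree.ElementTree. The names COLUMNS, SAMPLE, and class_table_html are illustrative.

```python
import xml.etree.ElementTree as ET
from html import escape

COLUMNS = ["Name", "Age", "Sex", "Height", "Weight"]

SAMPLE = """
<TABLE>
  <CLASS><Name>Alfred</Name><Sex>M</Sex><Age>14</Age>
         <Height>69</Height><Weight>112.5</Weight></CLASS>
  <CLASS><Name>Alice</Name><Sex>F</Sex><Age>13</Age>
         <Height>56.5</Height><Weight>84</Weight></CLASS>
</TABLE>
"""


def class_table_html(xml_text):
    """Build an HTML table with one row per <CLASS> record."""
    root = ET.fromstring(xml_text.strip())  # the root element is <TABLE>
    header = "".join(f"<th>{col}</th>" for col in COLUMNS)
    rows = [f"<tr>{header}</tr>"]
    # Plays the role of <xsl:for-each select="TABLE/CLASS">.
    for record in root.findall("CLASS"):
        # Plays the role of <xsl:value-of select="..."/> for each column.
        cells = "".join(
            f"<td>{escape(record.findtext(col, default=''))}</td>"
            for col in COLUMNS
        )
        rows.append(f"<tr>{cells}</tr>")
    return '<table border="2">' + "".join(rows) + "</table>"


table_html = class_table_html(SAMPLE)
print(table_html)
```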

How to Generate XML Files in SAS

XML has been fully integrated into the SAS System via the XML engine and the ODS XML and ODS MARKUP statements.

Using the XML Engine

The following code illustrates the XMLTYPE= option. It writes the SASHELP.CLASS data set to the file c:\temp.xml in the export format.

   libname newfile xml 'c:\temp.xml'
           xmltype=export;
   data newfile.test;
      set sashelp.class;
   run;

Here are values for some of the options. (See SAS documentation for full syntax and XML LIBNAME engine for updates.)

XMLTYPE=EXPORT
is the same as specifying XMLTYPE=OIMDBM and also is the most common format.

XMLTYPE=ORACLE
generates an XML file that is equivalent to the XML files that are created by Oracle.

XMLTYPE=HTML
generates an HTML file.

XMLTYPE=GENERIC
is the default, which generates a generic XML file.

XMLSCHEMA=YES
is the same as XMLTYPE=EXPORT but adds column headers.

TAGSET=
specifies a SAS tagset, or a user-defined tagset that was created with the TEMPLATE procedure. When you use the TAGSET= option, there is no need to specify the XMLTYPE= option.

Using the MARKUP Destination

The ODS MARKUP statement is experimental in Release 8.2 and production in Version 9. It is a very powerful vehicle for developing the type of output that you want. To select the desired destination, specify the TAGSET= option in the ODS MARKUP statement. Just as you have been using PROC TEMPLATE to create or modify style and table templates, beginning in Release 8.2 you can use PROC TEMPLATE to work with tagset templates.

See Using ODS to Export Output in a Markup Language or SAS documentation for syntax and usage.

How to Modify Existing Tagsets with the DEFINE EVENT Statement

One of the most powerful features of ODS MARKUP and the XML engine is the ability to create or modify tagsets by using PROC TEMPLATE. Before you try to modify the tagsets, familiarize yourself with the syntax, which is available in the SAS online documentation or the ODS MARKUP Resources.

Here is an example that uses PUTQ, which places quotation marks around all variable values. The following code produces the output BACKGROUND=" BLUE ":

   putq "background= " background;

Here are some examples that use the conditionals IF, WHEN, EXISTS, and ANY. Each of the following PUT statements writes the value of the variable FOREGROUND only if it has a value:

   put  foreground /  exists(foreground);
   putq "color= " foreground /  when exists(foreground);
   putl foreground /  if exists(foreground);

The following PUT statements test more than one variable. ANY is true if at least one of the listed variables has a value; EXISTS is true only if all of them have a value:

   put foreground  /  any(foreground,background);
   put "condition is true "  /  when any(foreground,background);
   put "condition is true " /  if exists(foreground,background);

The next examples use comparisons CMP and !CMP. The following writes the variable FOREGROUND if the variable has a value of BLUE:

   put foreground /  cmp(foreground,'blue');

The following writes the variable FOREGROUND with quotation marks only if the value is not RED:

   putq "color= " foreground /  !cmp(foreground,"red");

The following writes the variable FOREGROUND on a new line only if the variable has a value of BLUE:

   putl foreground /  cmp(foreground,"blue");

Triggers, Events, and States

A trigger fires off other events from within an event and returns to the event from where it was called. This is similar to the LINK statement within the DATA step. Inherent triggers are used without using the TRIGGER statement to call the next event. Below is an example of an event triggering another event.

Most events have a state, which is either start or finish. The start tells the event how to begin to process and the finish tells it how to end. A very few simple events are stateless.

DEFINE EVENT Example

The following event provides the opening and closing HTML tags for a frame file. The start section determines what happens when this event is triggered. The finish section determines what happens when the event is completed. HTMLDOCTYPE is written first, followed by three new lines, which are indicated by the three NLs. Next come the opening HTML tag and two comment lines. The closing HTML tag is written when the event finishes.

   define event top_file;
      start:
      put HTMLDOCTYPE NL NL NL;
      put "<html>" NL;
      put "<!-- Generated by SAS Software -->" NL;
      put "<!-- Http://www.sas.com -->" NL;
      finish:
      put "</html>" NL;
   end;

To follow the flow of events, use the Event_Map template. This helps you step through the tagset. See the SAS documentation for more information.

   ods markup file='temp.xml'  tagset=event_map;
   proc print data=sashelp.class;
   run;
   ods markup close;