The ODS Document is a format for ODS output. Features of this format include:
The purpose of the ODS document is to enable a SAS user to rerender ODS output without rerunning procedures, and to give the user more control over the structure of the output than is available in previous versions of SAS.
The ODS document is preproduction in Version 8.2 and production in SAS 9.0.
The beauty of ODS is that the user can easily generate SAS output in a variety of formats. This can happen in parallel -- you just specify as many formats as you want when you run your procedures.
You may need to rerun the procedures one or more times. If you're developing a corporate style for SAS reports to be published to the Web, you'll be experimenting with colors and fonts, and thus rerunning procedures. Or perhaps you didn't anticipate that old-guard members of your organization would continue to demand SAS listing output instead of HTML, so you have to rerun your job in "legacy mode".
Rerunning procedures simply to reformat their output is expensive in terms of computation and/or I/O. Data sets have to be rescanned, analyses have to be reapplied, and so on. This assumes that it's even possible to rerun the job. If your job uses transient data, then you can't leverage ODS to reformat your output. You're stuck with what you've already generated.
This is where the ODS document comes in. The user can tell ODS to create an ODS document when generating procedure output. That output is stored in raw form in the named document, which is managed as a SAS library member. Subsequently the user can browse, edit and rerender (replay) output contained in the document using server-side applications provided especially for that purpose.
The document persists in the SAS system until it or the library containing it is deleted. This means that a document created in the WORK library persists no longer than the SAS session that created it. A document created in the SASUSER library might persist for days, weeks or even forever, in which case it could be considered a permanent archive of procedure output.
The format of the output stored in a document is neutral with respect to any other supported ODS formats. This ensures a measure of forward compatibility with any as-yet-uninvented formats that ODS may support in the future. Users wanting to hedge their bets on the Next Big Standard Format will have that option with the ODS document.
In Version 6, procedures produce formatted output. In Versions 7 and following, procedures produce output objects. Output objects (tables, graphs, notes, equations) are handed over to ODS for rendering into the formats selected by the user. Each procedure organizes the output objects into a hierarchy. This output hierarchy is presented to users in two ways:
Currently ODS provides the user a great deal of control over the structure of individual output objects. However, the order in which output objects are formatted, and the grouping of output objects into hierarchies, are fixed by the procedure that generates them. Users don't like this inflexibility. They want to customize the arrangement of output objects.
The ODS document improves upon this situation. Users can create their own output hierarchies or modify existing ones. Output objects contained in a hierarchy can be ordered any way the user sees fit. It's even possible in some cases to create a customized hierarchy and ordering apart from any actual output objects.
Does the ODS document capture output perfectly? Does procedure output saved at one time always replay exactly that way later?
Not necessarily. Various factors influencing output fall outside of the document's domain. Among them are:
The document guarantees that all procedure data will be "memorized". There is no provision made for how that data will be presented. This is as it should be, since control of the presentation is the user's prerogative, not ODS'.
Storing the data in its entirety allows you to have your cake and eat it too. Even though your template definition may cause columns in a rendered table to be deleted or merged, you still have all of the data at your disposal.
Note also that the SAS Log is not stored in the document.
The ODS document is a type of SAS library member known as an item store. Various SAS resources are delivered as item stores, including ODS templates and the SAS Registry. Item stores have the extension sas7bitm (e.g. mydoc.sas7bitm) on most platforms.
The item store file format permits client applications to define a "hierarchical file system within a file". It supports arbitrarily nested subdirectories, as well as arbitrarily sized binary files called items. Items are stored in compressed form, and are automatically uncompressed when read.
An item store never shrinks in size, even when its contents are deleted. However, any space freed by deletion will be reused if and when new entries are added. If you need to reclaim space after deleting from a document, you can use PROC COPY to copy the document and delete the original. Alternatively, you can use PROC DOCUMENT or the Documents window to copy the document.
An ODS document is a hierarchy of output objects. The hierarchy is realized as a directory, which may contain named entries. An entry is one of:
Entry names must be alphanumeric (underscores allowed also), have no more than 32 characters, and begin with an alphabetic character. The case of entry names is preserved, meaning that mixed case is used for displaying entry names, but case is not significant when inputting entry names. (Note that this is the behavior of the Windows filesystem. The Unix filesystem differs in that input names are case-sensitive.)
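The naming rules above can be sketched as a small check. This is a Python illustration only, not SAS code: the function names `is_valid_entry_name` and `same_entry_name` are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits. The first
# character must be a letter; underscores are allowed after it; the total
# length is at most 32 characters.
_ENTRY_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")


def is_valid_entry_name(name: str) -> bool:
    """Return True if name satisfies the entry-name rules described above."""
    return _ENTRY_NAME.fullmatch(name) is not None


def same_entry_name(a: str, b: str) -> bool:
    """Match names the way input is matched: case is not significant."""
    return a.lower() == b.lower()
```

Under these assumptions, `ErrorSSCP` and `my_table_2` pass the check, `2ndTable` and `my-table` do not, and `NObs` matches `nobs` on input while keeping its mixed case for display.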
The root directory is the top-level directory in a document. It is the only directory that is not contained within some other directory. Also, it is the only entry without a name.
Entry names are not required to be unique within a directory. However, entries are uniquely identifiable thanks to sequence numbers. Every entry in a document except for the root directory has a sequence number, which is a positive integer that is unique with respect to the name of the entry within the containing directory. Entries are assigned sequence numbers in accordance with the sequence in which they are added to a directory. The first entry named myname is assigned 1, the second is assigned 2, and so on. Sequence numbers are never reassigned, unless all entries of a given name are deleted, in which case the sequence "resets" with an initial number of 1.
The sequence number may be omitted when specifying an entry, in which case the most recently created entry having the name is selected. This defaulting works only if the most recently created entry still exists, i.e. it hasn't been deleted.
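The sequence-number rules can be modeled with a few lines of Python. This is a sketch of the behavior as described here, not the document's actual implementation; the class `Directory` and its methods are hypothetical names, and case-insensitive name matching is carried over from the naming rules.

```python
class Directory:
    """Toy model of per-name sequence numbers within one directory."""

    def __init__(self):
        self._next_seq = {}  # lowercased name -> next sequence number to assign
        self._live = {}      # lowercased name -> set of sequence numbers in use

    def add(self, name):
        """Add an entry and return the sequence number assigned to it."""
        key = name.lower()
        seq = self._next_seq.get(key, 1)
        self._next_seq[key] = seq + 1
        self._live.setdefault(key, set()).add(seq)
        return seq

    def delete(self, name, seq):
        """Delete one entry; numbering resets only when no entry of that name remains."""
        key = name.lower()
        live = self._live.get(key, set())
        if seq not in live:
            raise KeyError(f"{name}#{seq} does not exist")
        live.remove(seq)
        if not live:
            del self._live[key]
            del self._next_seq[key]

    def resolve(self, name, seq=None):
        """Return the sequence number selected by name and an optional seq.

        With seq omitted, the most recently created entry is selected, and
        the lookup fails if that entry has been deleted.
        """
        key = name.lower()
        if seq is None:
            seq = self._next_seq.get(key, 1) - 1
        if seq not in self._live.get(key, set()):
            raise KeyError(f"{name}#{seq} does not exist")
        return seq
```

In this model, adding `myname` twice yields 1 and 2. Deleting number 2 makes an unqualified lookup of `myname` fail even though number 1 still exists, and the next `myname` added becomes 3 rather than reusing 2.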
Entries may have labels, which must have no more than 256 characters.
The document supports three orderings of entry names:
By default client applications display directory contents in insertion order.
Insertion order is peculiar to the document. Generally filesystems don't support the ability to insert a new entry between two existing entries, for example. For them it doesn't make sense to allow that. In contrast, the document user wants to be able to insert an output object between two others. When replayed, those objects get rendered as tables sequenced just the way the user instructed.
As far as the procedures are concerned, insertion order is the same as date-time order. Each new entry is added "at the end", which is the same as in order of increasing date-time stamp.
Performance note: The document natively supports insertion and date-time orders. The underlying item store natively supports alphabetic order. The upshot is that no sorting occurs when reading entries out of a document.
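To make the difference between the orderings concrete, here is a small Python sketch. The entry names, the `Entry` class, and the integer "clock" are illustrative assumptions; the sketch also sorts on demand for brevity, whereas the text above says the document and item store maintain these orders natively.

```python
from dataclasses import dataclass
from itertools import count

_clock = count(1)  # stand-in for a date-time stamp; larger means later


@dataclass
class Entry:
    name: str
    created: int


def new_entry(name):
    """Create an entry stamped with the next tick of the toy clock."""
    return Entry(name, next(_clock))


entries = []
entries.append(new_entry("NObs"))        # procedures add entries "at the end"
entries.append(new_entry("ErrorSSCP"))
entries.insert(1, new_entry("Anova"))    # the user inserts between the two

insertion_order = [e.name for e in entries]
datetime_order = [e.name for e in sorted(entries, key=lambda e: e.created)]
alphabetic_order = [e.name for e in sorted(entries, key=lambda e: e.name.lower())]
```

Here insertion order is NObs, Anova, ErrorSSCP, while date-time order is NObs, ErrorSSCP, Anova. The two agree only as long as every entry is added at the end, which is what procedures do.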
An output object may have associated information. Some or all of these attributes may pertain to an output object:
An output object may have up to ten titles and up to ten footnotes.
Titles and footnotes are rendered each time a new page of output is started.
An output object may have up to ten subtitles.
An output object may have up to ten before notes, and may have up to ten after notes.
Here's the order in which the associated attributes of an output object render:
In order for titles, subtitles and footnotes to render, they must be in effect at the time a new page of output is started.
The output object's attributes are collectively called its context, since it records elements of the procedure and user context at the time the output object is created.
An output object's context moves with it. If you move/copy an output object from one document location to another, the context goes along with it. This can result in some, well... interesting behavior. Suppose that you insert a copy of an output object NObs into a different directory, immediately before output object ErrorSSCP. (Remember, this is the document, so it makes sense to talk about inserting before a directory entry.) Suppose NObs has titles and a page break as its context. Suppose also that ErrorSSCP has a page break and different titles as its context. When NObs is rendered, first a page will be ejected, and then titles will be rendered, since titles are rendered on page boundaries. In the formatted output, the titles that move with NObs will effectively replace any that are part of the context of ErrorSSCP. The user may feel that the principle of least surprise has been violated when merely inserting an output object causes the titles on a page to change. Fortunately, the user can modify the context, so this is inconvenient behavior rather than restrictive behavior.
Replaying an output object with titles/footnotes does not affect any titles/footnotes active in the SAS session. By the same token, the active titles/footnotes do not influence the rendered output. This is the correct default behavior, although it would be a useful enhancement to let the user optionally specify that the active titles/footnotes override any titles/footnotes replayed from the document. This enhancement is planned for SAS 9.1.
Inline formatting in title and note text is honored by the document. For more information, see the material on inline formatting in SAS 8.2, or the SAS 9.2 documentation for ODS ESCAPECHAR=.
The ODS document supports the linking of one entry (link) to another (target). The linking behavior is modeled after that of the Unix filesystem, which defines hard links and soft links. Links are full-fledged entries; they have their own names and labels, and can be inserted anywhere in a directory.
A hard link is a lightweight copy of an output object within a document. Other than names and labels, all data is shared between the link and the target.
A hard link and its target have independent lifetimes. Deleting a hard link doesn't affect the target, and vice-versa.
A soft link is a reference to a target. Soft links are commonly referred to as symbolic links in Unix, and as shortcuts in Windows.
The target of a soft link is not evaluated at the time the link is created. Rather it is evaluated at the time the link is accessed. This means that the target need not exist at the time the link is created. Likewise, the target need not be valid!
A soft link can reference a target that is within the same document as the link, or in a different document.
A soft link and its target have independent lifetimes. Be aware that deleting the target leaves the link as a dangling reference, and any attempt to resolve a dangling link will fail.
Compared to a soft link, a hard link is less flexible but safer. It's less flexible because the target is limited to output objects, the link and target must reside within the same document, and the target must exist at the time the link is created. A hard link is safer because there is no opportunity for dangling references.
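The contrast between the two kinds of link can be sketched in Python. This is a toy model under stated assumptions, not the document's storage format: the `Document` class and its methods are hypothetical, entries live in a flat name table (no directories or sequence numbers), and chains or cycles of soft links are not handled.

```python
class Document:
    """Toy document: a flat table of named entries."""

    def __init__(self):
        # name -> ("object", data) or ("soft", target_document, target_name)
        self.entries = {}

    def add_object(self, name, data):
        self.entries[name] = ("object", data)

    def hard_link(self, link_name, target_name):
        """Create a link that shares the target's data.

        The target must already exist in this document and be an output object.
        """
        entry = self.entries[target_name]  # KeyError if the target is missing
        if entry[0] != "object":
            raise ValueError("a hard link must point at an output object")
        self.entries[link_name] = ("object", entry[1])  # same data, new name

    def soft_link(self, link_name, target_document, target_name):
        """Record a reference only; the target is not evaluated here."""
        self.entries[link_name] = ("soft", target_document, target_name)

    def delete(self, name):
        del self.entries[name]

    def read(self, name):
        """Return an entry's data, resolving a soft link at access time."""
        entry = self.entries[name]
        if entry[0] == "soft":
            _, target_document, target_name = entry
            return target_document.read(target_name)  # KeyError if dangling
        return entry[1]
```

In this model, deleting the target of a hard link leaves the link readable because the data is shared, while reading a soft link whose target has been deleted raises `KeyError`. A soft link may also name a target in another `Document` that does not exist yet, and it starts working once that target is added.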
Links benefit users by offering the ability to share entries. Perhaps a user has multiple reports that all need to include one or more common tables. Or maybe the user has a report that wants to reference a table ANOVA that gets updated nightly by a batch job. He/she can set up a soft link to ANOVA in the directory where the job places the table. That directory and the directory containing the report can be in the same document or different documents.
Documents are not portable across platforms. A document created under Windows must be accessed under Windows, for instance.
A document is forward compatible across SAS versions, and backward compatible to the extent possible.
Text strings stored in an ODS document are not translatable. If you create a document with the Spanish version of SAS, you'll get Spanish text when you replay the document, regardless of locale.
We ODS developers jokingly refer to FREQ, PRINT and REPORT as "gnarly procedures". They produce tables that are too complex or idiosyncratic to be described with regular (rectangular) ODS templates. They require much special-purpose code. The level of support for the gnarly tables sometimes lags that for regular tables.
The level of support for the document among these procedures varies:
We hope to resolve these support issues in SAS 9.2.
The reporting procedures rely heavily on formats the user defines with PROC FORMAT. Without FORMAT, TABULATE is fairly useless.
With Version 8 we increased the reliance on FORMAT. TABULATE, PRINT and REPORT leverage FORMAT in ODS style overrides to support traffic lighting and other data-driven effects.
Usually user formats are temporary, meaning that they persist no longer than the SAS session. You can tell FORMAT to store user formats in a format catalog, from which they can be retrieved later.
Temporary formats pose a problem for the ODS document. When replaying output after the session that created the output ends, these formats are no longer available. If the output depends on these formats, you're out of luck. You can go and modify all of your jobs to use format catalogs, but who wants to do that?
To address this problem, the document will persist temporary user formats associated with ODS output. When subsequently you replay the output, it will restore the formats and give them to FORMAT to be loaded prior to output creation.
The document will not persist user formats that it believes to be permanent. If a format comes from a catalog in a library other than WORK, it won't get persisted. Also, it persists only formats that are defined by the data's metadata; it doesn't blindly persist every temporary format that happens to be loaded at the time.
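The rule in the preceding paragraph reduces to a two-part test, sketched here in Python. The function name and its parameters are hypothetical; how the document actually decides that a format is "permanent" may involve more than the library name.

```python
def should_persist_format(format_library: str, referenced_by_data: bool) -> bool:
    """Decide whether a user format is stored with the output (toy model).

    Assumptions: a format counts as temporary only when its catalog lives in
    the WORK library, and it is persisted only when the output's metadata
    actually references it.
    """
    is_temporary = format_library.upper() == "WORK"
    return is_temporary and referenced_by_data
```

So a WORK format used by the output is persisted, a format from a catalog in another library such as SASUSER is not, and a WORK format that merely happens to be loaded is not.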
Supporting graph procedures in the document raises issues similar to those of supporting user-defined formats. It's typical for a graph segment (GRSEG) to be output to a temporary catalog. A temporary catalog persists no longer than the SAS session that created it. However, the document needs the GRSEG to persist indefinitely. The user can always modify his/her graph procedure invocations to specify GOUT=, but who wants to do that?
The solution for SAS 9.0 is to add a new option, CAT=, to the ODS DOCUMENT statement. If a user specifies CAT= followed by a permanent catalog, the document will automatically copy any temporary GRSEG that is output to the specified permanent catalog, and will persist a reference to the permanent GRSEG in the document. This has the same effect as specifying GOUT= for the procedure that output the GRSEG.
By default CAT= is not set, which means that temporary GRSEGs don't get copied to a permanent catalog. Once set, CAT= remains bound to the specified catalog until it is reset or the ODS document destination is closed. You can restore the default behavior by specifying CAT=_NULL_.
Currently the ODS document does not support color maps, image maps or other graph accessories.
SAS 9.1 provides an alternative means of managing graphs in ODS documents that avoids the issues of GRSEG lifetime and catalog management. If one of the following GOPTIONS DEVICE values
is in effect when a graph created by a SAS/GRAPH procedure is stored in an ODS document, the graph's data and template will be stored. Instead of containing a reference to an external GRSEG, the persisted output object will be a fully self-describing internal graph that subsequently can be replayed to HTML, RTF, PDF, etc.
When storing a graph, it doesn't matter which of the above GOPTIONS DEVICE values is specified; the effect is the same. The values do matter when replaying the graph to one or more destinations. Here's how each behaves:
HTML is an example of a destination that supports applets.
The SAS 9.1 SAS/GRAPH applet is, or soon will be, available for downloading.
HTML and RTF are examples of destinations that support ActiveX controls.
The SAS 9.1 SAS/GRAPH ActiveX control is, or soon will be, available for downloading.
You must use one of these GOPTIONS values when replaying an internal graph. Of course this assumes that you're replaying to a formatted destination such as PDF or RTF. GOPTIONS is ignored by unformatted destinations such as OUTPUT.
There are several advantages to using internal graphs rather than external graphs:
There's a caveat associated with customizing graph style attributes. Any style settings from GOPTIONS statements, global statements, or PROC options are stored with the graph in the ODS document, and take precedence over the ODS graph style that governs the graph's appearance. To retain maximum flexibility for customizing the style attributes of internal graphs, it is recommended that customizations be specified only in the graph style.
Graph styles are defined using PROC TEMPLATE. You can learn how to use PROC TEMPLATE to create a custom graph style by consulting the PROC TEMPLATE documentation in the SAS Output Delivery System: User's Guide.
Although you can customize the style of an internal graph, you cannot change its structure. There's no way to convert a horizontal bar chart into a vertical bar chart, for instance. (Actually that's not quite true. The SAS/GRAPH applet and ActiveX control provide such functionality. But you can't do it for static images.)
Be mindful of the fact that this feature applies only to graphs generated by SAS/GRAPH procedures. Procedures such as UNIVARIATE (Base SAS) and SHEWHART (SAS/QC) will continue to generate only GRSEG graphs.
Regular expression support would be handy, especially for batch users.