Special Topic: Identifying Unsupported Elements and Attributes in a CDISC ODM File

Overview

Note: The process explained below is the same for all ODM versions that are supported by the SAS Clinical Standards Toolkit. The process is explained using ODM version 1.3.0.
In practice, vendor and custom extensions to ODM are common. For example, Electronic Data Capture (EDC) vendors use data management features and flags that might be exported using ODM XML extensions. By default, such extensions are ignored by the SAS Clinical Standards Toolkit. Recall that the SAS Clinical Standards Toolkit uses an XSL style sheet for each of the 66 default, supported ODM data sets (such as ItemDefs.xsl). These style sheets look for specifically named tags and hierarchical paths based on the published CDISC ODM 1.3.0 specification. If elements or attributes exist in the XML file but not in the specification, they are ignored.
For example, in this XML code fragment, note the Vendor:<name> syntax. This represents a hypothetical extension to the ODM XML, presumably accompanied by a namespace reference supporting the Vendor naming convention.
<FormData FormOID="FormDefs.OID.Death" FormRepeatKey="00-01"
          TransactionType="Remove" Vendor:Revised="No">
    <Vendor:DataQuery DQOID="DQ.OID.001"
                      QueryText="Premature report of patient's demise?">
        <Flag>Y</Flag>
        <AuditRecord>
            <UserRef UserOID="User.OID.I024" />
            <LocationRef LocationOID="Location.OID.S001" />
            <DateTimeStamp>2011-01-24T15:13:22</DateTimeStamp>
        </AuditRecord>
    </Vendor:DataQuery>
</FormData>
In this code fragment, the Vendor:DataQuery syntax specifies a new element with several new attributes and references to other existing (supported) elements. Note also the additional Vendor:Revised attribute for FormData.
The SAS Clinical Standards Toolkit provides a utility macro that parses the ODM XML file to identify currently unsupported elements and attributes. This macro, cstutil_readxmltags, is located in the primary SAS Clinical Standards Toolkit autocall library (!sasroot/cstframework/sasmacro).
Here is an example of a call to the cstutil_readxmltags macro:
%cstutil_readxmltags(
      _cstxmlfilename=inxml
     ,_cstxmlreporting=Dataset
     ,_cstxmlelementds=work.cstodmelements
     ,_cstxmlattrds=work.cstodmattributes);
In this call, the XML file to be parsed is specified with the inxml fileref. The results of the parsing are written to two data sets: work.cstodmelements for all unique elements found in the XML file, and work.cstodmattributes for all unique attributes associated with each unique element.
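The call assumes that the inxml fileref already exists. Here is a minimal sketch of the surrounding steps, assuming a hypothetical file path and a session in which the toolkit autocall library is available. The macro might also depend on toolkit global macro variables that are normally set during process setup, so treat this as an illustration rather than a complete driver.

```sas
/* Hypothetical path: point the fileref at your own ODM XML file. */
filename inxml "/path/to/odm_extended.xml";

/* Call taken from the example above; writes the two work data sets. */
%cstutil_readxmltags(
      _cstxmlfilename=inxml
     ,_cstxmlreporting=Dataset
     ,_cstxmlelementds=work.cstodmelements
     ,_cstxmlattrds=work.cstodmattributes);

/* PROC PRINT without a VAR statement lists every column, */
/* so no column names need to be assumed here.            */
proc print data=work.cstodmelements;
run;

proc print data=work.cstodmattributes;
run;

/* Release the fileref when it is no longer needed. */
filename inxml clear;
```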
The cstutil_readxmltags macro parameters are described in this table.
Parameters for the cstutil_readxmltags.sas Macro
| Parameter | Required | Description |
|---|---|---|
| _cstxmlfilename | Yes | Fileref for input XML file. |
| _cstxmlreporting | Yes | How results are to be reported. Valid values: Dataset or Results. If Dataset is specified, the _cstxmlelementds and _cstxmlattrds parameters are referenced. If Results is specified, differences detected are reported in the process results data set (as defined by the &_cstResultsDS global macro variable). |
| _cstxmlelementds | No | Libref.dataset for file elements. Default=work.cstodmelements |
| _cstxmlattrds | No | Libref.dataset for file attributes. Default=work.cstodmattributes |
See the macro header for more details about current assumptions and limitations.
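For comparison with the Dataset example, here is a sketch of a call that uses the Results reporting mode. It assumes that the inxml fileref is assigned and that process setup has already defined the &_cstResultsDS global macro variable; neither step is shown.

```sas
/* The two data set parameters are omitted because, per the parameter */
/* table, they are referenced only when _cstxmlreporting=Dataset.     */
%cstutil_readxmltags(
      _cstxmlfilename=inxml
     ,_cstxmlreporting=Results);

/* Show which data set received the messages (assumes the global */
/* macro variable was defined by process setup).                 */
%put NOTE: Results were written to &_cstResultsDS;
```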

Sample Utility Program: find_unsupported_tags.sas

Overview

The SAS Clinical Standards Toolkit provides a utility program, find_unsupported_tags.sas, to demonstrate assessment of the ODM XML file elements and attributes. This program is located in:
sample study library directory/cdisc-odm-1.3.0-1.5/programs
This program provides the same process setup function supported in most SAS Clinical Standards Toolkit driver modules, building a SASReferences data set that defines process inputs and outputs, and allocating all SAS librefs and filerefs.
Here is the general workflow of this utility program:
  1. Build a process-specific SASReferences data set.
  2. Call the %cstutil_processsetup() macro to set process paths and perform required library and file allocations.
  3. Call the cstutil_readxmltags macro to create a data set of element names and a data set of attribute names.
  4. Compare elements and attributes to a set of known (that is, supported) elements and attributes.
  5. Report discrepancies.
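Steps 4 and 5 can be sketched with ordinary SAS code. The example below is an illustration, not the logic of find_unsupported_tags.sas: the data set names and the small lists of element names are hypothetical stand-ins, so the code runs without the toolkit.

```sas
/* Stand-in for the element names found in the XML file. */
data work.found_elements;
   length element $64;
   input element $;
   datalines;
FormData
AuditRecord
UserRef
Vendor:DataQuery
;
run;

/* Stand-in for the list of known (supported) elements. */
data work.known_elements;
   length element $64;
   input element $;
   datalines;
FormData
AuditRecord
UserRef
;
run;

/* Step 4: keep found elements that have no match in the known list. */
/* EXCEPT compares values exactly, so the match is case sensitive.   */
proc sql;
   create table work.unsupported_elements as
   select element from work.found_elements
   except
   select element from work.known_elements;
quit;

/* Step 5: report discrepancies. */
title "Elements not in the known list";
proc print data=work.unsupported_elements noobs;
run;
title;
```

With these stand-in lists, work.unsupported_elements contains one row, Vendor:DataQuery.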

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, three input references and one output reference are key to successful completion of the find_unsupported_tags.sas utility program. The table "Key Components of the SASReferences Data Set for the find_unsupported_tags.sas Macro" lists these files and data sets, which are discussed in separate sections.
In the sample find_unsupported_tags.sas utility program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-odm-1.3.0-1.5
&studyOutputPath=sample study library directory/cdisc-odm-1.3.0-1.5
Key Components of the SASReferences Data Set for the find_unsupported_tags.sas Macro
| Metadata Type | SAS LIBNAME or Fileref to Use | Reference Type | Path | Name of File |
|---|---|---|---|---|
| Input | | | | |
| externalxml | odmxml | fileref | &studyRootPath/sourcexml | odm_extended.xml |
| standardmetadata(element) | odmmeta | libref | | |
| standardmetadata(attribute) | odmmeta | libref | | |
| Output | | | | |
| results | results | libref | &studyOutputPath/results | readxmltags_results.sas7bdat |
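To make the table concrete, the sketch below stores the same four references in a small data set. It is a simplified illustration only: the data set name work.sasrefs_sketch and the column names are assumptions, the split of entries such as standardmetadata(element) into a type and a subtype is an interpretation of the table, and a real SASReferences data set contains additional columns that are not shown.

```sas
/* Simplified illustration only, not a valid SASReferences data set. */
/* Column names are assumptions chosen to mirror the table columns.  */
data work.sasrefs_sketch;
   length type $40 subtype $40 sasref $8 reftype $8
          path $200 memname $48;
   infile datalines dlm='|' dsd truncover;
   input type $ subtype $ sasref $ reftype $ path $ memname $;
   /* Macro references in DATALINES are not resolved by SAS; */
   /* they are stored here as literal text.                  */
   datalines;
externalxml||odmxml|fileref|&studyRootPath/sourcexml|odm_extended.xml
standardmetadata|element|odmmeta|libref||
standardmetadata|attribute|odmmeta|libref||
results||results|libref|&studyOutputPath/results|readxmltags_results.sas7bdat
;
run;

proc print data=work.sasrefs_sketch noobs;
run;
```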

Process Inputs

The metadata type externalxml refers to the ODM XML file that is being read. The fileref odmxml is defined in the SASReferences data set and is used in the submitted SAS code when referring to the XML file. The ODM XML file odm_extended.xml contains sample extensions to the core ODM 1.3.0 model.
The metadata type standardmetadata, referenced by the odmmeta SAS libref, references the global standards library directory/standards/cdisc-odm-1.3.0-1.5/metadata folder. This folder includes two data sets, valid_elements and valid_attributes, which contain the full list of ODM core elements and attributes supported by the SAS Clinical Standards Toolkit. The valid_elements data set contains a single column, element, that itemizes the ODM core elements. The valid_attributes data set contains each attribute within the context of its parent tag and containing element.
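Because an attribute is meaningful only in the context of its element, a comparison of attributes needs a two-part key. Here is a minimal sketch that uses a DATA step merge. The data sets and the column names element and attribute are hypothetical stand-ins; the actual structure of valid_attributes might differ.

```sas
/* Stand-in for attributes found in the XML file, keyed by element. */
data work.found_attributes;
   length element $64 attribute $64;
   input element $ attribute $;
   datalines;
FormData FormOID
FormData FormRepeatKey
FormData TransactionType
FormData Vendor:Revised
Vendor:DataQuery DQOID
;
run;

/* Stand-in for the supported element and attribute pairs. */
data work.known_attributes;
   length element $64 attribute $64;
   input element $ attribute $;
   datalines;
FormData FormOID
FormData FormRepeatKey
FormData TransactionType
;
run;

/* A merge with a BY statement requires both inputs sorted by the key. */
proc sort data=work.found_attributes; by element attribute; run;
proc sort data=work.known_attributes; by element attribute; run;

/* Keep pairs that were found in the file but are not in the known list. */
data work.unsupported_attributes;
   merge work.found_attributes (in=found)
         work.known_attributes (in=known);
   by element attribute;
   if found and not known;
run;

proc print data=work.unsupported_attributes noobs;
run;
```

With these stand-in lists, two pairs remain: FormData with Vendor:Revised, and Vendor:DataQuery with DQOID.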
This display provides a partial listing of the valid_attributes data set.
Partial Listing of the valid_attributes Data Set

Process Outputs

The results type refers to the Results data set that contains information from running the process. In the SAS Clinical Standards Toolkit sample code hierarchy, this information is written to the sample study library directory/cdisc-odm-1.3.0-1.5/results directory. This location is represented in the utility program by the Results library name.
Depending on the parameter values associated with the call to the cstutil_readxmltags macro, two additional process outputs might be persisted at the conclusion of the process. If the _cstxmlreporting parameter is set to Dataset, any unsupported elements are documented in the data set referenced by the _cstxmlelementds parameter and any unsupported attributes are documented in the data set referenced by the _cstxmlattrds parameter.

Process Results

When the utility program finishes running, the readxmltags_results data set is created in the Results library. This data set contains informational, warning, and error messages that were generated by the submitted utility program.
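Here is one way to inspect the data set after the run, sketched under the assumption that the results libref is still allocated in the session. The macro name show_readxmltags_results is hypothetical and is not part of the toolkit.

```sas
/* Hypothetical helper: prints the results data set if it exists. */
%macro show_readxmltags_results;
   %if %sysfunc(exist(results.readxmltags_results)) %then %do;
      /* No VAR statement, so every column is listed. */
      proc print data=results.readxmltags_results;
      run;
   %end;
   %else %do;
      %put NOTE: results.readxmltags_results was not found.;
   %end;
%mend show_readxmltags_results;

%show_readxmltags_results
```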
This display shows an example of the contents of a Results data set run against the customized odm_extended.xml input file (with the _cstxmlreporting parameter set to Results).
Example of a Partial Results Data Set Created by find_unsupported_tags.sas