CDISC ODM

Purpose

(Source: CDISC website http://www.cdisc.org/odm)
The CDISC ODM standard facilitates the archival and interchange of the metadata and data for clinical research. ODM is a vendor-neutral, platform-independent format for the interchange and archival of clinical study data. ODM includes the clinical data and its associated metadata, administrative data, reference data, and audit information. All of the information that needs to be shared during setup, operation, analysis, and submission, as well as for long-term retention as part of an archive, is included in ODM.

Release Dates

  • CDISC ODM, Version 1.3.0, December 15, 2006
  • CDISC ODM, Version 1.3.1, February 11, 2010

CDISC ODM 1.3.0 Reference Standard

The SAS Clinical Standards Toolkit supports this CDISC ODM 1.3.0 functionality:
  • reading and representing in SAS a complete odm.xml file (specific limitations are noted below)
  • building an odm.xml file from a SAS representation of the ODM standard
  • schema-level validating of an odm.xml file
  • validating the structure and content of the SAS representation of an odm.xml file
  • identifying unsupported (unrecognized) ODM elements and attributes by using a sample tool
  • extracting one or more data sets from the ClinicalData or ReferenceData sections of the ODM XML file
The SAS Clinical Standards Toolkit does not support this CDISC ODM 1.3.0 functionality:
  • reading or writing the DigitalSignatures section of the ODM
  • vendor or customer extensions of the ODM
  • processing that spans more than one ODM file (for example, the use of PriorFileOID to reference another file is ignored); full file metadata is expected in each file
  • effective processing of any ODM FileType other than Snapshot (the SAS Clinical Standards Toolkit makes no attempt to process multiple transactions per data point, although multiple transactions are saved in the SAS ODM representation for subsequent processing)
The domain and column metadata that constitute the SAS representation of CDISC ODM 1.3.0 are derived from the global standards library in these formats:
  • as empty data sets (using the utility macro %CST_CREATETABLESFORDATASTANDARD)
  • as table metadata (See CDISC ODM 1.3.0 reference_tables.)
  • as column metadata for 315 columns in the 66 data sets (reference_columns in the standard metadata folder)
As a general rule, the SAS representation of the CDISC ODM standard is patterned to match the XML element (data set) and attribute (column) structure of odm.xml. For example, consider this XML extract:
<ClinicalData StudyOID="P2006-101" MetadataVersionOID="101.01">
 <SubjectData SubjectKey="1000" TransactionType="Insert">
  <StudyEventData StudyEventOID="101.Screen">
   <FormData FormOID="101.DEMOG">
    <ItemGroupData ItemGroupOID="101.DM">
     <ItemDataString ItemOID="101.USUBJID">101-01-01</ItemDataString>
     <ItemDataString ItemOID="101.SEX">F</ItemDataString>
    </ItemGroupData>
   </FormData>
  </StudyEventData>
 </SubjectData>
</ClinicalData>
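As a rough illustration of this element-to-data-set and attribute-to-column pattern, the extract above can be flattened into one row per element. This is a minimal Python sketch, not the toolkit's own implementation: the flatten function and the in-memory tables dictionary are hypothetical, column names are copied straight from the XML attribute names, and the extract is parsed without the namespace that a full odm.xml file would declare. Typed ItemData elements are routed to a single ItemData table, with the element name kept in ItemDataType and the element text in Value.

```python
import xml.etree.ElementTree as ET

XML_EXTRACT = """\
<ClinicalData StudyOID="P2006-101" MetadataVersionOID="101.01">
 <SubjectData SubjectKey="1000" TransactionType="Insert">
  <StudyEventData StudyEventOID="101.Screen">
   <FormData FormOID="101.DEMOG">
    <ItemGroupData ItemGroupOID="101.DM">
     <ItemDataString ItemOID="101.USUBJID">101-01-01</ItemDataString>
     <ItemDataString ItemOID="101.SEX">F</ItemDataString>
    </ItemGroupData>
   </FormData>
  </StudyEventData>
 </SubjectData>
</ClinicalData>
"""


def flatten(element, tables=None):
    """Collect one row per XML element, grouped by target data set name."""
    if tables is None:
        tables = {}
    row = dict(element.attrib)  # each attribute becomes a column
    if element.tag.startswith("ItemData"):
        # Assumption: every ItemData* element lands in one ItemData table.
        data_set = "ItemData"
        row["ItemDataType"] = element.tag
        row["Value"] = (element.text or "").strip()
    else:
        data_set = element.tag  # element name doubles as data set name
    tables.setdefault(data_set, []).append(row)
    for child in element:
        flatten(child, tables)
    return tables


tables = flatten(ET.fromstring(XML_EXTRACT))
for name, rows in tables.items():
    for row in rows:
        print(name, row)
```

Running the sketch prints one row for each of the five container elements and two ItemData rows.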
The following table describes how the XML element and attribute information maps to the SAS representation:

Sample Mapping of odm.xml File to SAS Representation
XML Element or Attribute                                          SAS Data Set    SAS Column          SAS Column Value
----------------------------------------------------------------  --------------  ------------------  ----------------
<ClinicalData StudyOID="P2006-101" MetadataVersionOID="101.01">   ClinicalData    StudyOID            "P2006-101"
                                                                                  MetaDataVersionOID  "101.01"
<SubjectData SubjectKey="1000" TransactionType="Insert">          SubjectData     SubjectKey          "1000"
                                                                                  TransactionType     "Insert"
<StudyEventData StudyEventOID="101.Screen">                       StudyEventData  StudyEventOID       "101.Screen"
<FormData FormOID="101.DEMOG">                                    FormData        FormOID             "101.DEMOG"
<ItemGroupData ItemGroupOID="101.DM">                             ItemGroupData   ItemGroupOID        "101.DM"
<ItemDataString ItemOID="101.USUBJID">101-01-01</ItemDataString>  ItemData        ItemOID             "101.USUBJID"
                                                                                  ItemDataType        "ItemDataString"
                                                                                  Value               "101-01-01"
<ItemDataString ItemOID="101.SEX">F</ItemDataString>              ItemData        ItemOID             "101.SEX"
                                                                                  ItemDataType        "ItemDataString"
                                                                                  Value               "F"

The following table lists the complete set of 66 tables that form the SAS Clinical Standards Toolkit SAS representation of the CDISC ODM 1.3.0 standard:

CDISC ODM 1.3.0 reference_tables
admindata                       itemrangecheckvalues
annotation                      itemrcformalexpression
annotationflag                  itemrole
association                     keyset
auditrecord                     location
clinicaldata                    locationversion
clitemdecodetranslatedtext      measurementunits
codelistitems                   metadataversion
codelists                       methoddefformalexpression
conditiondefformalexpression    methoddefs
conditiondefs                   methoddeftranslatedtext
conditiondeftranslatedtext      mutranslatedtext
enumerateditems                 odm
externalcodelists               presentation
formdata                        protocoleventrefs
formdefarchlayouts              protocoltranslatedtext
formdefitemgrouprefs            rcerrortranslatedtext
formdefs                        referencedata
formdeftranslatedtext           signature
imputationmethods               signaturedef
itemaliases                     study
itemdata                        studyeventdata
itemdefs                        studyeventdefs
itemdeftranslatedtext           studyeventdeftranslatedtext
itemgroupaliases                studyeventformrefs
itemgroupdata                   subjectdata
itemgroupdefitemrefs            user
itemgroupdefs                   useraddress
itemgroupdeftranslatedtext      useraddressstreetname
itemmurefs                      useremail
itemquestionexternal            userfax
itemquestiontranslatedtext      userlocationref
itemrangechecks                 userphone

The highly structured nature of CDISC ODM data requires that any mapping to a relational format include a large number of data sets, with foreign key relationships to help preserve the intended non-relational object structure. In the SAS Clinical Standards Toolkit, foreign key relationships are enforced when validating the CDISC ODM data sets.
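The kind of foreign key check described here can be pictured with a small Python sketch. It is only an illustration under stated assumptions and does not reproduce the toolkit's validation checks: the find_orphans helper, the OID parent key column, and the sample rows are hypothetical, while FK_CODELISTS and CODEDVALUE are the EnumeratedItems variables named in this section.

```python
def find_orphans(child_rows, fk_column, parent_rows, parent_key):
    """Return child rows whose foreign key value has no matching parent row."""
    parent_keys = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row[fk_column] not in parent_keys]


# Hypothetical sample data: two code lists and three enumerated items.
codelists = [{"OID": "CL.SEX"}, {"OID": "CL.YN"}]
enumerateditems = [
    {"FK_CODELISTS": "CL.SEX", "CODEDVALUE": "F"},
    {"FK_CODELISTS": "CL.SEX", "CODEDVALUE": "M"},
    {"FK_CODELISTS": "CL.RACE", "CODEDVALUE": "ASIAN"},  # no such code list
]

orphans = find_orphans(enumerateditems, "FK_CODELISTS", codelists, "OID")
for row in orphans:
    print("orphaned row:", row)
```

A row is reported only when its foreign key value is absent from the parent table, which is the relationship that keeps the flattened tables faithful to the nested ODM structure.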
Field lengths in the CDISC ODM data sets are consistent by core data type. Because CDISC does not limit the length of most character fields, an arbitrary default length has been chosen for each core data type, where the standard data types are distilled into the core data types listed in the following table. The lengths are deliberately generous so that no data loss occurs in the SAS Clinical Standards Toolkit pre-installed data sets. Production tables might be compressed using SAS mechanisms to preserve disk space.

CDISC ODM Default Lengths by Data Type
Type Name   Length  Description
----------  ------  -------------------------------------------------------------------
oid         64      A unique object identifier or a reference
text        2000    A character field that can accommodate a large number of characters
name        128     A descriptive identifier
value       512     An item of collected or reference data
path        512     An absolute or relative file system path or URL

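The default lengths above can be used to spot values that would be truncated on import. The following Python sketch is an assumption-laden illustration rather than toolkit code: the too_long helper and the assignment of ItemOID to the oid type and Value to the value type are hypothetical, and len() counts characters, whereas SAS lengths are byte counts, so multibyte text would need a stricter check.

```python
# Default lengths by core data type, copied from the table above.
DEFAULT_LENGTHS = {"oid": 64, "text": 2000, "name": 128, "value": 512, "path": 512}


def too_long(row, column_types):
    """Return (column, actual_length, limit) for each value over its limit."""
    problems = []
    for column, type_name in column_types.items():
        value = row.get(column)
        if value is None:
            continue  # missing values cannot be truncated
        limit = DEFAULT_LENGTHS[type_name]
        if len(value) > limit:
            problems.append((column, len(value), limit))
    return problems


# Hypothetical column-to-type assignment for two ItemData columns.
column_types = {"ItemOID": "oid", "Value": "value"}
row = {"ItemOID": "X" * 65, "Value": "F"}
print(too_long(row, column_types))
```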
The table metadata for the 66 data sets and the column metadata for the 315 columns in those data sets that comprise the SAS representation of the CDISC ODM 1.3.0 standard are here:
global standards library directory/standards/cdisc-odm-1.3.0-1.7/metadata
Table metadata is in reference_tables.sas7bdat, and column metadata is in reference_columns.sas7bdat.
The CDISC ODM standard is flexible: only the ODM data set, populated with valid values for the FileOID, CreationDateTime, and FileType variables, is needed to create a minimal but valid CDISC ODM-compliant XML document. All table and column names are case sensitive and must be specified exactly as shown.
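To make the idea of a minimal document concrete, the sketch below builds an ODM root element that carries only those three attributes. It is a Python illustration, not the toolkit's odm.xml writer: the minimal_odm function and the sample FileOID are hypothetical, the namespace URI is an assumption about ODM 1.3, and the result is not validated against the ODM schema here.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI used by ODM 1.3 documents.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"


def minimal_odm(file_oid, creation_datetime, file_type="Snapshot"):
    """Serialize an ODM root element with only the three attributes named above."""
    root = ET.Element(
        "ODM",
        {
            "xmlns": ODM_NS,  # written out as the default namespace declaration
            "FileOID": file_oid,
            "CreationDateTime": creation_datetime,
            "FileType": file_type,
        },
    )
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifier and timestamp.
doc = minimal_odm("Study.P2006-101.001", "2010-02-11T00:00:00")
print(doc)
```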
In the SAS implementation of the relational data model, the keys are extended to define a unique record in every SAS data set. For example, a unique record in the EnumeratedItems data set is defined by the variables FK_CODELISTS and CODEDVALUE. These SAS data set keys are in the table metadata in the SAS reference_tables data set.
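A uniqueness check over such a key can be sketched as follows. This Python example is only illustrative: the duplicate_keys helper and the sample rows are hypothetical, and the key columns FK_CODELISTS and CODEDVALUE are the ones named above for the EnumeratedItems data set.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that identify more than one row."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical EnumeratedItems rows; the last one repeats an existing key.
enumerated_rows = [
    {"FK_CODELISTS": "CL.SEX", "CODEDVALUE": "F"},
    {"FK_CODELISTS": "CL.SEX", "CODEDVALUE": "M"},
    {"FK_CODELISTS": "CL.SEX", "CODEDVALUE": "F"},
]

print(duplicate_keys(enumerated_rows, ["FK_CODELISTS", "CODEDVALUE"]))
```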
Starting in ODM 1.3.0, there are two forms of the ItemData element, which is the element used by ODM for transmitting clinical data item values. These two forms are untyped and typed. Here is an example of a typed ItemData element:
<ItemDataFloat ItemOID="ItemDef.OID.VS.VSSTRESN" TransactionType="Insert">76</ItemDataFloat>
Here is an example of an untyped ItemData element:
<ItemData ItemOID="ID.AETERM" Value="HEADACHE" />
Both of these data values are stored in the Value variable in the ItemData SAS data set. In the case of typed data, the ItemDataType variable in the ItemData SAS data set has the data type (for example, Float). In the case of untyped data, the ItemDataType variable in the ItemData SAS data set is null.
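The handling of the two forms can be sketched in Python as shown below. This is a minimal illustration under stated assumptions, not the toolkit's reader: the item_data_row function is hypothetical, the type is taken as the element name with the ItemData prefix removed (matching the Float example above), None stands in for a null ItemDataType, and the elements are parsed without a namespace.

```python
import xml.etree.ElementTree as ET

PREFIX = "ItemData"


def item_data_row(element):
    """Map a typed or untyped ItemData element to ItemOID, ItemDataType, Value."""
    tag = element.tag
    if tag == PREFIX:
        # Untyped form: the value travels in the Value attribute.
        return {
            "ItemOID": element.get("ItemOID"),
            "ItemDataType": None,
            "Value": element.get("Value"),
        }
    if tag.startswith(PREFIX):
        # Typed form: the value is the element text, the type is the tag suffix.
        return {
            "ItemOID": element.get("ItemOID"),
            "ItemDataType": tag[len(PREFIX):],
            "Value": (element.text or "").strip(),
        }
    raise ValueError(f"not an ItemData element: {tag}")


typed = ET.fromstring(
    '<ItemDataFloat ItemOID="ItemDef.OID.VS.VSSTRESN" '
    'TransactionType="Insert">76</ItemDataFloat>'
)
untyped = ET.fromstring('<ItemData ItemOID="ID.AETERM" Value="HEADACHE" />')

print(item_data_row(typed))
print(item_data_row(untyped))
```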
Typed and untyped data transmission should not be mixed within a single ODM file. However, in the example provided by the SAS Clinical Standards Toolkit, both types are part of the same example for demonstration purposes.
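A file can be screened for this kind of mixing before it is processed. The Python sketch below is a hypothetical helper, not a toolkit check: item_data_forms walks every element, strips any namespace prefix from the tag, and reports which ItemData forms it found. The sample fragment is invented for the demonstration.

```python
import xml.etree.ElementTree as ET


def item_data_forms(root):
    """Return the set of ItemData forms ('typed', 'untyped') found under root."""
    forms = set()
    for element in root.iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop a "{namespace}" prefix if present
        if tag == "ItemData":
            forms.add("untyped")
        elif tag.startswith("ItemData"):
            forms.add("typed")
    return forms


# Hypothetical fragment that mixes both forms in one item group.
mixed = ET.fromstring(
    '<ItemGroupData ItemGroupOID="IG.1">'
    '<ItemDataFloat ItemOID="A">76</ItemDataFloat>'
    '<ItemData ItemOID="B" Value="HEADACHE"/>'
    "</ItemGroupData>"
)

forms = item_data_forms(mixed)
if len(forms) > 1:
    print("file mixes typed and untyped ItemData")
```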
For the CDISC ODM standard, the SAS Clinical Standards Toolkit supports reading and representing in SAS a complete odm.xml file, and building an odm.xml file. It validates the structure and content of the SAS representation of each odm.xml file as well as the structural integrity of the file itself. It also supports extracting subject or reference data for a data set (such as an SDTM AE domain) from the odm.xml file.
This functionality is supported by these supplemental files in the global standards library:
  • A SAS format catalog (odmct.sas7bcat) in the formats folder provides valid values for selected columns in the 66 tables of the SAS representation.
  • The Messages data set in the messages folder provides error messaging for all Validation Master checks.
  • The Validation Master data set in the validation/control folder contains the superset of checks validating the structure and content of the 66 tables.
  • SAS code in the macros folder provides CDISC ODM-specific code that augments the code provided in the primary SAS Clinical Standards Toolkit autocall library (!sasroot/cstframework/sasmacro).
It is this set of files, in whole or in part, that defines the CDISC ODM 1.3.0 reference standard.

CDISC ODM 1.3.1 Reference Standard

The CDISC ODM 1.3.1 reference standard has the same functionality as CDISC ODM 1.3.0, with the following differences:
  • The SAS representation of CDISC ODM 1.3.1 includes 10 data sets in addition to those shown in CDISC ODM 1.3.0 reference_tables. The 10 additional data sets are listed in this table:
    Additional CDISC ODM 1.3.1 Tables Not Included with CDISC ODM 1.3.0
    codelistaliases           formaliases
    codelistitemaliases       methodaliases
    codelisttranslatedtext    mualiases
    conditionaliases          protocolaliases
    enumerateditemaliases     studyeventaliases
  • The table metadata for these 76 data sets can be found in the reference_tables data set in the standard metadata folder. Column metadata for the 352 columns in these 76 data sets can be found in the reference_columns data set in the standard metadata folder.
This set of files, in whole or in part, defines the CDISC ODM 1.3.1 reference standard.