As the primary interchange
format for CDISC, ODM XML is a common format for electronic data capture
(EDC) data management views of clinical data. This format often does
not closely approximate submission (SDTM) and analysis (ADaM) data
structures unless the EDC views have been built using the CDISC-CDASH
standard. From a SAS perspective, you might want to extract clinical
data from an ODM XML file to serve as source data for transformations
that derive SDTM domain data sets.
The odm_extractdomaindata
macro supports extracting clinical data or reference data from the
SAS data sets that were created by the odm_read macro.
The odm_extractdomaindata
macro makes the following assumptions:
-
An ODM XML file is available that
contains sufficient metadata and content for extractable clinical
data and reference data.
-
A full SAS representation of an
ODM XML file is available (for example, the odm_read macro has been
run against the XML file).
-
The SAS representation of an ODM
XML file contains both metadata and data.
By default, the driver
assumes all source data files reside in the sample derived folder
or the data folder that is typically populated by running the odm_read
macro. However, the source data files and the source metadata files
can be in different folders.
-
Any codelists defined in the ODM
XML file and associated with extracted data set columns are available
as part of the output of the odm_read macro.
ODM integer and float
data types are converted to SAS numeric data. All other ODM data types
are converted to SAS character data. If an integer or float data value
cannot be converted, a warning is written to the SAS log and to the Results
data set.
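The conversion rule can be illustrated with a small DATA step. This is only a sketch of the general technique and not the macro's actual implementation: the variable names raw and num and the sample values are made up for illustration.

```sas
/* Sketch: convert character values to numeric and flag failures. */
/* The names raw and num and the sample values are hypothetical.  */
data _null_;
  length raw $20;
  do raw = '06', '1999', 'N/A';
    /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0 */
    num = input(raw, ?? best32.);
    if missing(num) and not missing(raw) then
      put 'WARNING: value could not be converted to numeric: ' raw=;
    else put raw= num=;
  end;
run;
```

The first two values convert cleanly, and the third produces a warning line in the SAS log.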
Here is a partial listing
of the metadata in a sample ODM XML file:
<ItemGroupDef OID="ItemGroupDefs.OID.AE" Repeating="Yes"
SASDatasetName="AE" Name="Adverse Events" Domain="AE"
Comment="Some adverse events from this trial">
<ItemRef ItemOID="ID.TAREA" OrderNumber="1" Mandatory="No" />
<ItemRef ItemOID="ID.PNO" OrderNumber="2" Mandatory="No" />
<ItemRef ItemOID="ID.SCTRY" OrderNumber="3" Mandatory="No" />
<ItemRef ItemOID="ID.F_STATUS" OrderNumber="4" Mandatory="No" />
<ItemRef ItemOID="ID.LINE_NO" OrderNumber="5" Mandatory="No" />
<ItemRef ItemOID="ID.AETERM" OrderNumber="6" Mandatory="No" />
<ItemRef ItemOID="ID.AESTMON" OrderNumber="7" Mandatory="No" />
<ItemRef ItemOID="ID.AESTDAY" OrderNumber="8" Mandatory="No" />
<ItemRef ItemOID="ID.AESTYR" OrderNumber="9" Mandatory="No" />
<ItemRef ItemOID="ID.AESTDT" OrderNumber="10" Mandatory="No" />
<ItemRef ItemOID="ID.AEENMON" OrderNumber="11" Mandatory="No" />
<ItemRef ItemOID="ID.AEENDAY" OrderNumber="12" Mandatory="No" />
<ItemRef ItemOID="ID.AEENYR" OrderNumber="13" Mandatory="No" />
<ItemRef ItemOID="ID.AEENDT" OrderNumber="14" Mandatory="No" />
<ItemRef ItemOID="ID.AESEV" OrderNumber="15" Mandatory="No" />
<ItemRef ItemOID="ID.AEREL" OrderNumber="16" Mandatory="No" />
<ItemRef ItemOID="ID.AEOUT" OrderNumber="17" Mandatory="No" />
<ItemRef ItemOID="ID.AEACTTRT" OrderNumber="18" Mandatory="No" />
<ItemRef ItemOID="ID.AECONTRT" OrderNumber="19" Mandatory="No" />
</ItemGroupDef>
...
<ItemDef OID="ID.AESTDT" SASFieldName="AESTDT"
Name="Derived Start Date" DataType="date"/>
<ItemDef OID="ID.AEENMON" SASFieldName="AEENMON"
Name="Stop Month - Enter Two Digits 01-12" DataType="integer" Length="2" />
<ItemDef OID="ID.AEENDAY" SASFieldName="AEENDAY"
Name="Stop Day - Enter Two Digits 01-31" DataType="integer" Length="2" />
<ItemDef OID="ID.AEENYR" SASFieldName="AEENYR"
Name="Stop Year - Enter Four Digit Year" DataType="integer" Length="4" />
<ItemDef OID="ID.AEENDT" SASFieldName="AEENDT"
Name="Derived Stop Date" DataType="date"/>
<ItemDef OID="ID.AESEV" SASFieldName="AESEV"
Name="Severity" DataType="text" Length="1">
<CodeListRef CodeListOID="CL.$AESEV" />
</ItemDef>
<ItemDef OID="ID.AEREL" SASFieldName="AEREL"
Name="Relationship to study drug" DataType="text" Length="1">
<CodeListRef CodeListOID="CL.$AEREL" />
</ItemDef>
Here is a partial listing
of the data in the same sample ODM XML file:
<ClinicalData StudyOID="Study.OID" MetaDataVersionOID="MetaDataVersion.OID.1">
<SubjectData SubjectKey="S001P011" TransactionType="Insert">
<StudyEventData StudyEventOID="StudyEventDefs.OID.6.AdverseEvent"
StudyEventRepeatKey="1">
<FormData FormOID="FormDefs.OID.AE" FormRepeatKey="1">
<ItemGroupData ItemGroupOID="ItemGroupDefs.OID.AE"
ItemGroupRepeatKey="1">
<ItemData ItemOID="ID.TAREA" Value="ONC" />
<ItemData ItemOID="ID.PNO" Value="143-02" />
<ItemData ItemOID="ID.SCTRY" Value="USA" />
<ItemData ItemOID="ID.F_STATUS" Value="V" />
<ItemData ItemOID="ID.LINE_NO" Value="1" />
<ItemData ItemOID="ID.AETERM" Value="HEADACHE" />
<ItemData ItemOID="ID.AESTMON" Value="06" />
<ItemData ItemOID="ID.AESTDAY" Value="10" />
<ItemData ItemOID="ID.AESTYR" Value="1999" />
<ItemData ItemOID="ID.AESTDT" Value="1999-06-10" />
<ItemData ItemOID="ID.AEENMON" Value="06" />
<ItemData ItemOID="ID.AEENDAY" Value="14" />
<ItemData ItemOID="ID.AEENYR" Value="1999" />
<ItemData ItemOID="ID.AEENDT" Value="1999-06-14" />
<ItemData ItemOID="ID.AESEV" Value="1" />
<ItemData ItemOID="ID.AEREL" Value="0" />
<ItemData ItemOID="ID.AEOUT" Value="1" />
<ItemData ItemOID="ID.AEACTTRT" Value="0" />
<ItemData ItemOID="ID.AECONTRT" Value="1" />
</ItemGroupData>
[Figure: AE SAS Data Set (Unformatted) Created by the odm_extractdomaindata Macro]
[Figure: AE SAS Data Set (Formatted) Created by the odm_extractdomaindata Macro]
The odm_extractdomaindata
macro has this signature:
%macro odm_extractdomaindata(
_cstSourceMetadata=,
_cstSourceData=,
_cstIsReferenceData=No,
_cstSelectAttribute=Name,
_cstSelectAttributeValue=,
_cstLang=en,
_cstMaxLabelLength=256,
_cstAttachFormats=Yes,
_cstODMMinimumKeyset=No,
_cstOutputLibrary=,
_cstOutputDS=
);
Here are the parameters:
-
_cstSourceMetadata and _cstSourceData
contain the SAS librefs for the SAS representation of the ODM metadata and the ODM data, respectively.
If a libref is not specified,
the macro looks for type=sourcedata in SASReferences. If that is not
provided, the source data sets are assumed to be in the SAS Work library.
-
_cstIsReferenceData indicates whether
the data to extract is reference data or clinical data. Examples of
reference data are laboratory reference ranges or trial design data.
-
_cstSelectAttribute contains the
ItemGroup attribute that identifies which ItemGroup to extract. Valid
values are OID, Name, SASDatasetName, and Domain.
-
_cstSelectAttributeValue contains
the value of the attribute defined by _cstSelectAttribute that identifies
the ItemGroup to extract.
-
_cstLang specifies a language identifier
for the language tag attribute (xml:lang) in the ODM TranslatedText
elements.
-
_cstMaxLabelLength determines the
maximum length of the labels to be created.
If this is not provided,
256 is assumed.
-
_cstAttachFormats determines whether
formats are attached to the data set variables. Formats are attached
when the value is ‘Yes’. If this is not provided, ‘Yes’
is assumed.
-
_cstODMMinimumKeyset determines
the creation of data set keys. If this is not provided, ‘No’
is assumed.
-
_cstOutputLibrary defines the SAS
library where the extracted data sets are written.
If this is not specified,
the macro looks for type=targetdata in SASReferences. If this is not
provided, the data sets are written to the SAS Work library.
-
_cstOutputDS contains the name
of the extracted data set.
If this is an invalid
SAS data set name, an error is generated. If the data set name is
not provided, the macro looks for type=targetdata in SASReferences.
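As an illustration of how these parameters fit together, here is a minimal sketch of a single call that extracts the AE ItemGroup shown earlier. The librefs srcmeta and srcdata and their paths are placeholders, not names defined by the macro. The sketch assumes that the odm_read macro has already populated both libraries and that the SAS Clinical Standards Toolkit macros are available in the session.

```sas
/* Hypothetical librefs; point them at the output of the odm_read macro. */
libname srcmeta "/path/to/odm/metadata";   /* placeholder path */
libname srcdata "/path/to/odm/data";       /* placeholder path */

/* Select the ItemGroup by its SASDatasetName attribute (AE)  */
/* and write the extracted data set to WORK.AE.               */
%odm_extractdomaindata(
  _cstSourceMetadata=srcmeta,
  _cstSourceData=srcdata,
  _cstIsReferenceData=No,
  _cstSelectAttribute=SASDatasetName,
  _cstSelectAttributeValue=AE,
  _cstLang=en,
  _cstMaxLabelLength=256,
  _cstAttachFormats=Yes,
  _cstODMMinimumKeyset=No,
  _cstOutputLibrary=work,
  _cstOutputDS=AE
);
```

Selecting by OID instead (_cstSelectAttribute=OID, _cstSelectAttributeValue=ItemGroupDefs.OID.AE) would identify the same ItemGroup in the sample metadata.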
Two sample driver programs
for ODM version 1.3.0 are provided by SAS to demonstrate the use of
the odm_extractdomaindata macro:
sample study library directory/cdisc-odm-1.3.0-1.5/programs/extract_domaindata_all.sas
sample study library directory/cdisc-odm-1.3.0-1.5/programs/extract_domaindata.sas
Two sample driver programs
for ODM version 1.3.1 are provided by SAS to demonstrate the use of
the odm_extractdomaindata macro:
sample study library directory/cdisc-odm-1.3.1-1.5/programs/extract_domaindata_all.sas
sample study library directory/cdisc-odm-1.3.1-1.5/programs/extract_domaindata.sas
The extract_domaindata_all.sas
sample driver programs demonstrate how all data sets can be extracted
at once. The following shows a code fragment:
filename incCode CATALOG "work._cstCode.domains.source" lrecl=255;
data _null_;
set srcdata.itemgroupdefs(keep=OID Name IsReferenceData SASDatasetName Domain);
file incCode;
length macrocall $400 _cstOutputName $100;
_cstOutputName=SASDatasetName;
* If we have to use the Name, only use letters and digits;
if missing(_cstOutputName) then _cstOutputName=cats(compress(Name, , 'adk'));
* If first character a digit, prepend an underscore;
if anydigit(_cstOutputName)=1 then _cstOutputName=cats('_', _cstOutputName);
* Cut long names;
if length(_cstOutputName) > 32 then _cstOutputName=substr(_cstOutputName, 1, 32);
macrocall=cats('%odm_extractdomaindata(_cstSelectAttribute=OID',
', _cstSelectAttributeValue=', OID,
', _cstIsReferenceData=', IsReferenceData,
', _cstMaxLabelLength=256',
', _cstAttachFormats=Yes',
', _cstODMMinimumKeyset=No',
', _cstLang=en',
', _cstOutputDS=', _cstOutputName, ');');
put macrocall;
run;
%include incCode;
filename incCode clear;
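The name-normalization steps in this fragment can be tried in isolation. The following is a minimal sketch that uses a made-up Name value. It relies on the three-argument form of the COMPRESS function, in which the second argument is left empty so that 'adk' is read as modifiers (keep alphabetic characters and digits) rather than as a list of characters to remove.

```sas
/* Sketch: derive a valid data set name from an ItemGroupDef Name. */
/* The Name value below is hypothetical.                           */
data _null_;
  length Name _cstOutputName $100;
  Name = '1st Adverse-Events (AE)';
  /* Keep only letters (a) and digits (d); k means keep, not remove */
  _cstOutputName = compress(Name, , 'adk');
  /* A SAS data set name cannot start with a digit */
  if anydigit(_cstOutputName) = 1 then _cstOutputName = cats('_', _cstOutputName);
  /* SAS data set names are limited to 32 characters */
  if length(_cstOutputName) > 32 then _cstOutputName = substr(_cstOutputName, 1, 32);
  put _cstOutputName=;   /* writes _cstOutputName=_1stAdverseEventsAE to the log */
run;
```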