Special Topic: Debugging a Validation Process

Overview

The SAS Clinical Standards Toolkit provides two properties, implemented as global macro variables, for debugging problems in any process: _cstDebug and _cstDebugOptions.
The _cstDebug global macro variable toggles debugging options on and off. Many SAS Clinical Standards Toolkit code modules have conditional branching such as:
%if &_cstDebug %then
%do;
    /* perform some action */
%end;
If debugging is toggled on (_cstDebug=1), several things can happen.
  • If the process includes code like this excerpt from the sample driver program (validate_data.sas for SDTM 3.1.3), documented in Running a Validation Process, additional messaging to the SAS log is enabled.
    %let _cstDebug=0;
    data _null_;
      _cstDebug = input(symget('_cstDebug'),8.);
      if _cstDebug then
        call execute("options &_cstDebugOptions;");
      else
        call execute(("%sysfunc(tranwrd(options %cmpres(&_cstDebugOptions), 
                      %str( ), %str( no)));"));
    run;
    By default, the &_cstDebugOptions global macro variable is set to:
    mprint mlogic symbolgen mautolocdisplay
    These SAS system options generate a lot of information, and they quickly fill the SAS log when running interactively. To increase the default log size permitted, use the DMSLOGSIZE= system option. You might consider running the process in batch or using PROC PRINTTO to redirect the SAS log to a file.
  • Many Work files created during the process are not deleted. They remain available in the Work library to help with debugging.
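Because the debugging options described above can flood an interactive log, one option is to route the log to a file while the process runs. The following is a minimal sketch and is not taken from the toolkit's sample drivers: the fileref dbglog and the file name cst_debug.log are hypothetical, and the file is written to the Work directory only for convenience.

```sas
/* Hypothetical fileref and file name; adjust the location for your site. */
filename dbglog "%sysfunc(pathname(work))/cst_debug.log";

/* Send all subsequent log output to the file, replacing earlier content. */
proc printto log=dbglog new;
run;

/* ... submit the validation driver program here, with _cstDebug   */
/* set to 1 in the driver so that the debugging options are active ... */

/* Restore the default log destination. */
proc printto;
run;
```

After the final PROC PRINTTO step, the log returns to its default destination and the captured file can be opened in any text editor.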
Each SAS Clinical Standards Toolkit process consists of two primary tasks. The first task is to use setup routines to establish the SAS Clinical Standards Toolkit environment. The second task is to perform some primary SAS Clinical Standards Toolkit action. Your debugging focus is different for these two tasks.

Errors in Setting Up the SAS Clinical Standards Toolkit Environment

In the SAS Clinical Standards Toolkit environment setup, errors most often occur because of problems with the SASReferences data set. For recommendations on configuring the SASReferences data set appropriately, see Building a SASReferences File.
The following table lists common setup errors and possible causes:
Debugging Process Setup Errors

Expected libraries are not allocated.
    Location Where Error Is Reported: SAS Log, Libraries window, SAS DMS
    Possible Cause and Corrective Action:
    (1) An invalid name for the libref has been used.
      • Is the libref a valid SAS name? A libref can contain one to eight characters. It must start with a letter or an underscore (_), not a number. Subsequent characters must be letters, numbers, or underscores. Blanks cannot appear in SAS names.
      • Is the libref a reserved SAS libref name? You should not use Work, Sasuser, or Sashelp.
    (2) The path specified for the libref is invalid; it points to a nonexistent directory. Check the path in your SASReferences data set.

Error: SAS system library WORK cannot be reassigned.
    Location Where Error Is Reported: SAS Log
    Possible Cause and Corrective Action: Work is being used as a sasref value with or without a path being designated. A similar error occurs if Sasuser or Sashelp is used.

WARNING: One or more libraries specified in the concatenated library CSTTMP do not exist.
    Location Where Error Is Reported: SAS Log
    Possible Cause and Corrective Action: One of the paths specified for a libref is invalid; it points to a nonexistent directory.

Warning: Process ending prematurely for CST0090-there were problems with the SASReferences data set.
    Location Where Error Is Reported: SAS Log
    Possible Cause and Corrective Action: There is a problem with the SASReferences data set being used. Check for these potential problems:
      • The SASReferences data set does not exist.
      • The SASReferences data set exists but it is empty.
      • The structure of the SASReferences data set is incorrect. For example, it might have an extra column that is not required or an expected column that is missing.
      • A column type might be incorrect. For example, the Order column might be character instead of numeric.
      • An invalid TYPE or SUBTYPE or combination is used in the SASReferences data set. Valid TYPE and SUBTYPE values are provided in the Standardlookup data set found in <global standards library directory>/metadata.
      • A TYPE value is missing.
      • A SASREF value is missing or invalid.
      • A REFTYPE value is missing or is not equal to libref or fileref (case insensitive).

Error: Physical file does not exist.
    Location Where Error Is Reported: SAS Log
    Possible Cause and Corrective Action:
    (1) The SASReferences data set references a file that does not exist.
    (2) The filename is not a valid SAS name.

WARNING: Apparent invocation of macro SDTM_VALIDATE not resolved.
    Location Where Error Is Reported: SAS Log
    Possible Cause and Corrective Action:
    (1) The macro is misnamed or has not been added to the expected autocall library.
      • Does the macros folder for this standard exist in the cstGlobalLibrary, in the !sasroot hierarchy, or in some correctly designated custom location?
    (2) The expected autocall path was not created correctly in the call to %CSTUTIL_ALLOCATESASREFERENCES.
      • Check that the SASReferences data set contains a type=autocall record, defined as a fileref, and points to the correct folder location.
      • Check for an error occurring earlier in the SAS log suggesting that cstutil_allocatesasreferences failed before setting the autocall path.
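Several of the SASReferences problems listed above can be screened for before a process is run. The following is a rough sketch rather than toolkit code: the data set name work.sasreferences, the sample rows, and the output data set work.sasref_problems are hypothetical, and the path check assumes that the path column already holds a fully resolved location (no unresolved macro variable references).

```sas
/* Hypothetical sample input. In practice, skip this step and point the
   SET statement below at your own SASReferences data set. */
data work.sasreferences;
  length type $40 sasref $8 reftype $8 path $200;
  infile datalines dsd truncover;
  input type sasref reftype path;
  datalines;
sourcedata,srcdata,libref,/no/such/directory
results,work,libref,
autocall,,fileref,
;
run;

/* Write one record for each problem that is found. */
data work.sasref_problems;
  set work.sasreferences;
  length problem $60;

  if missing(type) then do;
    problem = 'TYPE value is missing'; output;
  end;

  if lowcase(reftype) not in ('libref', 'fileref') then do;
    problem = 'REFTYPE is not libref or fileref'; output;
  end;

  if missing(sasref) then do;
    problem = 'SASREF value is missing'; output;
  end;
  else if lowcase(reftype) = 'libref'
      and upcase(sasref) in ('WORK', 'SASUSER', 'SASHELP') then do;
    problem = 'SASREF is a reserved libref'; output;
  end;
  else if length(sasref) > 8 or not nvalid(strip(sasref), 'v7') then do;
    problem = 'SASREF is not a valid SAS name'; output;
  end;

  /* Assumes PATH is fully resolved; FILEEXIST also accepts directories. */
  if not missing(path) and not fileexist(strip(path)) then do;
    problem = 'PATH does not exist'; output;
  end;
run;

proc print data=work.sasref_problems;
  var type sasref reftype path problem;
run;
```

An empty work.sasref_problems data set does not guarantee a valid SASReferences data set. Structural problems, such as missing columns or incorrect column types, and invalid TYPE and SUBTYPE combinations are not covered by this sketch.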

Errors in Performing Some Primary SAS Clinical Standards Toolkit Action

If the task to perform the primary SAS Clinical Standards Toolkit action begins (that is, the standard-specific validation macro, such as %SDTM_VALIDATE or %CRTDDS_VALIDATE, is found and begins processing), then setup has completed successfully. The remaining process failures are likely because of problems with the various validation components.
Most errors that halt a validation process are reported in the Results data set. As a general rule, these Results data set fields signal process failures and provide information about the cause of the failure:
  • the Process status field (_cst_rc), when it is set to a nonzero value
  • the Problem detected field (resultflag), when it is set to -1
  • the Source Data field (srcdata), which identifies the macro reporting the problem
  • the Resolved Message text field (message), which provides a problem cause
  • the Basis for Result field (resultdetails), which can provide additional information pertinent to the problem
Depending on the severity of the problem and when it occurs, the Results data set might not be saved to the persisted location that was requested using a type=results record in the SASReferences data set. In this case, consult the Results data set defined by the &_cstResultsDS global macro variable for this information. By default, &_cstResultsDS is set to work._cstresults.
Generally, the SAS Clinical Standards Toolkit does not halt the validation process when an error is detected in a specific check. The error is noted in the Results data set, the resultflag value for that check is set to -1, _cst_rc is set to 0, and processing continues with the next check. A validation process is most likely to be halted (by setting _cst_rc to 1) when there is a significant metadata error that suggests subsequent checks would likely fail to run.
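The fields described above can be used to filter the Results data set down to the records that matter when debugging. The following is a minimal sketch: the macro name show_cst_failures is hypothetical, and the default data set is the work._cstresults location mentioned earlier. Pass a different data set name if your results were persisted elsewhere.

```sas
/* Hypothetical helper macro; not part of the toolkit. */
%macro show_cst_failures(ds=work._cstresults);
  %if not %sysfunc(exist(&ds)) %then %do;
    %put NOTE: &ds was not found, so there is nothing to report.;
    %return;
  %end;

  title "Records that signal a process or check failure";
  proc print data=&ds noobs;
    /* A nonzero _cst_rc or a resultflag of -1 marks a failure. */
    where _cst_rc ne 0 or resultflag = -1;
    var resultid srcdata _cst_rc resultflag message resultdetails;
  run;
  title;
%mend show_cst_failures;

%show_cst_failures()
```

Records with resultflag set to -1 and _cst_rc set to 0 point to individual checks that failed to run, whereas a nonzero _cst_rc points to the condition that halted the process.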
The following table lists common causes for premature process failure or the failure of specific checks to run:
Debugging Validation Process Errors

No tables evaluated-check validation control data set.
    Resultid in Results Data Set: CST0002
    Possible Cause or Corrective Action: No tables interpreted from the tablescope value could be found in the work._csttablemetadata data set.

<Data set> could not be found
    Resultid in Results Data Set: CST0003
    Possible Cause or Corrective Action: This error usually indicates that a specific source column or data set could not be found. The code loops through a set of domains or columns built from the source metadata data sets. This error might result when the source metadata does not accurately reflect the source data.

No columns evaluated-check Validation Control specification.
    Resultid in Results Data Set: CST0004
    Possible Cause or Corrective Action: No columns interpreted from the columnscope value could be found in the work._cstcolumnmetadata data set. The SAS Clinical Standards Toolkit looks at the union of both tablescope and columnscope to build work._cstcolumnmetadata. The specified column might exist in a domain, but not in any domain specified in tablescope.

Lookup to SASReferences control data set failed.
    Resultid in Results Data Set: CST0006
    Possible Cause or Corrective Action: The SAS Clinical Standards Toolkit code has a call to the %CSTUTIL_GETSASREFERENCE utility macro for a type or type and subtype combination that cannot be found in the SASReferences data set. This indicates that SASReferences has been incompletely defined for the SAS Clinical Standards Toolkit validation process.

Validation Control parsing of tablescope/column results in inconsistent sublist lengths.
    Resultid in Results Data Set: CST0023
    Possible Cause or Corrective Action: This check involves a comparison of tables or columns, as indicated by multiple sets of brackets in tablescope or columnscope. Each set of brackets constitutes a sublist. However, the number of items in the specified sublist is inconsistent or unexpected by the check macro. Options typically include a more accurate specification of sublist items, either using explicit table or column names or more restrictive tablescope syntax (that is, removing the domain causing the inconsistency using minus sign (-) syntax, such as _ALL_-DM).

One or more check metadata column values is invalid.
    Resultid in Results Data Set: CST0026
    Possible Cause or Corrective Action: A value in the Validation Control data set for the check being run is invalid in the context of the specific check macro. Examples include conditions that are required by the check macro but are not found, such as no code logic found, an unexpected usesourcemetadata value, or no lookuptype or lookupsource for valid value assessments.

Code failed due to SAS error-see log.
    Resultid in Results Data Set: CST0050
    Possible Cause or Corrective Action: A SAS DATA step or SAS procedure failed and the cause is reported in the SAS log. This most commonly occurs because of missing data sets, missing columns, incorrectly sorted data sets, and unexpected macro variable values.

<Message lookup failed to find matching record>
    Resultid in Results Data Set: <varies>
    Possible Cause or Corrective Action: The check macro code generates a resultid value that does not find a match in the Messages data set. Either the wrong resultid has been specified, or the standard-specific Messages data set has not been updated to include the resultid.

Other Debugging Tips

Here are some debugging tips that you might find useful:
  • Review available Work files for information about the errors (for example, _cstresults, _csttablemetadata, and _cstcolumnmetadata). These files might remain in the Work library after a process by default. Setting the _cstDebug global macro variable to 1 ensures that the Work files remain after the process ends.
  • When debugging, avoid setting the parameter flags in cstutil_cleanupcstsession to 1 (if that cleanup macro is called).
    %cstutil_cleanupcstsession(_cstClearCompiledMacros=0,
    _cstClearLibRefs=0, _cstResetSASAutos=0, _cstResetFmtSearch=0,
    _cstResetSASOptions=0,_cstDeleteFiles=0,_cstDeleteGlobalMacroVars=0);
    
  • Use work._cstcolumnmetadata and work._csttablemetadata to resolve missing domain and column issues. These data sets can also be used to resolve sublist length differences for checks using sublist syntax [] in tablescope and columnscope.
  • Use the resultid code (for example, CST0003) in the Results data set to search the check macro code module used for a specific check for information about the error. The name of the macro code module is set in the Validation Control codesource field.
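As an illustration of using work._cstcolumnmetadata to track down missing column issues, the following sketch compares the column metadata against the data sets that are actually allocated in the session. It rests on several assumptions: that work._cstcolumnmetadata contains the columns sasref, table, and column, that the librefs named in sasref are still allocated (for example, because _cstClearLibRefs=0 was used), and that the macro name find_missing_columns and the output data set work.missing_columns are hypothetical names chosen for this example.

```sas
/* Hypothetical helper macro; not part of the toolkit. */
%macro find_missing_columns(meta=work._cstcolumnmetadata);
  %if not %sysfunc(exist(&meta)) %then %do;
    %put NOTE: &meta was not found. Run with _cstDebug=1 to keep Work files.;
    %return;
  %end;

  proc sql;
    /* Keep the metadata rows that have no matching physical column. */
    create table work.missing_columns as
    select m.sasref, m.table, m.column
    from &meta as m
    left join dictionary.columns as d
      on  d.libname = upcase(m.sasref)
      and d.memname = upcase(m.table)
      and upcase(d.name) = upcase(m.column)
    where d.name is missing;
  quit;
%mend find_missing_columns;

%find_missing_columns()
```

Rows in work.missing_columns are candidates for the CST0003 condition, where the source metadata does not accurately reflect the source data.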