Using Simple Debugging Techniques

Problem

Occasionally a process flow might run longer than you expect or the data that is produced might not be what you anticipate (either too many records or too few). In such cases, it is important to understand how a process flow works. Then, you can correct errors in the flow or improve its performance.

Solution

A first step in analyzing process flows is being able to access information from SAS that explains what happened during the run. If there were errors, you need to understand what happened before the errors occurred. If you are having performance issues, the logs identify which steps are performing poorly. Finally, knowing which SAS options are set and how they are set can help you diagnose unexpected behavior in your process flows. You can perform the following tasks:

Tasks

Check the Status of a Job

You can see information about the status of your jobs and the nodes that they contain. This status information is provided by the following features:
  • the status indicators and sticky note windows on the nodes on the Diagram tab of the Job Editor window. These features are available before and after you submit a job. Therefore, they are useful as tools that help you construct a job and determine whether it is ready to run.
  • the Status tab on the Details pane of the Job Editor window. This feature displays the status of each node in a job as it is run. You can double-click an error or warning status on a node to display it in the Warnings and Errors tab.
  • the Warnings and Errors tab on the Details pane of the Job Editor window. This feature displays any warnings or errors that are generated as a job is run. You can click the link in an error or warning to see it displayed in the Log tab of the Job Editor window.
For information about using these features, see Reviewing a Successful Job and Diagnosing and Correcting an Unsuccessful Job.

Verify Output From a Transformation

You can view the output tables for the transformations in the job. Reviewing the output tables enables you to verify that each transformation is creating the expected output. This review can be useful when a job is not producing the expected output or when you suspect that something is wrong with a particular transformation in the job. For more information, see Browsing Table Data.
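When you prefer to check an output table from code rather than from the table viewer, a few short steps can confirm its structure, a sample of its rows, and its row count. The following is a minimal sketch; the table name work.transform_out is hypothetical and stands in for the output table of the transformation that you are checking.

```sas
/* work.transform_out is a hypothetical output table name */

/* List the columns, types, and lengths of the output table */
proc contents data=work.transform_out;
run;

/* Print only the first 10 rows as a quick visual check */
proc print data=work.transform_out(obs=10);
run;

/* Count the rows to see whether there are too many or too few */
proc sql;
   select count(*) as row_count
   from work.transform_out;
quit;
```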

Limit Input to a Transformation

When you are debugging and working with large data files, you might find it useful to limit the amount of data that flows into one or more steps. One way of doing this is to use the OBS= data set option on input tables of DATA steps and procedures.
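As an illustration of the OBS= data set option, the following sketch limits the input of one DATA step and one procedure to the first 1000 observations. The table names and the value 1000 are hypothetical. Because the option is attached to a specific input table, it affects only that step.

```sas
/* work.big_input and work.sample_out are hypothetical table names */

/* Read only the first 1000 observations into the DATA step */
data work.sample_out;
   set work.big_input(obs=1000);
run;

/* Summarize only the first 1000 observations of the same table */
proc means data=work.big_input(obs=1000);
run;
```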
To specify the OBS= system option for an entire job in SAS Data Integration Studio, add the following code to the Precode and Postcode tab in the job's property window:
options obs=<number>;
To specify the OBS= system option for a transformation within a job, you can temporarily add the option to the System options field on the Options tab in the transformation's property window. Alternatively, you can edit the code that is generated for the transformation and execute the edited code. For more information about this method, see Specifying Options for Jobs.
Important considerations when you are using the OBS= system option include the following:
  • All inputs into all subsequent steps are limited to the specified number, until the option is reset.
  • Setting the number too low before a join or merge step can result in few or no matches, depending on the data.
  • In the SAS Data Integration Studio Job Editor, this option stays in effect for all runs of the job until it is reset or the Job Editor window is closed.
The syntax for resetting the option is as follows:
options obs=MAX;
Note: Removing the OBS= line of code from the Job Editor does not reset the OBS= system option. You must reset it as shown or by closing the Job Editor window.
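Because the OBS= system option persists until it is reset, it can help to write the current setting to the log before and after a debugging run. The following is a minimal sketch that uses the GETOPTION function for that purpose. The value 500 is an arbitrary example, and the placement of each statement (for example, in precode and postcode) is an assumption about how you organize the job.

```sas
/* Limit every subsequent input to 500 observations (arbitrary example value) */
options obs=500;

/* Write the setting that is currently in effect to the log */
%put NOTE: Current setting is %sysfunc(getoption(obs, keyword));

/* ... the steps that you are debugging run here ... */

/* Reset the option so that later runs read all observations */
options obs=max;

/* Confirm in the log that the limit is no longer in effect */
%put NOTE: Current setting is %sysfunc(getoption(obs, keyword));
```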
The Max Input Rows option enables you to specify the number of input rows to an SQL query within the Designer window of the SQL join transformation. To access this option, click SQL Join in the Navigate pane of the window. Then, look for the option in the SQL Join Properties pane. You can also specify the number of output rows with the Max Output Rows option.
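In hand-written PROC SQL code, the comparable controls are the INOBS= and OUTOBS= procedure options. Whether the code that is generated for the SQL Join transformation uses these same options is an assumption here, not something that this topic states. The table and column names in the following sketch are hypothetical.

```sas
/* work.customers, work.orders, and their columns are hypothetical */

/* INOBS= limits the rows read from each source table;  */
/* OUTOBS= limits the rows written to the result        */
proc sql inobs=100 outobs=50;
   create table work.join_sample as
   select a.customer_id, b.order_total
   from work.customers as a
        inner join work.orders as b
        on a.customer_id = b.customer_id;
quit;
```

As with the OBS= system option, a low input limit before a join can produce few or no matches, depending on the data.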

Add Debugging Code to a Process Flow

If you are analyzing a SAS Data Integration Studio job, and the information that is provided by logging options and status codes is not enough, consider the following methods for adding debugging code to the process flow.
Methods for Adding Custom Debugging Code
  • Replace the generated code for a transformation with user-written code.
  • Add the User-Written Code transformation to the process flow.
  • Add a generated transformation to the process flow.
  • Add a return code to the process flow.
Custom code can direct information to the log or to alternate destinations such as external files or tables. Possible uses include testing frequency counts, writing out SAS macro variable settings, and listing the run-time values of system options.
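The uses listed above can be sketched with a few standard SAS statements. The table name work.transform_out and the column customer_type are hypothetical. Where you place this code (for example, in a User-Written Code transformation or in postcode) depends on which method you choose.

```sas
/* work.transform_out and customer_type are hypothetical names */

/* Frequency count on one column, with missing values included */
proc freq data=work.transform_out;
   tables customer_type / missing;
run;

/* Write all user-defined macro variables and their values to the log */
%put _user_;

/* Write the run-time value of a single system option to the log */
proc options option=obs;
run;
```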

Set SAS Invocation Options on Jobs

When you submit a SAS Data Integration Studio job for execution, it is submitted to a SAS Workspace Server component of the relevant SAS Application Server. The relevant SAS Application Server is one of the following:
  • the default server that is specified on the SAS Server tab in the Options window
  • the SAS Application Server to which a job is deployed
To set SAS invocation options for all SAS Data Integration Studio jobs that are executed by a particular SAS server, specify the options in the configuration files for the relevant SAS Workspace Servers, batch or scheduling servers, and grid servers. (You do not set these options on SAS Metadata Servers or SAS Stored Process Servers.) Examples of these options include UTILLOC, NOWORKINIT, and ETLS_DEBUG. For more information, see Modifying Configuration Files or SAS Start Commands for Application Servers.
To set SAS global options for a particular job or transformation within a job, you can add these options to the Precode and Postcode tab in the properties window. For more information about adding code to this window, see Specifying Options for Jobs.
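For example, precode for a job that you are debugging might turn on more detailed log output, and postcode might turn it off again. The following is a minimal sketch. The particular options shown are standard SAS system options chosen for illustration, not options that this topic recommends.

```sas
/* Precode: request more detail in the log while debugging            */
/*   FULLSTIMER  - detailed resource statistics for each step         */
/*   MSGLEVEL=I  - additional informational notes                     */
/*   MPRINT      - show the SAS code that macros generate             */
/*   SYMBOLGEN   - show how macro variable references resolve         */
options fullstimer msglevel=i mprint symbolgen;

/* Postcode: turn the extra log detail off when the run is finished */
options nofullstimer msglevel=n nomprint nosymbolgen;
```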
The property window for most transformations within a job has an Options tab with a System Options field. Use the System Options field to specify options for a particular transformation in a job's process flow. For more information, see Specifying Options for a Transformation.
For more information about SAS options, search for relevant phrases such as “system options” and “invoking SAS” in SAS OnlineDoc.

Set and Check Status Codes

When you execute a job in SAS Data Integration Studio, a return code for each transformation in the job is captured in a macro variable. The return code for the job is set according to the least successful transformation in the job. SAS Data Integration Studio enables you to associate a return code condition, such as Successful, with an action, such as Send Email or Abort. In this way, users can specify how a return code is handled for the job or transformation.
For example, you could specify that a transformation in a process flow will terminate based on conditions that you define. The log can be defined to display only the transformations that affect the problem being investigated, making the log more manageable and eliminating inconsequential error messages. For more information about status code handling for transformations, see Perform Actions Based on the Status of a Transformation.
You should also remember that the status code information is supplemented by the job and node status information in the Job Editor window, particularly the Status tab and Warnings and Errors tab in the Details pane. For more information, see Check the Status of a Job.
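If you add your own status checks in user-written code, one approach is to test the automatic macro variable SYSERR after a step and to write a message to the log. The following is a minimal sketch under stated assumptions: the macro check_step and the table names are hypothetical, and the code uses the standard SAS automatic macro variables SYSERR and SYSCC rather than the macro variables that SAS Data Integration Studio generates, whose names this topic does not give.

```sas
/* check_step is a hypothetical helper macro.                        */
/* SYSERR holds the return code of the most recent DATA or PROC      */
/* step: 0 means success, 4 means warnings, and other values mean    */
/* that the step was canceled or ended with an error.                */
%macro check_step(step_name);
   %if &syserr = 0 %then
      %put NOTE: Step &step_name completed without warnings or errors.;
   %else %if &syserr = 4 %then
      %put WARNING: Step &step_name completed with warnings (SYSERR=&syserr).;
   %else
      %put ERROR: Step &step_name did not complete normally (SYSERR=&syserr).;
%mend check_step;

/* Hypothetical step followed immediately by the check */
data work.step1_out;
   set work.step1_in;
run;
%check_step(step1)

/* SYSCC holds the overall condition code for the session so far */
%put NOTE: Overall condition code is SYSCC=&syscc.;
```

Call the macro immediately after the step that you want to check, because SYSERR is reset at each step boundary.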