Previous Page | Next Page

Reading, Combining, and Modifying SAS Data Sets

Error Checking When Using Indexes to Randomly Access or Update Data


The Importance of Error Checking

When reading observations with the SET statement and KEY= option or with the MODIFY statement, error checking is imperative for several reasons. The most important reason is that these tools use nonsequential access methods, and so there is no guarantee that an observation will be located that satisfies the request. Error checking enables you to direct execution to specific code paths, depending on the outcome of the I/O operation. Your program will continue execution for expected conditions and terminate execution when unexpected results occur.


Error-Checking Tools

Two tools have been created to make error checking easier when you use the MODIFY statement or the SET statement with the KEY= option to process SAS data sets:

_IORC_ is created automatically when you use the MODIFY statement or the SET statement with KEY=. The value of _IORC_ is a numeric return code that indicates the status of the I/O operation from the most recently executed MODIFY or SET statement with KEY=. Checking the value of this variable enables you to detect abnormal I/O conditions and to direct execution down specific code paths instead of having the application terminate abnormally. For example, if the KEY= variable value does match between two observations, you might want to combine them and output an observation. If they don't match, however, you might want only to write a note to the log.

Because the values of the _IORC_ automatic variable are internal and subject to change, the SYSRC macro was created to enable you to test for specific I/O conditions while protecting your code from future changes in _IORC_ values. When you use SYSRC, you can check the value of _IORC_ by specifying one of the mnemonics listed in the following table.

Most Common Mnemonic Values of _IORC_ for DATA Step Processing
Mnemonic Value Meaning of Return Code When Return Code Occurs
_DSENMR The TRANSACTION data set observation does not exist in the MASTER data set. MODIFY with BY is used and no match occurs.
_DSEMTR Multiple TRANSACTION data set observations with the same BY variable value do not exist in the MASTER data set. MODIFY with BY is used and consecutive observations with the same BY values do not find a match in the first data set. In this situation, the first observation that fails to find a match returns _DSENMR. The subsequent observations return _DSEMTR.
_DSENOM No matching observation was found in the MASTER data set. SET or MODIFY with KEY= finds no match.
_SENOCHN The output operation was unsuccessful. the KEY= option in a MODIFY statement contains duplicate values.
_SOK The I/O operation was successful. a match is found.


Example 1: Routing Execution When an Unexpected Condition Occurs


Overview

This example shows how to prevent an unexpected condition from terminating the DATA step. The goal is to update a master data set with new information from a transaction data set. This application assumes that there are no duplicate values for the common variable in either data set.

Note:   This program works as expected only if the master and transaction data sets contain no consecutive observations with the same value for the common variable. For an explanation of the behavior of MODIFY with KEY= when duplicates exist, see the MODIFY statement in SAS Language Reference: Dictionary.  [cautionend]


Input Data Sets

The TRANSACTION data set contains three observations: two updates to information in MASTER and a new observation about PartNumber value 6 that needs to be added. MASTER is indexed on PartNumber. There are no duplicate values of PartNumber in MASTER or TRANSACTION. The following shows the MASTER and the TRANSACTION input data sets:

          MASTER                            TRANSACTION

OBS   PartNumber   Quantity      OBS    PartNumber    AddQuantity

 1        1          10           1         4           14
 2        2          20           2         6           16
 3        3          30           3         2           12
 4        4          40
 5        5          50


Original Program

The objective is to update the MASTER data set with information from the TRANSACTION data set. The program reads TRANSACTION sequentially. MASTER is read directly, not sequentially, using the MODIFY statement and the KEY= option. Only observations with matching values for PartNumber, which is the KEY= variable, are read from MASTER.

data master;  1  
   set transaction;  2 
   modify master key=PartNumber;  3 
   Quantity = Quantity + AddQuantity;  4 
run;

[1] Open the MASTER data set for update.

[2] Read an observation from the TRANSACTION data set.

[3] Match observations from the MASTER data set based on the values of PartNumber.

[4] Update the information on Quantity by adding the new values from the TRANSACTION data set.


Resulting Log

This program has correctly updated one observation but it stopped when it could not find a match for PartNumber value 6. The following lines are written to the SAS log:

     ERROR: No matching observation was found in MASTER data set.
     PartNumber=6 AddQuantity=16 Quantity=70 _ERROR_=1 
     _IORC_=1230015 _N_=2
     NOTE: The SAS System stopped processing this step because 
           of errors.
     NOTE: The data set WORK.MASTER has been updated.  There were
           1 observations rewritten, 0 observations added and 0 
           observations deleted. 


Resulting Data Set

The MASTER file was incorrectly updated. The updated master has five observations. One observation was updated correctly, a new one was not added, and a second update was not made. The following shows the incorrectly updated MASTER data set:

          MASTER                                 

OBS    PartNumber    Quantity          
 1         1            10 
 2         2            20
 3         3            30
 4         4            54
 5         5            50


Revised Program

The objective is to apply two updates and one addition to MASTER, preventing the DATA step from stopping when it does not find a match in MASTER for the PartNumber value 6 in TRANSACTION. By adding error checking, this DATA step is allowed to complete normally and produce a correctly revised version of MASTER. This program uses the _IORC_ automatic variable and the SYSRC autocall macro in a SELECT group to check the value of the _IORC_ variable and execute the appropriate code based on whether a match is found.

data master; 1 
   set transaction;  2 
   modify master key=PartNumber;  3 
select(_iorc_);  4 
      when(%sysrc(_sok)) do;
         Quantity = Quantity + AddQuantity;
         replace;
      end;
      when(%sysrc(_dsenom)) do;
         Quantity = AddQuantity;
         _error_ = 0;
         output;
      end;
      otherwise do;
         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
         put 'Program terminating. DATA step iteration # ' _n_;
         put _all_;
         stop;
      end;
   end;
run;

[1] Open the MASTER data set for update.

[2] Read an observation from the TRANSACTION data set.

[3] Match observations from the MASTER data set based on the value of PartNumber.

[4] Take the correct course of action based on whether a matching value for PartNumber is found in MASTER. Update Quantity by adding the new values from TRANSACTION. The SELECT group directs execution to the correct code. When a match occurs (_SOK), update Quantity and replace the original observation in MASTER. When there is no match (_DSENOM), set Quantity equal to the AddQuantity amount from TRANSACTION, and append a new observation. _ERROR_ is reset to 0 to prevent an error condition that would write the contents of the program data vector to the SAS log. When an unexpected condition occurs, write messages and the contents of the program data vector to the log, and stop the DATA step.


Resulting Log

The DATA step executed without error and observations were appropriately updated and added. The following lines are written to the SAS log:

     NOTE: The data set WORK.MASTER has been updated.  There were 
           2 observations rewritten, 1 observations added and 0 
           observations deleted.  


Correctly Updated MASTER Data Set

MASTER contains updated quantities for PartNumber values 2 and 4 and a new observation for PartNumber value 6. The following shows the correctly updated MASTER data set:

          MASTER                                 

OBS    PartNumber    Quantity          
 1         1            10 
 2         2            32
 3         3            30
 4         4            54
 5         5            50
 6         6            16 


Example 2: Using Error Checking on All Statements That Use KEY=


Overview

This example shows how important it is to use error checking on all statements that use the KEY= option when reading data.


Input Data Sets

The MASTER and DESCRIPTION data sets are both indexed on PartNumber. The ORDER data set contains values for all parts in a single order. Only ORDER contains the PartNumber value 8. The following shows the MASTER, ORDER, and DESCRIPTION input data sets:

           MASTER                                ORDER

  OBS    PartNumber    Quantity             OBS    PartNumber

   1         1            10                 1         2
   2         2            20                 2         4
   3         3            30                 3         1
   4         4            40                 4         3
   5         5            50                 5         8
                                             6         5
                                             7         6
              DESCRIPTION

  OBS    PartNumber    PartDescription

   1         4            Nuts
   2         3            Bolts
   3         2            Screws
   4         6            Washers


Original Program with Logic Error

The objective is to create a data set that contains the description and number in stock for each part in a single order, except for the parts that are not found in either of the two input data sets, MASTER and DESCRIPTION. A transaction data set contains the part numbers of all parts in a single order. One data set is read to retrieve the description of the part and another is read to retrieve the quantity that is in stock.

The program reads the ORDER data set sequentially and then uses SET with the KEY= option to read the MASTER and DESCRIPTION data sets directly, based on the key value of PartNumber. When a match occurs, an observation is written that contains all the necessary information for each value of PartNumber in ORDER. This first attempt at a solution uses error checking for only one of the two SET statements that use KEY= to read a data set.

data combine; 1 
   length PartDescription $ 15; 
   set order;  2 
   set description key=PartNumber;  2 
   set master key=PartNumber;  2 
   select(_iorc_); 3 
         when(%sysrc(_sok)) do;
            output;
         end;
         when(%sysrc(_dsenom)) do;
            PartDescription = 'No description';
            _error_ = 0;
            output;
         end;
         otherwise do;
            put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
            put 'Program terminating.';
            put _all_;
            stop;
         end;
   end;
run;

[1] Create the COMBINE data set.

[2] Read an observation from the ORDER data set. Read an observation from the DESCRIPTION and the MASTER data sets based on a matching value for PartNumber, the key variable. Note that no error checking occurs after an observation is read from DESCRIPTION.

[3] Take the correct course of action, based on whether a matching value for PartNumber is found in MASTER or DESCRIPTION. (This logic is based on the erroneous assumption that this SELECT group performs error checking for both of the preceding SET statements that contain the KEY= option. It actually performs error checking for only the most recent one.) The SELECT group directs execution to the correct code. When a match occurs (_SOK), the value of PartNumber in the observation that is being read from MASTER matches the current PartNumber value from ORDER. So, output an observation. When there is no match (_DSENOM), no observations in MASTER contain the current value of PartNumber, so set the value of PartDescription appropriately and output an observation. _ERROR_ is reset to 0 to prevent an error condition that would write the contents of the program data vector to the SAS log. When an unexpected condition occurs, write messages and the contents of the program data vector to the log, and stop the DATA step.


Resulting Log

This program creates an output data set but executes with one error. The following lines are written to the SAS log:

     PartNumber=1 PartDescription=Nuts Quantity=10 _ERROR_=1 
     _IORC_=0 _N_=3
     PartNumber=5 PartDescription=No description Quantity=50 
     _ERROR_=1 _IORC_=0 _N_=6
     NOTE: The data set WORK.COMBINE has 7 observations and 3 variables.


Resulting Data Set

The following shows the incorrectly created COMBINE data set. Observation 5 should not be in this data set. PartNumber value 8 does not exist in either MASTER or DESCRIPTION, so no Quantity should be listed for it. Also, observations 3 and 7 contain descriptions from observations 2 and 6, respectively.

                COMBINE              

OBS    PartNumber    PartDescription   Quantity            
 1         2          Screws              20  
 2         4          Nuts                40 
 3         1          Nuts                10 
 4         3          Bolts               30 
 5         8          No description      30 
 6         5          No description      50 
 7         6          No description      50 


Revised Program

To create an accurate output data set, this example performs error checking on both SET statements that use the KEY= option:

data combine(drop=Foundes); 1 
   length PartDescription $ 15;
   set order;  2 
   Foundes = 0;  3 
   set description key=PartNumber;  4  
   select(_iorc_);  5 
      when(%sysrc(_sok)) do;
         Foundes = 1;
      end;
      when(%sysrc(_dsenom)) do;
         PartDescription = 'No description';
         _error_ = 0;
      end;
      otherwise do;
         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
         put 'Program terminating. Data set accessed is DESCRIPTION';
         put _all_;
         _error_ = 0;
         stop;
      end;
   end;
set master key=PartNumber;  6 
select(_iorc_); 7 
      when(%sysrc(_sok)) do;
         output;
      end;
      when(%sysrc(_dsenom)) do;
         if not Foundes then do;
            _error_ = 0;
            put 'WARNING: PartNumber ' PartNumber 'is not in'
                ' DESCRIPTION or MASTER.';
         end;
         else do;
            Quantity = 0;
            _error_ = 0;
            output;
         end;
      end;
      otherwise do;
         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
         put 'Program terminating. Data set accessed is MASTER';
         put _all_;
         _error_ = 0;
         stop;
      end;
   end;      /* ends the SELECT group */ 
run;

[1] Create the COMBINE data set.

[2] Read an observation from the ORDER data set.

[3] Create the variable Foundes so that its value can be used later to indicate when a PartNumber value has a match in the DESCRIPTION data set.

[4] Read an observation from the DESCRIPTION data set, using PartNumber as the key variable.

[5] Take the correct course of action based on whether a matching value for PartNumber is found in DESCRIPTION. The SELECT group directs execution to the correct code based on the value of _IORC_. When a match occurs (_SOK), the value of PartNumber in the observation that is being read from DESCRIPTION matches the current value from ORDER. Foundes is set to 1 to indicate that DESCRIPTION contributed to the current observation. When there is no match (_DSENOM), no observations in DESCRIPTION contain the current value of PartNumber, so the description is set appropriately. _ERROR_ is reset to 0 to prevent an error condition that would write the contents of the program data vector to the SAS log. Any other _IORC_ value indicates that an unexpected condition has been met, so messages are written to the log and the DATA step is stopped.

[6] Read an observation from the MASTER data set, using PartNumber as a key variable.

[7] Take the correct course of action based on whether a matching value for PartNumber is found in MASTER. When a match is found (_SOK) between the current PartNumber value from ORDER and from MASTER, write an observation. When a match isn't found (_DSENOM) in MASTER, test the value of Foundes. If Foundes is not true, then a value was not found in DESCRIPTION either, so write a message to the log but do not write an observation. If Foundes is true, however, the value is in DESCRIPTION but not MASTER. So write an observation but set Quantity to 0. Again, if an unexpected condition occurs, write a message and stop the DATA step.


Resulting Log

The DATA step executed without error. Six observations were correctly created and the following message was written to the log:

     WARNING: PartNumber 8 is not in DESCRIPTION or MASTER.
     NOTE: The data set WORK.COMBINE has 6 observations 
           and 3 variables.


Correctly Created COMBINE Data Set

The following shows the correctly updated COMBINE data set. Note that COMBINE does not contain an observation with the PartNumber value 8. This value does not occur in either MASTER or DESCRIPTION.

                COMBINE            

OBS    PartNumber    PartDescription   Quantity          

 1         2         Screws              20 
 2         4         Nuts                40
 3         1         No description      10
 4         3         Bolts               30
 5         5         No description      50
 6         6         Washers              0

Previous Page | Next Page | Top of Page