IMS DATA Step Examples

Overview of IMS DATA Step Examples

Complete IMS DATA step examples are presented in this section. Each example illustrates one or more of the concepts described earlier in this section.
All of these examples are based on the sample databases, DBDs, and PSBs described in Example Data. If you have not read the sample database descriptions, you should do so before continuing this section.
It is assumed that the installation default values for IMS DATA step system options are the same as the documented default values. Statement options used in these examples that are not IMS DATA step statement extensions (for example, the HEADER= option in the FILE statement) are described in SAS Statements: Reference.

Example 6: Issuing Path Calls

This example produces a report that shows the distribution of checking account balances by ZIP code in the AcctDBD database. SAS data set DistribC is created from data in the CUSTOMER and CHCKACCT segments. The segments are retrieved with get-next calls using an unqualified SSA for the CUSTOMER segment with an *D command code and an SSA for the CHCKACCT segment. Thus, both the CUSTOMER and CHCKACCT segments are returned. The new SAS data set contains three variables: CHECK_AMOUNT (from the CHCKACCT segment), ZIPRANGE (created from the CUSTZIP value in the CUSTOMER segment), and BALRANGE (created from the BALANCE variable). The distribution information is produced by the TABULATE procedure from the DistribC data set.
The numbered comments following this program correspond to the numbered statements in the program:
1  data distribc;
2     length ziprange $11;
3     keep ziprange
            check_amount
            balrange;
4     retain ssa1 'CUSTOMER*D '
              ssa2 'CHCKACCT ';
5     infile acctsam dli ssa=(ssa1,ssa2) status=st 
            pcbno=3;
6     input @162 zip_code      $char10.
             @238 check_amount  pd5.2;
7     if st ¬= '  ' and
          st ¬= 'CC' and
          st ¬= 'GA' and
          st ¬= 'GK' then
8       if st = 'GE' then
            do;
               _error_ = 0;
               stop;
            end;
9     else
        do;
           file log;
           put _all_;
10         abort;
        end;
11     balrange=check_amount;
12     ziprange=substr(zip_code,1,4)
         ||'0-'||substr(zip_code,1,4)||'9';
       title 'Checking Account Balance Distribution By ZIP Code';
13  proc format;
      value balrange
      low-249.99 = 'under $250'
      250.00-1000.00 = '$250 - $1000'
      1000.01-high = 'over $1000';
14  proc tabulate data=distribc;
      class ziprange balrange;
      var check_amount;
      label balrange='balance range';
      label ziprange='ZIP code range';
      format ziprange $char11. balrange balrange.;
      keylabel sum= '$ total' mean ='$ average' 
        n='# of accounts';
      table ziprange*(balrange all),
            check_amount*(sum*f=14.2 mean*f=10.2 n*f=4);
   run;
1 The DATA statement specifies DistribC as the name of the SAS data set created by this DATA step.
2 The length of the new variable ZIPRANGE is set.
3 The new data set contains only the three variables specified in the KEEP statement.
4 The RETAIN statement specifies values for the two SSA variables, SSA1 and SSA2. SSA1 is an unqualified SSA for the CUSTOMER segment with the command code for a path call, *D. This command code means that the CUSTOMER segment is returned along with the CHCKACCT segment that is its child. SSA2 is an unqualified SSA for the CHCKACCT segment. Without the *D command code in SSA1, only the target segment, CHCKACCT, would be returned.
These values are retained for each iteration of the DATA step. The RETAIN statement, which initializes the variables, satisfies the requirement that the length of an SSA variable be specified before the DL/I INFILE statement is executed.
5 The INFILE statement specifies ACCTSAM as the PSB. The DLI specification tells SAS that the step accesses DL/I resources. Two variables containing SSAs are identified by the SSA= option, SSA1 and SSA2. Their values were set by the earlier RETAIN statement. The STATUS= option specifies the ST variable for status codes returned by DL/I. The PCBNO= option specifies which PCB to use.
These defaults are in effect for the other DL/I INFILE options: all calls are get-next calls, the input buffer has a length of 1000 bytes, and segment names and PCB mask data are not returned. No qualified SSAs are used. Therefore, program access is sequential.
6 The DL/I INPUT statement specifies positions and informats for the necessary variables in both the CUSTOMER and CHCKACCT segments because the path call returns both segments. When this statement executes, the GN call is issued. If successful, CUSTOMER and CHCKACCT segments are placed in the input buffer and the ZIP_CODE and CHECK_AMOUNT values are then moved to SAS variables in the program data vector.
7 If the qualified GN call issued by the DL/I INPUT statement is not successful (that is, it obtains any return code other than blank, CC, GA, or GK), the automatic SAS variable _ERROR_ is set to 1 and the DO group (statements 8 through 10) is executed.
8 If the ST variable value is GE (a status code meaning that the segment or segments were not found), SAS stops execution of the DATA step. _ERROR_ is reset to 0 so that the contents of the input buffer and program data vector are not printed on the SAS log. This statement is included because of a DL/I feature. In a program issuing path calls, DL/I sometimes returns a GE status code when it reaches end-of-database. The GB (end-of-database) code is returned if another get call is issued after the GE code. Therefore, in this program, the GE code can be considered the end-of-file signal rather than an error condition.
9 For any other non-blank status code, all values from the program data vector are written to the SAS log.
10 The DATA step execution terminates and the job ends.
11 If the qualified GN call is successful, BALRANGE is assigned the value of CHECK_AMOUNT.
12 The ZIPRANGE variable is created using the SUBSTR function with the ZIP_CODE variable.
13 PROC FORMAT is invoked to create a format for the BALRANGE variable. This format is used in the PROC TABULATE output.
14 PROC TABULATE is invoked to process the DistribC data set.
The following output shows the results of this example.
Results of Issuing Path Calls
                Checking Account Balance Distribution By ZIP code               
                         

      ----------------------------------------------------------------
      |                               |         check_amount         |
      |                               |------------------------------|
      |                               |              |          |# of|
      |                               |              |          |acc-|
      |                               |              |          |oun-|
      |                               |   $ total    |$ average | ts |
      |-------------------------------+--------------+----------+----|
      |ZIP code range |balance range  |              |          |    |
      |---------------+---------------|              |          |    |
      |22210-22219    |over $1000     |       4410.50|   1470.17|   3|
      |               |---------------+--------------+----------+----|
      |               |All            |       4410.50|   1470.17|   3|
      |---------------+---------------+--------------+----------+----|
      |25800-25809    |balance range  |              |          |    |
      |               |---------------|              |          |    |
      |               |over $1000     |       8705.76|   4352.88|   2|
      |               |---------------+--------------+----------+----|
      |               |All            |       8705.76|   4352.88|   2|
      |---------------+---------------+--------------+----------+----|
      |26000-26009    |balance range  |              |          |    |
      |               |---------------|              |          |    |
      |               |under $250     |        220.11|    110.06|   2|
      |               |---------------+--------------+----------+----|
      |               |$250 - $1000   |        826.05|    826.05|   1|
      |               |---------------+--------------+----------+----|
      |               |over $1000     |       2392.93|   2392.93|   1|
      |               |---------------+--------------+----------+----|
      |               |All            |       3439.09|    859.77|   4|
      |---------------+---------------+--------------+----------+----|
      |26040-26049    |balance range  |              |          |    |
      |               |---------------|              |          |    |
      |               |$250 - $1000   |        353.65|    353.65|   1|
      |               |---------------+--------------+----------+----|
      |               |All            |        353.65|    353.65|   1|
      |---------------+---------------+--------------+----------+----|
      |26500-26509    |balance range  |              |          |    |
      |               |---------------|              |          |    |
      |               |$250 - $1000   |       1280.56|    640.28|   2|
      |               |---------------+--------------+----------+----|
      |               |All            |       1280.56|    640.28|   2|
     
      ----------------------------------------------------------------
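
The derivation of ZIPRANGE and BALRANGE in statements 11 through 13 can be tried without an IMS database. The following is a minimal sketch only: the ZipDemo data set and the two DATALINES records are hypothetical test values that are not part of the sample databases, and list input replaces the DL/I INPUT statement.

```sas
/* Same value ranges as statement 13 of the example. */
proc format;
   value balrange
      low-249.99     = 'under $250'
      250.00-1000.00 = '$250 - $1000'
      1000.01-high   = 'over $1000';
run;

/* Hypothetical stand-alone data; no DL/I call is issued here. */
data zipdemo;
   length ziprange $11;
   input zip_code $char10. check_amount;
   /* BALRANGE holds the raw amount; the format groups it for display. */
   balrange = check_amount;
   /* First four ZIP digits plus 0 and 9 give the low and high ends. */
   ziprange = substr(zip_code,1,4) || '0-' || substr(zip_code,1,4) || '9';
   format balrange balrange.;
   datalines;
22212-1234 1470.17
26001-0000 110.06
;
run;

proc print data=zipdemo;
run;
```

For the first record, SUBSTR returns 2221, so ZIPRANGE becomes 22210-22219, which matches the first ZIP code range in the report.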

Example 7: Updating Information in the CUSTOMER Segment

This example uses GHN calls to retrieve CUSTOMER segments and then tests the values of the STATE and COUNTRY fields. If a segment has a valid value for STATE but does not have COUNTRY='UNITED STATES', the COUNTRY value is changed to UNITED STATES and the corrected segment is replaced using a REPL call.
Follow the notes corresponding to the numbered statements in the following code for a detailed explanation of this example:
filename tranrept '<your.sas.tranrept>' disp=old;
data _null_;
1 length ssa1 $ 9;
2 infile acctsam dli ssa=ssa1 call=func pcbno=4 
       status=st;
3 func = 'GHN ';
4 ssa1 = 'CUSTOMER';
5 input @12   customer_name  $char40.
         @140  state         $char2.
         @142  country       $char20.;
6 if st ¬= '  ' and
      st ¬= 'CC' and
      st ¬= 'GA' and
      st ¬= 'GK' then
     link abendit;
7 if country ¬= 'UNITED STATES' &
      state < 'Z ' &
      state > 'A ' then
     do;
8      oldland = country;
9      country = 'UNITED STATES';
10      file acctsam dli;
11      func = 'REPL';
12      ssa1 = '  ';
13      put @1   _infile_
            @142 country;
14      if st ¬= '  ' then
          link abendit;
15      file tranrept header=newpage notitles;
16      put @10 customer_name
            @60 state
            @65 oldland;
17     end;
18   return;
19   newpage: put / @15
           'Customers Whose Country was Changed to UNITED STATES'
           // @17 'Name' @58 'State' @65 'old Country';
20   return;

    abendit:
      file log;
      put _all_;
      abort;
run;
filename tranrept clear;
1 The length of SSA1, an SSA variable specified in the INFILE statement, is set before execution of the DL/I INFILE statement, as required.
2 The INFILE statement specifies ACCTSAM as the PSB, and the DLI specification tells SAS that this step accesses DL/I resources. The SSA= option identifies SSA1 as a variable that contains a Segment Search Argument. (The length of SSA1 was established by the LENGTH statement.) The CALL= option specifies FUNC as the variable containing DL/I call functions, and the STATUS= option specifies ST as the variable that receives the status code returned by DL/I. The value of PCBNO= selects the appropriate PCB for this program. This value is carried over in successive executions of the DATA step.
These defaults are in effect for other DL/I INFILE options: the input and output buffers are 1000 bytes in length, and segment names and PCB mask data are not returned. Program access is sequential.
3 The FUNC variable is assigned a value of GHN, so the next DL/I INPUT statement issues a get-hold-next call.
4 The SSA1 variable is assigned a value of CUSTOMER. The GHN call is qualified to retrieve a CUSTOMER segment.
5 The DL/I INPUT statement specifies positions and informats for some of the fields in the CUSTOMER segment. When this statement executes, a qualified GHN call is issued. If the call is successful, a CUSTOMER segment is retrieved and placed in the input buffer. Since variables are named in the INPUT statement, the segment data is moved to SAS variables in the program data vector.
6 When a call is not successful (that is, when the DL/I status code is something other than blank, CC, GA, or GK), the automatic SAS variable _ERROR_ is set to 1. If the status code is set to GB (indicating end of database), and if the DATA step is processing sequentially (as this one is), the DATA step is stopped automatically with an end-of-file return code sent to SAS. For any other unsuccessful status code, the program links to the ABENDIT routine, which writes all values from the program data vector to the SAS log and aborts the step.
7 If the call is successful, the values of COUNTRY and STATE are checked. If COUNTRY is not UNITED STATES, and the STATE value is alphabetic, a DO group (statements 8 through 17) executes.
8 The value of COUNTRY is assigned to a new variable called OLDLAND.
9 COUNTRY's value is changed to UNITED STATES.
10 A DL/I FILE statement indicates that an update call is to be issued. Notice that the FILE statement specifies the same PSB named in the DL/I INFILE statement, as required.
11 The value of FUNC is changed from GHN to REPL. If the FUNC value is not changed, an update call cannot be issued.
12 The value of SSA1 is changed from CUSTOMER to blanks. Since the REPL call uses the segment retrieved by the GHN call, an SSA is not needed.
13 The DL/I PUT statement formats the CUSTOMER segment in the output buffer and issues the REPL call. The entire segment must be formatted, even though the value of only one field, COUNTRY, is changed.
14 If the REPL call is not successful (that is, the status code from DL/I was not blank), all values from the program data vector are written to the SAS log and the DATA step aborts.
15 If the REPL call is successful, the step goes on to execute another FILE statement. This is not a DL/I FILE statement. Instead, it specifies the fileref (TRANREPT) of an output file for a printed report on the replaced segments. The HEADER= option points to the NEWPAGE subroutine. Each time a new page of the update report is started, SAS links to NEWPAGE and executes the statement.
16 The PUT statement specifies variables and positions to be written to the TRANREPT output file.
17 The DO group is terminated by the END statement.
18 Execution returns to the beginning of the DATA step when this RETURN statement executes.
19 This PUT statement executes when a new page starts in the output file TRANREPT. The HEADER= option in the FILE TRANREPT statement points to the NEWPAGE label, so when a new page begins, SAS links to this labeled statement and prints the specified heading.
20 After printing the heading, SAS returns to the PUT statement immediately after the FILE TRANREPT statement (item 16) and continues execution of the step.
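
The HEADER= linkage described in items 15, 19, and 20 does not depend on DL/I and can be tried on its own. This is a minimal sketch under stated assumptions: it reads the SASHELP.CLASS sample data set instead of IMS segments and writes to the standard print file (FILE PRINT) rather than the TRANREPT fileref.

```sas
data _null_;
   set sashelp.class;                    /* stand-in for the retrieved segments */
   file print header=newpage notitles;   /* link to NEWPAGE at each new page    */
   put @10 name @30 age;                 /* detail line                         */
   return;                               /* back to the top of the DATA step    */
newpage:
   put / @10 'Name' @30 'Age' //;        /* page heading                        */
   return;                               /* back to the statement after FILE    */
run;
```

The first RETURN keeps normal iterations from falling through into the heading code. The second RETURN ends the HEADER= link, so the detail PUT statement runs next.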

Example 8: Using the Blank INPUT Statement

This program calculates customer balances by retrieving a CUSTOMER segment and then all CHCKACCT and SAVEACCT segments for that customer record. The CUSTOMER segments are retrieved by qualified get-next calls, and the CHCKACCT and SAVEACCT segments are retrieved by qualified get-next-within-parent calls. A GE or GB status when retrieving the CHCKACCT and SAVEACCT segments indicates that there are no more of that segment type for the current parent segment (CUSTOMER).
The numbered comments following this program correspond to the numbered statements in the program:
1 data balances;
2   length ssa1 $9;
3   keep soc_sec_number
          chck_bal
          save_bal;
4   chck_bal = 0;
     save_bal = 0;
5   infile acctsam dli pcbno=4 call=func ssa=ssa1
     status=st;
6   func = 'GN  ';
7   ssa1 = 'CUSTOMER ';
8   input @;
9   if st ¬= '  ' and
       st ¬= 'CC' and
       st ¬= 'GA' and
       st ¬= 'GK' then
      link abendit;
10  input @1 soc_sec_number $char11.;
11  st = '  ';
12  func = 'GNP ';
13  ssa1 = 'CHCKACCT ';
14  do while (st = '  ');
15    input @;
16    if st = '  ' then
        do;
17          input @13 check_amount pd5.2;
18          chck_bal=chck_bal + check_amount;
19      end;
20   end;
21   if st ¬= 'GE' then
      link abendit;
22   st = '  ';
23   _error_ = 0;
24   input;
25   ssa1 = 'SAVEACCT ';
26   do while (st = '  ');
        input @;
        if st = '  ' then
           do;
              input @13 savings_amount pd5.2;
              save_bal = save_bal + savings_amount;
          end;
     end;

     if st = 'GE' then
        _error_ = 0;
     else
        link abendit;
     return;
27 abendit:
     file log;
     put _all_;
     abort;
  run;
28  proc print data=balances;
     title2 'Customer Balances';
  run;
1 The DATA step creates a new SAS data set called Balances.
2 The length of SSA1, an SSA variable specified in the INFILE statement, is set before execution of the DL/I INFILE statement, as required.
3 The KEEP statement tells SAS that the variables SOC_SEC_NUMBER, CHCK_BAL, and SAVE_BAL are the only variables to be included in the Balances data set.
4 The CHCK_BAL and SAVE_BAL variables are assigned an initial value of 0 and are reset to 0 for each new customer.
5 The INFILE statement specifies ACCTSAM as the PSB, and the DLI specification tells SAS that this step accesses DL/I resources. The SSA= option identifies SSA1 as a variable that contains an SSA. (The length of SSA1 was established by the LENGTH statement.) The CALL= option specifies FUNC as the variable containing DL/I call functions, the STATUS= option specifies ST as the variable for status codes returned by DL/I, and the PCBNO= option specifies which database PCB should be used.
These defaults are in effect for the other DL/I INFILE statement options: the input buffer is 1000 bytes in length, and segment names and PCB mask data are not returned. There are no qualified SSAs in the program, so access is sequential.
6 The FUNC variable is assigned a value of GN, so the next DL/I INPUT statement issues a get-next call.
7 The SSA1 variable is assigned a value of CUSTOMER, so the GN call retrieves the CUSTOMER segment.
8 The only specification in the DL/I INPUT statement is the trailing @ sign. When the statement executes, the GN call is issued and, if the call is successful, a CUSTOMER segment is retrieved and placed in the input buffer. Since no variables are named in the INPUT statement, the segment data is not moved to SAS variables in the program data vector. Instead, the segment is held in the input buffer for the next DL/I INPUT statement that executes (that is, the next DL/I INPUT statement does not issue a call but uses the data already in the buffer).
9 When a call is not successful (that is, when the DL/I status code is something other than blank, CC, GA, or GK), the automatic SAS variable _ERROR_ is set to 1. If the status code is set to GB (indicating end of database) and if the DATA step is processing sequentially (as this one is), the DATA step is stopped automatically with an end-of-file return code sent to SAS.
10 If the call is successful, this DL/I INPUT statement executes. It moves the SOC_SEC_NUMBER value from the input buffer (where the segment was placed by the previous DL/I INPUT statement) to a SAS variable in the program data vector.
11 The value of the ST variable for status codes is reset to blanks.
12 The value of the FUNC variable is reset to GNP. The next call issued is a get-next-within-parent call.
13 The SSA1 variable is reset to CHCKACCT, so the next call is for CHCKACCT.
14 This DO/WHILE statement initiates a DO-loop (statements 15 through 20) that iterates as long as blank status codes are returned.
15 Again, the only specification in this DL/I INPUT statement is the trailing @ sign. When the statement executes, the GNP call is issued for a CHCKACCT segment. If the call is successful, a CHCKACCT segment is retrieved and placed in the input buffer. The segment data is not moved to SAS variables in the program data vector. Instead, the segment is held in the input buffer for the next DL/I INPUT statement that executes.
16 If a blank status code is returned, the GNP call was successful, and a DO-group (statements 17 and 18) executes.
17 This DL/I INPUT statement moves the CHECK_AMOUNT value (read with the PD5.2 informat) from the input buffer to a SAS variable in the program data vector.
18 The variable CHCK_BAL is assigned a new value by adding the value of CHECK_AMOUNT just obtained from the CHCKACCT segment.
19 The END statement signals the end of the DO-group.
20 This END statement ends the DO-loop.
21 If the GNP call is not successful and returns a non-blank status code other than GE, the DATA step stops and the job abends.
22 If the GNP call is not successful and returns a GE status code, the remainder of the step executes. (The GE status code indicates that all checking accounts for the customer have been processed.) In this statement, the ST variable is reset to blanks.
23 _ERROR_ is reset to 0 to prevent SAS from printing the contents of the input buffer and program data vector to the SAS log.
24 The blank INPUT statement releases the hold placed on the input buffer by the last INPUT @ statement. This enables you to issue another call with the next DL/I INPUT statement.
25 The SSA1 variable is reset to SAVEACCT, so the next call is qualified for SAVEACCT.
26 This DO/WHILE statement initiates a DO loop that is identical to the one described in items 14 through 20, except that the GNP calls retrieve SAVEACCT segments rather than CHCKACCT segments. The GNP calls also update SAVE_BAL.
27 The ABENDIT code, if linked to, cancels the DATA step.
28 The PROC PRINT step prints the Balances data set created by the IMS DATA step.
The following output shows the results of this example.
Results of Using the Blank INPUT Statement
                    Customer Balances

                                          soc_sec_
          OBS    chck_bal    save_bal      number

          1     3005.60      784.29    667-73-8275
          2      826.05     8406.00    434-62-1234
          3      220.11      809.45    436-42-6394
          4     2392.93     9552.43    434-62-1224
          5        0.00        0.00    232-62-2432
          6     1404.90      950.96    178-42-6534
          7        0.00        0.00    131-73-2785
          8      353.65      136.40    156-45-5672
          9     1243.25      845.35    657-34-3245
         10     7462.51      945.25    667-82-8275
         11      608.24      929.24    456-45-3462
         12      672.32        0.00    234-74-4612
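
The hold-and-release behavior of INPUT @ and the blank INPUT statement (items 8, 10, and 24) can also be observed with ordinary in-stream data. The sketch below is an analogy only: the HoldDemo data set, the record-type flag, and the DATALINES records are hypothetical, and no DL/I call is issued. In the IMS DATA step, the INPUT @ statement additionally issues the DL/I call.

```sas
data holddemo;
   /* Read the flag and hold the record in the input buffer. */
   input @1 rectype $char1. @;
   if rectype = 'C' then
      /* Reads from the held record; no new record is fetched. */
      input @3 amount 8.;
   else
      /* Blank INPUT: releases the held record without reading fields. */
      input;
   datalines;
C 100.50
X ignored
C 20
;
run;

proc print data=holddemo;
run;
```

Each iteration consumes exactly one record. The second INPUT statement either reads from the held buffer or releases it, which is the same role the blank INPUT statement plays before the SAVEACCT loop in the example.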

Example 9: Using the Qualified SSA

In this example, path calls with qualified SSAs are used to produce a report showing which accounts in the AcctDBD database had checking account debits on March 28, 1995. The numbered comments following this program correspond to the numbered statements in the program:
filename tranrept 'your.sas.tranrept' disp=old;
data _null_;
1   retain ssa1 'CHCKACCT*D '
           ssa2 'CHCKDEBT(DEBTDATE =032895) ';
2   infile acctsam dli ssa=(ssa1,ssa2) status=st 
            pcbno=4;
3   input @1   check_account_number $char12.
           @13  check_amount         pd5.2
           @18  check_date           mmddyy8.
           @26  check_balance        pd5.2
           @41  check_debit_amount   pd5.2
           @46  check_debit_date     mmddyy8.
           @54  check_debit_time     time8.
           @62  check_debit_desc     $char40.;
4   if st ¬= '  ' and
        st ¬= 'CC' and
        st ¬= 'GA' and
        st ¬= 'GK' then
5         if st = 'GB' | st = 'GE' then
            do;
               _error_ = 0;
               stop;
            end;
6         else
            do;
               file log;
               put _all_;
7             abort;
            end;
8   file tranrept header=newpage notitles;
9   put @10 check_account_number
         @30 check_debit_amount dollar13.2
         @45 check_debit_time   time8.
         @55 check_debit_desc;
10   return;
11  newpage: put / @15 'Checking Account Debits Occurring on 03/28/95'
                // @08 'Account Number' @37 'Amount' 
                   @49 'Time' @55 'Description' //;
12   return;
run;
filename tranrept clear;
1 The RETAIN statement specifies values for the two SSA variables, SSA1 and SSA2.
SSA1 is an SSA for the CHCKACCT segment with the command code for a path call, *D. This command code means that the CHCKACCT segment is returned as well as the target segment, CHCKDEBT. SSA2 is a qualified SSA specifying that CHCKDEBT segments for which DEBTDATE=032895 be retrieved.
These values are retained for each iteration of the DATA step. The RETAIN statement satisfies the requirement that the length of an SSA variable be specified before the DL/I INFILE statement.
2 The INFILE statement specifies ACCTSAM as the PSB. The DLI specification tells SAS that the step accesses DL/I resources. Two variables containing SSAs are identified by the SSA= option, SSA1 and SSA2. (Their values were set by the earlier RETAIN statement.) The STATUS= option specifies the ST variable for status codes returned by DL/I, and the PCBNO= option specifies the PCB selection.
These defaults are in effect for the other DL/I INFILE options: all calls are get-next calls, the input buffer length is 1000, and the segment names and PCB mask data are not returned.
3 When the DL/I INPUT statement executes, the GN call is issued. If successful, CHCKACCT and CHCKDEBT segments are placed in the input buffer, and the values are then moved to SAS variables in the program data vector. The DL/I INPUT statement specifies positions and informats for the variables in both the CHCKACCT and CHCKDEBT segments because the path call returns both segments.
4 If the qualified GN call issued by the DL/I INPUT statement is not successful (that is, it obtains any return code other than blank, CC, GA, or GK), _ERROR_ is set to 1 and the program does further checking.
5 If the ST variable value is GB (a status code meaning that the end-of-file has been reached) or GE (segment not found), _ERROR_ is reset to 0 so that the contents of the input buffer and program data vector are not printed to the SAS log, and SAS stops processing the DATA step. In a program issuing path calls with qualified SSAs, DL/I might first return a GE status code when it reaches end-of-file. Then, if another get call is issued, DL/I returns the GB status code. Therefore, in this program, treat a GE code as a GB code.
In a sequential-access program with unqualified SSAs, this statement is not necessary because the end-of-file condition stops processing automatically. However, when a program uses qualified SSAs, the end-of-file condition is not set on because DL/I might not be at the end of the database. Therefore, you need to check status codes and stop the step.
6 For any other non-blank return code, all values from the program data vector are written to the SAS log.
7 The DATA step execution terminates, and the job abends.
8 If the GN call is successful, the step goes on to execute another FILE statement. This is not a DL/I FILE statement. Instead, it specifies the fileref (TRANREPT) of an output file for a printed report on the retrieved segments.
The HEADER= option points to the NEWPAGE statement label (statement 11). When a new page begins, SAS links to the labeled statement and prints the specified heading.
9 The PUT statement specifies variables and positions to be written to the output file.
10 Execution returns to the beginning of the DATA step when this RETURN statement executes.
11 The PUT statement labeled NEWPAGE executes when a new page is started in the output file TRANREPT. This PUT statement writes the title for the report at the top of the new page.
12 After printing the heading, SAS returns to the PUT statement immediately after the FILE TRANREPT statement (statement 8) and continues execution of the step.
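
The qualified SSA in statement 1 is a fixed-layout string: an 8-byte segment name, a left parenthesis, an 8-byte field name, a 2-byte relational operator, the comparison value, and a right parenthesis. As a hedged sketch, the same 27-character value can be assembled from a SAS date instead of being typed as a literal. The DEBIT_DATE variable is hypothetical, and the sketch assumes that DEBTDATE is compared as six MMDDYY characters, as the literal in the example suggests. The step only builds and displays the string. It does not access IMS.

```sas
data _null_;
   length ssa2 $27;
   debit_date = '28MAR1995'd;             /* hypothetical date of interest */
   /* segment name(8) + '(' + field name(8) + operator(2) + value(6) + ') ' */
   ssa2 = 'CHCKDEBT(DEBTDATE =' || put(debit_date, mmddyy6.) || ') ';
   put ssa2= $char27.;                    /* write the SSA to the SAS log  */
run;
```

If a value built this way is used in a real step, the length of the SSA variable must still be established before the DL/I INFILE statement executes, as the notes for statement 1 describe.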