Overview of the IMS DATA Step Interface
Complete IMS DATA step
examples are presented in this section.
Each example illustrates one or more of the concepts described earlier in
this section.
All of these examples are based on the sample databases,
DBDs, and PSBs described in Appendix 2. If you have not read the sample database
descriptions, you should do so before continuing this section.
It is assumed that the installation default values for
IMS DATA step system options are the same as the default values described
in Appendix 1. Statement options used in these examples that are not IMS
DATA step statement extensions (for example, the HEADER= option in the FILE
statement) are described in
SAS Language Reference: Dictionary.
This example produces a report that shows the distribution of checking account
balances by ZIP code in the ACCTDBD database.
SAS data set DISTRIBC is created from data in the CUSTOMER and CHCKACCT segments.
The segments are retrieved with get-next calls using an unqualified SSA for
the CUSTOMER segment with a *D command code and an unqualified SSA for the
CHCKACCT segment. Thus, both the CUSTOMER and CHCKACCT segments are returned.
The new SAS data set contains three variables: CHECK_AMOUNT (from the CHCKACCT
segment), ZIPRANGE (created from the ZIP_CODE value in the CUSTOMER segment),
and BALRANGE (created from the CHECK_AMOUNT variable). The distribution
information is produced by the TABULATE procedure from the DISTRIBC data set.
The numbered comments following this program correspond
to the numbered statements in the program:
1  data distribc;
2     length ziprange $11;
3     keep ziprange
           check_amount
           balrange;
4     retain ssa1 'CUSTOMER*D '
             ssa2 'CHCKACCT ';
5     infile acctsam dli ssa=(ssa1,ssa2) status=st
             pcbno=3;
6     input @162 zip_code     $char10.
            @238 check_amount pd5.2;
7     if st ¬= ' ' and
         st ¬= 'CC' and
         st ¬= 'GA' and
         st ¬= 'GK' then
8        if st = 'GE' then
            do;
               _error_ = 0;
               stop;
            end;
9        else
            do;
               file log;
               put _all_;
10             abort;
            end;
11    balrange=check_amount;
12    ziprange=substr(zip_code,1,4)
               ||'0-'||substr(zip_code,1,4)||'9';
      title 'Checking Account Balance Distribution By ZIP Code';
13 proc format;
      value balrang
         low-249.99     = 'under $250'
         250.00-1000.00 = '$250 - $1000'
         1000.01-high   = 'over $1000';
14 proc tabulate data=distribc;
      class ziprange balrange;
      var check_amount;
      label balrange='balance range';
      label ziprange='ZIP code range';
      format ziprange $char11. balrange balrang.;
      keylabel sum= '$ total' mean ='$ average'
               n='# of accounts';
      table ziprange*(balrange all),
            check_amount*(sum*f=14.2 mean*f=10.2 n*f=4);
   run;
1  The DATA statement specifies DISTRIBC as the name of the SAS data set
   created by this DATA step.
2  The length of the new variable ZIPRANGE is set.
3  The new data set will contain only the three variables specified in the
   KEEP statement.
4  The RETAIN statement specifies values for the two SSA variables, SSA1 and
   SSA2. SSA1 is an unqualified SSA for the CUSTOMER segment with the command
   code for a path call, *D. This command code means that the CUSTOMER segment
   is returned along with the CHCKACCT segment that is its child. SSA2 is an
   unqualified SSA for the CHCKACCT segment. Without the *D command code in
   SSA1, only the target segment, CHCKACCT, would be returned.
   These values are retained for each iteration of the DATA step. The RETAIN
   statement, which initializes the variables, satisfies the requirement that
   the length of an SSA variable be specified before the DL/I INFILE statement
   is executed.
5  The INFILE statement specifies ACCTSAM as the PSB. The DLI specification
   tells SAS that the step will access DL/I resources. Two variables
   containing SSAs are identified by the SSA= option, SSA1 and SSA2. Their
   values were set by the earlier RETAIN statement. The STATUS= option
   specifies the ST variable for status codes returned by DL/I. The PCBNO=
   option specifies which PCB to use.
   These defaults are in effect for the other DL/I INFILE options: all calls
   are get-next calls, the input buffer has a length of 1000 bytes, and the
   segment names and PCB mask data are not returned. No qualified SSAs are
   used; therefore, program access is sequential.
6  The DL/I INPUT statement specifies positions and informats for the
   necessary variables in both the CUSTOMER and CHCKACCT segments because the
   path call returns both segments. When this statement executes, the GN call
   is issued. If the call is successful, the CUSTOMER and CHCKACCT segments
   are placed in the input buffer, and the ZIP_CODE and CHECK_AMOUNT values
   are then moved to SAS variables in the program data vector.
7  If the qualified GN call issued by the DL/I INPUT statement is not
   successful (that is, it obtains any status code other than blank, CC, GA,
   or GK), the automatic SAS variable _ERROR_ is set to 1 and the IF-THEN/ELSE
   logic in statements 8 through 10 is executed.
8  If the ST variable value is GE (a status code meaning that the segment or
   segments were not found), SAS stops execution of the DATA step. _ERROR_ is
   reset to 0 so that the contents of the input buffer and program data vector
   are not written to the SAS log. This statement is included because of a
   DL/I feature. In a program issuing path calls, DL/I sometimes returns a GE
   status code when it reaches end-of-database. The GB (end-of-database) code
   is returned if another get call is issued after the GE code. Therefore, in
   this program, the GE code can be considered the end-of-file signal rather
   than an error condition.
9  For any other non-blank status code, all values from the program data
   vector are written to the SAS log.
10 The DATA step execution terminates and the job aborts.
11 If the qualified GN call is successful, BALRANGE is assigned the value of
   CHECK_AMOUNT.
12 The ZIPRANGE variable is created by using the SUBSTR function with the
   ZIP_CODE variable. The first four characters of ZIP_CODE are concatenated
   with 0 and then with 9 to form a range of ten ZIP codes, such as
   22210-22219.
13 PROC FORMAT is invoked to create a format for the BALRANGE variable. This
   format is used in the PROC TABULATE output.
14 PROC TABULATE is invoked to process the DISTRIBC data set.
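The range-building logic in statements 11 through 13 can be tried without an IMS database. The following is a minimal sketch, assuming in-stream data in place of the DL/I INPUT statement; the data set name ZIPDEMO and the three input records are hypothetical values chosen for illustration, not rows from the sample database.

```sas
/* Same value ranges as the BALRANG format in the example. */
proc format;
   value balrang
      low-249.99     = 'under $250'
      250.00-1000.00 = '$250 - $1000'
      1000.01-high   = 'over $1000';
run;

/* Hypothetical stand-in for the IMS step: values come from DATALINES. */
data zipdemo;
   length zip_code $10 ziprange $11;
   input zip_code $ check_amount;
   /* BALRANGE is a copy of the amount; the format groups it into ranges. */
   balrange = check_amount;
   /* First four characters plus 0 and 9, for example 22210-22219. */
   ziprange = substr(zip_code,1,4) || '0-' || substr(zip_code,1,4) || '9';
   format balrange balrang.;
   datalines;
22211-5110 1470.17
26001-2311 110.06
26042-1650 353.65
;
run;

proc print data=zipdemo;
run;
```

Because BALRANGE carries the BALRANG. format, procedures such as PROC TABULATE group observations by the formatted range rather than by each distinct amount.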
The following output shows the results of this example.
Results of Issuing Path Calls
Checking Account Balance Distribution By ZIP code
----------------------------------------------------------------
| | check_amount |
| |------------------------------|
| | | |# of|
| | | |acc-|
| | | |oun-|
| | $ total |$ average | ts |
|-------------------------------+--------------+----------+----|
|ZIP code range |balance range | | | |
|---------------+---------------| | | |
|22210-22219 |over $1000 | 4410.50| 1470.17| 3|
| |---------------+--------------+----------+----|
| |All | 4410.50| 1470.17| 3|
|---------------+---------------+--------------+----------+----|
|25800-25809 |balance range | | | |
| |---------------| | | |
| |over $1000 | 8705.76| 4352.88| 2|
| |---------------+--------------+----------+----|
| |All | 8705.76| 4352.88| 2|
|---------------+---------------+--------------+----------+----|
|26000-26009 |balance range | | | |
| |---------------| | | |
| |under $250 | 220.11| 110.06| 2|
| |---------------+--------------+----------+----|
| |$250 - $1000 | 826.05| 826.05| 1|
| |---------------+--------------+----------+----|
| |over $1000 | 2392.93| 2392.93| 1|
| |---------------+--------------+----------+----|
| |All | 3439.09| 859.77| 4|
|---------------+---------------+--------------+----------+----|
|26040-26049 |balance range | | | |
| |---------------| | | |
| |$250 - $1000 | 353.65| 353.65| 1|
| |---------------+--------------+----------+----|
| |All | 353.65| 353.65| 1|
|---------------+---------------+--------------+----------+----|
|26500-26509 |balance range | | | |
| |---------------| | | |
| |$250 - $1000 | 1280.56| 640.28| 2|
| |---------------+--------------+----------+----|
| |All | 1280.56| 640.28| 2|
----------------------------------------------------------------
This example
uses GHN calls to retrieve CUSTOMER segments and
then tests the values of the STATE and COUNTRY fields. If a segment has a
valid value for STATE but does not have COUNTRY='UNITED STATES', the COUNTRY
value is changed to UNITED STATES and the corrected segment is replaced using
a REPL call.
Follow the notes corresponding to the numbered statements
in the following code for a detailed explanation of this example:
   filename tranrept '<your.sas.tranrept>' disp=old;
   data _null_;
1     length ssa1 $ 9;
2     infile acctsam dli ssa=ssa1 call=func pcbno=4
             status=st;
3     func = 'GHN ';
4     ssa1 = 'CUSTOMER';
5     input @12  customer_name $char40.
            @140 state         $char2.
            @142 country       $char20.;
6     if st ¬= ' ' and
         st ¬= 'CC' and
         st ¬= 'GA' and
         st ¬= 'GK' then
         link abendit;
7     if country ¬= 'UNITED STATES' &
         state < 'Z ' &
         state > 'A ' then
         do;
8           oldland = country;
9           country = 'UNITED STATES';
10          file acctsam dli;
11          func = 'REPL';
12          ssa1 = ' ';
13          put @1   _infile_
                @142 country;
14          if st ¬= ' ' then
               link abendit;
15          file tranrept header=newpage notitles;
16          put @10 customer_name
                @60 state
                @65 oldland;
17       end;
18    return;
19    newpage: put / @15
         'Customers Whose Country was Changed to UNITED STATES'
         // @17 'Name' @58 'State' @65 'old Country';
20    return;
      abendit:
         file log;
         put _all_;
         abort;
   run;
   filename tranrept clear;
1  The length of SSA1, an SSA variable specified in the INFILE statement, is
   set before execution of the DL/I INFILE statement, as required.
2  The INFILE statement specifies ACCTSAM as the PSB, and the DLI
   specification tells SAS that this step will access DL/I resources. The
   SSA= option identifies SSA1 as a variable that contains a Segment Search
   Argument. (The length of SSA1 was established by the LENGTH statement.)
   The CALL= option specifies FUNC as the variable containing DL/I call
   functions, and the STATUS= option specifies ST as the variable that
   receives the status code. The value of PCBNO= is used to select the
   appropriate PCB for this program. This value is carried over in successive
   executions of the DATA step.
   These defaults are in effect for other DL/I INFILE options: the input and
   output buffers are 1000 bytes in length, and segment names and PCB mask
   data are not returned. Program access is sequential.
3  The FUNC variable is assigned a value of GHN, so the next DL/I INPUT
   statement issues a get-hold-next call.
4  The SSA1 variable is assigned a value of CUSTOMER. The GHN call is
   qualified to retrieve a CUSTOMER segment.
5  The DL/I INPUT statement specifies positions and informats for some of the
   fields in the CUSTOMER segment. When this statement executes, a qualified
   GHN call is issued. If the call is successful, a CUSTOMER segment is
   retrieved and placed in the input buffer. Since variables are named in the
   INPUT statement, the segment data is moved to SAS variables in the program
   data vector.
6  When a call is not successful (that is, when the DL/I status code is
   something other than blank, CC, GA, or GK), the automatic SAS variable
   _ERROR_ is set to 1. If the status code is set to GB (indicating end of
   database), and if the DATA step is processing sequentially (as this one
   is), the DATA step is stopped automatically with an end-of-file return
   code sent to SAS.
7  If the call is successful, the values of COUNTRY and STATE are checked. If
   COUNTRY is not UNITED STATES, and the STATE value is alphabetic, a DO
   group (statements 8 through 17) executes.
8  The value of COUNTRY is assigned to a new variable called OLDLAND.
9  COUNTRY's value is changed to UNITED STATES.
10 A DL/I FILE statement indicates that an update call is to be issued.
   Notice that the FILE statement specifies the same PSB named in the DL/I
   INFILE statement, as required.
11 The value of FUNC is changed from GHN to REPL. If the FUNC value is not
   changed, the DL/I PUT statement cannot issue a replace call.
12 The value of SSA1 is changed from CUSTOMER to blanks. Since the REPL call
   uses the segment retrieved by the GHN call, an SSA is not needed.
13 The DL/I PUT statement formats the CUSTOMER segment in the output buffer
   and issues the REPL call. The entire segment must be formatted, even
   though the value of only one field, COUNTRY, is changed.
14 If the REPL call is not successful (that is, the status code from DL/I is
   not blank), all values from the program data vector are written to the SAS
   log and the DATA step aborts.
15 If the REPL call is successful, the step goes on to execute another FILE
   statement. This is not a DL/I FILE statement; instead, it specifies the
   fileref (TRANREPT) of an output file for a printed report on the replaced
   segments. The HEADER= option points to the NEWPAGE subroutine. Each time a
   new page of the update report is started, SAS links to NEWPAGE and
   executes the statement.
16 The PUT statement specifies variables and positions to be written to the
   TRANREPT output file.
17 The DO group is terminated by the END statement.
18 Execution returns to the beginning of the DATA step when this RETURN
   statement executes.
19 This PUT statement executes when a new page starts in the output file
   TRANREPT. The HEADER= option in the FILE TRANREPT statement points to the
   NEWPAGE label, so when a new page begins, SAS links to this labeled
   statement and prints the specified heading.
20 After printing the heading, SAS returns to the PUT statement immediately
   after the FILE TRANREPT statement (item 16) and continues execution of the
   step.
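The HEADER= mechanism described in items 15 through 20 does not depend on DL/I and can be tried on its own. The following is a minimal sketch, assuming in-stream data and the standard PRINT destination in place of the TRANREPT fileref; the two customer records and the column positions are hypothetical and were chosen only so that the report fits a narrow line size.

```sas
data _null_;
   length customer_name $40 state $2 oldland $20;
   /* Hypothetical records; a vertical bar separates the fields. */
   infile datalines dlm='|';
   input customer_name $ state $ oldland $;
   /* SAS links to NEWPAGE whenever a new output page begins. */
   file print header=newpage notitles;
   put @5 customer_name @47 state @52 oldland;
   return;
newpage:
   put / @5 'Customers Whose Country was Changed to UNITED STATES'
       // @5 'Name' @45 'State' @52 'Old Country';
   /* RETURN sends control back to the statement after the implicit LINK. */
   return;
   datalines;
SMITH, JAMES MARTIN|NC|USA
BOOKER, APRIL M.|VA|U.S.A.
;
run;
```

The first RETURN keeps normal iterations from falling through into the heading code, which is the same role that statement 18 plays in the example.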
This program calculates customer balances by retrieving a CUSTOMER
segment and then all CHCKACCT and SAVEACCT segments for that customer record.
The CUSTOMER segments are retrieved by qualified get-next calls, and the CHCKACCT
and SAVEACCT segments are retrieved by qualified get-next-within-parent calls.
A GE or GB status when retrieving the CHCKACCT and SAVEACCT
segments indicates that there are no more of that segment type for the current
parent segment (CUSTOMER).
The numbered comments following this program correspond
to the numbered statements in the program:
1  data balances;
2     length ssa1 $9;
3     keep soc_sec_number
           chck_bal
           save_bal;
4     chck_bal = 0;
      save_bal = 0;
5     infile acctsam dli pcbno=4 call=func ssa=ssa1
             status=st;
6     func = 'GN ';
7     ssa1 = 'CUSTOMER ';
8     input @;
9     if st ¬= ' ' and
         st ¬= 'CC' and
         st ¬= 'GA' and
         st ¬= 'GK' then
         link abendit;
10    input @1 soc_sec_number $char11.;
11    st = ' ';
12    func = 'GNP ';
13    ssa1 = 'CHCKACCT ';
14    do while (st = ' ');
15       input @;
16       if st = ' ' then
            do;
17             input @13 check_amount pd5.2;
18             chck_bal=chck_bal + check_amount;
19          end;
20    end;
21    if st ¬= 'GE' then
         link abendit;
22    st = ' ';
23    _error_ = 0;
24    input;
25    ssa1 = 'SAVEACCT ';
26    do while (st = ' ');
         input @;
         if st = ' ' then
            do;
               input @13 savings_amount pd5.2;
               save_bal = save_bal + savings_amount;
            end;
      end;
      if st = 'GE' then
         _error_ = 0;
      else
         link abendit;
      return;
27    abendit:
         file log;
         put _all_;
         abort;
   run;
28 proc print data=balances;
      title2 'Customer Balances';
   run;
1  The DATA step creates a new SAS data set called BALANCES.
2  The length of SSA1, an SSA variable specified in the INFILE statement, is
   set before execution of the DL/I INFILE statement, as required.
3  The KEEP statement tells SAS that the variables SOC_SEC_NUMBER, CHCK_BAL,
   and SAVE_BAL are the only variables to be included in the BALANCES data
   set.
4  The CHCK_BAL and SAVE_BAL variables are assigned an initial value of 0 and
   are reset to 0 for each new customer.
5  The INFILE statement specifies ACCTSAM as the PSB, and the DLI
   specification tells SAS that this step will access DL/I resources. The
   SSA= option identifies SSA1 as a variable that contains an SSA. (The
   length of SSA1 was established by the LENGTH statement.) The CALL= option
   specifies FUNC as the variable containing DL/I call functions, the PCBNO=
   option specifies which database PCB should be used, and the STATUS= option
   specifies ST as the variable for status codes returned by DL/I.
   These defaults are in effect for the other DL/I INFILE statement options:
   the input buffer is 1000 bytes in length, and segment names and PCB mask
   data are not returned. There are no qualified SSAs in the program, so
   access is sequential.
6  The FUNC variable is assigned a value of GN, so the next DL/I INPUT
   statement will issue a get-next call.
7  The SSA1 variable is assigned a value of CUSTOMER, so the GN call will
   retrieve the CUSTOMER segment.
8  The only specification in the DL/I INPUT statement is the trailing @ sign.
   When the statement executes, the GN call is issued and, if the call is
   successful, a CUSTOMER segment is retrieved and placed in the input
   buffer. Since no variables are named in the INPUT statement, the segment
   data is not moved to SAS variables in the program data vector. Instead,
   the segment is held in the input buffer for the next DL/I INPUT statement
   that executes (that is, the next DL/I INPUT statement does not issue a
   call but uses the data already in the buffer).
9  When a call is not successful (that is, when the DL/I status code is
   something other than blank, CC, GA, or GK), the automatic SAS variable
   _ERROR_ is set to 1. If the status code is set to GB (indicating end of
   database) and if the DATA step is processing sequentially (as this one
   is), the DATA step is stopped automatically with an end-of-file return
   code sent to SAS.
10 If the call is successful, this DL/I INPUT statement executes. It moves
   the SOC_SEC_NUMBER value from the input buffer (where the segment was
   placed by the previous DL/I INPUT statement) to a SAS variable in the
   program data vector.
11 The value of the ST variable for status codes is reset to blanks.
12 The value of the FUNC variable is reset to GNP. The next call issued will
   be a get-next-within-parent call.
13 The SSA1 variable is reset to CHCKACCT, so the next call will be for
   CHCKACCT.
14 This DO/WHILE statement initiates a DO-loop (statements 15 through 20)
   that iterates as long as blank status codes are returned.
15 Again, the only specification in this DL/I INPUT statement is the trailing
   @ sign. When the statement executes, the GNP call is issued for a CHCKACCT
   segment. If the call is successful, a CHCKACCT segment is retrieved and
   placed in the input buffer. The segment data is not moved to SAS variables
   in the program data vector. Instead, the segment is held in the input
   buffer for the next DL/I INPUT statement that executes.
16 If a blank status code is returned, the GNP call was successful, and a
   DO-group (statements 17 and 18) executes.
17 This DL/I INPUT statement uses the PD5.2 informat to move the CHECK_AMOUNT
   value from the input buffer to a SAS variable in the program data vector.
18 The variable CHCK_BAL is assigned a new value by adding the value of
   CHECK_AMOUNT just obtained from the CHCKACCT segment.
19 The END statement signals the end of the DO-group.
20 This END statement ends the DO-loop.
21 If the GNP call is not successful and returns a non-blank status code
   other than GE, the DATA step stops and the job abends.
22 If the GNP call is not successful and returns a GE status code, the
   remainder of the step executes. (The GE status code indicates that all
   checking accounts for the customer have been processed.) In this
   statement, the ST variable is reset to blanks.
23 _ERROR_ is reset to 0 to prevent SAS from printing the contents of the
   input buffer and program data vector to the SAS log.
24 The blank INPUT statement releases the hold placed on the input buffer by
   the last INPUT @ statement. This enables you to issue another call with
   the next DL/I INPUT statement.
25 The SSA1 variable is reset to SAVEACCT, so the next call will be qualified
   for SAVEACCT.
26 This DO/WHILE statement initiates a DO loop that is identical to the one
   described in items 14 through 20, except that the GNP calls retrieve
   SAVEACCT segments rather than CHCKACCT segments, and the retrieved amounts
   are accumulated in SAVE_BAL.
27 The ABENDIT code, if linked to, aborts the DATA step.
28 The PROC PRINT step prints the BALANCES data set created by the IMS DATA
   step.
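The hold-and-release behavior of the trailing @ and the blank INPUT statement (items 8, 10, and 24) is the same for ordinary file input, so it can be observed without IMS. The following is a minimal sketch, assuming in-stream records in place of DL/I segments; the data set name HELDREC and the record layout (a one-character type code followed by a name) are hypothetical. With a DL/I INFILE, the first INPUT @ would issue a call instead of reading a line.

```sas
data heldrec;
   infile datalines;
   /* Read one record into the input buffer and hold it there. */
   input @;
   if substr(_infile_,1,1) = 'C' then
      do;
         /* Reuses the held record; no new record is read. */
         input @3 name $char10.;
         output;
      end;
   else
      /* Blank INPUT: release the held record without reading variables. */
      input;
   datalines;
C SMITH
X ignore me
C JONES
;
run;

proc print data=heldrec;
run;
```

Only the records whose type code is C are written to HELDREC, because the second INPUT statement runs only after the held buffer has been inspected.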
The following output shows the results of this example.
Results of Using the Blank INPUT Statement
                 Customer Balances

                                   soc_sec_
OBS    chck_bal    save_bal        number
  1     3005.60      784.29      667-73-8275
  2      826.05     8406.00      434-62-1234
  3      220.11      809.45      436-42-6394
  4     2392.93     9552.43      434-62-1224
  5        0.00        0.00      232-62-2432
  6     1404.90      950.96      178-42-6534
  7        0.00        0.00      131-73-2785
  8      353.65      136.40      156-45-5672
  9     1243.25      845.35      657-34-3245
 10     7462.51      945.25      667-82-8275
 11      608.24      929.24      456-45-3462
 12      672.32        0.00      234-74-4612
In this example, path calls with qualified SSAs are used to produce a report
showing which accounts in the ACCTDBD database had checking account debits
on March 28, 1995. The numbered comments following this program correspond
to the numbered statements in the program:
   filename tranrept 'your.sas.tranrept' disp=old;
   data _null_;
1     retain ssa1 'CHCKACCT*D '
             ssa2 'CHCKDEBT(DEBTDATE =032895) ';
2     infile acctsam dli ssa=(ssa1,ssa2) status=st
             pcbno=4;
3     input @1  check_account_number $char12.
            @13 check_amount         pd5.2
            @18 check_date           mmddyy8.
            @26 check_balance        pd5.2
            @41 check_debit_amount   pd5.2
            @46 check_debit_date     mmddyy8.
            @54 check_debit_time     time8.
            @62 check_debit_desc     $char40.;
4     if st ¬= ' ' and
         st ¬= 'CC' and
         st ¬= 'GA' and
         st ¬= 'GK' then
5        if st = 'GB' | st = 'GE' then
            do;
               _error_ = 0;
               stop;
            end;
6        else
            do;
               file log;
               put _all_;
7              abort;
            end;
8     file tranrept header=newpage notitles;
9     put @10 check_account_number
          @30 check_debit_amount dollar13.2
          @45 check_debit_time time8.
          @55 check_debit_desc;
10    return;
11    newpage: put / @15 'Checking Account Debits Occurring on 03/28/95'
         // @08 'Account Number' @37 'Amount'
            @49 'Time' @55 'Description' //;
12    return;
   run;
   filename tranrept clear;
1  The RETAIN statement specifies values for the two SSA variables, SSA1 and
   SSA2.
   SSA1 is an SSA for the CHCKACCT segment with the command code for a path
   call, *D. This command code means that the CHCKACCT segment is returned as
   well as the target segment, CHCKDEBT. SSA2 is a qualified SSA specifying
   that CHCKDEBT segments for which DEBTDATE=032895 be retrieved.
   These values are retained for each iteration of the DATA step. The RETAIN
   statement satisfies the requirement that the length of an SSA variable be
   specified before the DL/I INFILE statement.
2  The INFILE statement specifies ACCTSAM as the PSB. The DLI specification
   tells SAS that the step will access DL/I resources. Two variables
   containing SSAs are identified by the SSA= option, SSA1 and SSA2. (Their
   values were set by the earlier RETAIN statement.) The STATUS= option
   specifies the ST variable for status codes returned by DL/I, and the
   PCBNO= option specifies the PCB selection.
   These defaults are in effect for the other DL/I INFILE options: all calls
   are get-next calls, the input buffer length is 1000, and the segment names
   and PCB mask data are not returned.
3  When the DL/I INPUT statement executes, the GN call is issued. If the call
   is successful, the CHCKACCT and CHCKDEBT segments are placed in the input
   buffer, and the values are then moved to SAS variables in the program data
   vector. The DL/I INPUT statement specifies positions and informats for the
   variables in both the CHCKACCT and CHCKDEBT segments because the path call
   returns both segments.
4  If the qualified GN call issued by the DL/I INPUT statement is not
   successful (that is, it obtains any status code other than blank, CC, GA,
   or GK), _ERROR_ is set to 1 and the program does further checking.
5  If the ST variable value is GB (a status code meaning that the end of the
   database has been reached) or GE (segment not found), _ERROR_ is reset to
   0 so that the contents of the input buffer and program data vector are not
   printed to the SAS log, and SAS stops processing the DATA step. In a
   program issuing path calls with qualified SSAs, DL/I might first return a
   GE status code when it reaches end-of-file. Then, if another get call is
   issued, DL/I returns the GB status code. Therefore, this program treats a
   GE code as a GB code.
   In a sequential-access program with unqualified SSAs, this statement is
   not necessary because the end-of-file condition stops processing
   automatically. However, when a program uses qualified SSAs, the
   end-of-file condition is not set because DL/I might not be at the end of
   the database. Therefore, you need to check status codes and stop the step.
6  For any other non-blank status code, all values from the program data
   vector are written to the SAS log.
7  The DATA step execution terminates, and the job abends.
8  If the GN call is successful, the step goes on to execute another FILE
   statement. This is not a DL/I FILE statement. Instead, it specifies the
   fileref (TRANREPT) of an output file for a printed report on the retrieved
   segments.
   The HEADER= option points to the NEWPAGE statement label (statement 11).
   When a new page begins, SAS links to the labeled statement and prints the
   specified heading.
9  The PUT statement specifies variables and positions to be written to the
   output file.
10 Execution returns to the beginning of the DATA step when this RETURN
   statement executes.
11 The PUT statement labeled NEWPAGE executes when a new page is started in
   the output file TRANREPT. This PUT statement writes the title for the
   report at the top of the new page.
12 After printing the heading, SAS returns to the PUT statement (statement 9)
   that immediately follows the FILE TRANREPT statement (statement 8) and
   continues execution of the step.
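All four examples apply the same three-way test to the status code: continue, stop quietly, or abort. The following is a minimal sketch of that decision only, assuming a hard-coded list of status codes in place of values returned by DL/I; the data set name STATUS_DEMO and the ACTION variable are hypothetical, and AI is included only as an example of a status code that the programs do not expect.

```sas
data status_demo;
   length st $2 action $8;
   /* Loop over sample status codes instead of issuing DL/I calls. */
   do st = '  ', 'CC', 'GA', 'GK', 'GE', 'GB', 'AI';
      /* Blank, CC, GA, and GK are treated as successful calls. */
      if st in ('  ', 'CC', 'GA', 'GK') then action = 'process';
      /* GE and GB are treated as end of data, as in the last example. */
      else if st in ('GE', 'GB') then action = 'stop';
      /* Anything else is unexpected and would lead to ABORT. */
      else action = 'abort';
      output;
   end;
run;

proc print data=status_demo;
run;
```

In the real programs, the 'stop' branch also resets _ERROR_ to 0 before the STOP statement so that the buffer contents are not written to the SAS log.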
Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.