IMS Data in SAS Programs |
Calculating Statistics Using the FREQ Procedure |
Suppose you want to find the percentages of your accounts in each city where you have a bank so that you can decide where to increase your marketing. The following example calculates the percentages of customers for each city appearing in the IMS database ACCTDBD using the view descriptor VLIB.CUSTINFO:
options nodate linesize=80; proc freq data=vlib.custinfo; table city; title2 'Cities in the ACCTDBD Database'; run;
The following output shows the one-way frequency table that this example generates.
Results of Calculating Statistics Using the FREQ Procedure
The SAS System Cities in the ACCTDBD Database CITY Cumulative Cumulative CITY Frequency Percent Frequency Percent -------------------------------------------------------------- CHARLOTTESVILLE 2 20.0 2 20.0 GORDONSVILLE 3 30.0 5 50.0 ORANGE 2 20.0 7 70.0 RAPIDAN 1 10.0 8 80.0 RICHMOND 2 20.0 10 100.0For more information about the FREQ procedure, see SAS Language Reference: Concepts and the Base SAS Procedures Guide.
Calculating Statistics Using the MEANS Procedure |
In an analysis of recent accounts, suppose that you also want to determine some statistics by customer. In the following example, PROC MEANS is used to generate the mean debit amount for each customer (including the number of observations (N) and the number of missing values (NMISS)):
proc sort data=vlib.trans out=mydata.trandata; by soc_sec_number; run; options nodate linesize=80; proc means data=mydata.trandata mean sum n nmiss maxdec=0; by soc_sec_number; var check_debit_amount; title2 'Mean Debit Amount Per Customer'; run;
In the example, the view descriptor VLIB.TRANS selects CUSTOMER, CHCKACCT, and CHCKDEBT segment data from the IMS database ACCTDBD. Since the ACCTDBD database is an HDAM and therefore is not indexed, the data that is described by the view descriptor must be sorted before using PROC MEANS. The sorted data is stored in a SAS data file called MYDATA.TRANDATA, which is then used as input to PROC MEANS.
If your database is indexed, you can use a SAS BY statement for the indexed field so that data is returned as if it is sorted. Database access methods HIDAM, HISAM, and SHISAM are indexed. If your database is not indexed, you need to sort the IMS data before using the MEANS procedure with a BY statement. Because you cannot sort data in an IMS database, you must use the OUT= option to extract data from the database so that you can pass it to the MEANS procedure.
Note: You can store the sorted data in a temporary data set if space is a concern.
Note: If the view descriptor describes a path of data that includes segments from multiple hierarchical levels, the parent segment information is repeated for each SAS observation. This can cause misleading statistical results. To avoid misleading results, perform mathematical operations using only the data in the segment at the lowest hierarchical level. You can also avoid misleading results by creating a view descriptor that describes only the data in the segment at the lowest hierarchical level.
The following output shows the output for this example.
Results of Calculating Statistics Using the MEANS Procedure
The SAS System Mean Debit Amount Per Customer -------------------------- SOC_SEC_NUMBER=156-45-5672 -------------------------- The MEANS Procedure Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ 27 110 4 0 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=178-42-6534 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ 26 26 1 0 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=234-74-4612 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ . . 0 1 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=434-62-1224 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ 162 1620 10 0 ------------------------------------------
The SAS System Mean Debit Amount Per Customer -------------------------- SOC_SEC_NUMBER=434-62-1234 -------------------------- The MEANS Procedure Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ . . 0 1 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=436-42-6394 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ . . 0 1 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=456-45-3462 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ 66 263 4 0 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=657-34-3245 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ . . 0 1 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=667-73-8275 -------------------------- The MEANS Procedure Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ 355 1065 3 2 ------------------------------------------ -------------------------- SOC_SEC_NUMBER=667-82-8275 -------------------------- Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT N Mean Sum N Miss ------------------------------------------ . . 0 1 ------------------------------------------For more information about PROC MEANS, see SAS Language Reference: Concepts and the Base SAS Procedures Guide.
Calculating Statistics Using the RANK Procedure |
You can use advanced statistical procedures on IMS data that is described by a view descriptor just as you would using a SAS data file. The following example uses the RANK procedure to rank checking account deposits by amount. It also assigns the variable name CRDRANK to the new item created by the RANK procedure, extracts and sorts the data, and prints the sorted output data. The view descriptor VLIB.CREDITS describes the CUSTOMER, CHCKACCT, and CHCKCRDT segments in the ACCTDBD database.
proc rank data=vlib.credits out=mydata.rankcred; var check_credit_amount; ranks crdrank; run; proc sort data=mydata.rankcred; by crdrank; run; options nodate linesize=132; proc print data=mydata.rankcred; title2 'Deposits in Ascending Order'; run;
The following output shows the result of this example.
Results of Calculating Statistics using the RANK Procedure
The SAS System Deposits in Ascending Order CHECK_ CHECK_ CHECK_ CHECK_ SOC_SEC_ CHECK_ACCOUNT_ CREDIT_ CREDIT_ CREDIT_ CREDIT_ OBS NUMBER NUMBER AMOUNT DATE TIME DESC 1 436-42-6394 345620135872 50.00 02APR95 12:16:34 ACH DEPOSIT 2 456-45-3462 345620134522 50.00 05APR95 12:14:52 ACH DEPOSIT 3 156-45-5672 345620123456 100.00 01APR95 12:24:34 ATM DEPOSIT 4 667-82-8275 382957492811 100.00 16APR95 09:21:14 ACH DEPOSIT 5 434-62-1224 345620134663 120.00 28MAR95 10:26:45 ACH DEPOSIT 6 657-34-3245 345620131455 230.00 04APR95 14:24:11 ACH DEPOSIT 7 434-62-1234 345620104732 400.00 02APR95 10:23:46 ACH DEPOSIT 8 234-74-4612 345620113263 672.32 31MAR95 ATM DEPOSIT 9 178-42-6534 745920057114 1300.00 12JUN95 14:34:12 ACH DEPOSIT 10 434-62-1224 345620134564 1342.42 22MAR95 23:23:52 ACH DEPOSIT 11 667-73-8275 345620145345 1563.23 31MAR95 15:42:43 MAIN ST BRANCH DEPOSIT 12 667-73-8275 345620154633 1563.23 31MAR95 15:42:43 BAD ACCT_NUMFor more information about PROC RANK and other advanced statistics procedures, see the Base SAS Procedures Guide.
Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.