Calculating Statistics with IMS Data

Calculating Statistics Using the FREQ Procedure

Suppose you want to find the percentages of your accounts in each city where you have a bank so that you can decide where to increase your marketing. The following example calculates the percentages of customers for each city appearing in the IMS database AcctDBD using the view descriptor Vlib.CustInfo:
options nodate linesize=80;

proc freq data=vlib.custinfo;
   table city;
   title2 'Cities in the ACCTDBD Database';
run;
The following output shows the one-way frequency table that this example generates.
Results of Calculating Statistics Using the FREQ Procedure
                               The SAS System
                            Cities in the ACCTDBD Database


                                         CITY

                                                    Cumulative  Cumulative
            CITY               Frequency   Percent   Frequency    Percent
            --------------------------------------------------------------
            CHARLOTTESVILLE           2      20.0           2       20.0
            GORDONSVILLE              3      30.0           5       50.0
            ORANGE                    2      20.0           7       70.0
            RAPIDAN                   1      10.0           8       80.0
            RICHMOND                  2      20.0          10      100.0
For more information about the FREQ procedure, see SAS Language Reference: Concepts and the Base SAS Procedures Guide.

Calculating Statistics Using the MEANS Procedure

In an analysis of recent accounts, suppose that you also want to determine some statistics by customer. In the following example, PROC MEANS is used to generate the mean debit amount for each customer (including the number of observations (N) and the number of missing values (NMISS)):
proc sort data=vlib.trans out=mydata.trandata;
   by soc_sec_number;
run;

options nodate linesize=80;

proc means data=mydata.trandata mean 
   sum n nmiss maxdec=0;
   by soc_sec_number;
   var check_debit_amount;
   title2 'Mean Debit Amount Per Customer';
run;
In the example, the view descriptor Vlib.Trans selects CUSTOMER, CHCKACCT, and CHCKDEBT segment data from the IMS database AcctDBD. Since the AcctDBD database is an HDAM and therefore is not indexed, the data that is described by the view descriptor must be sorted before using PROC MEANS. The sorted data is stored in a SAS data file called MyData.TranData, which is then used as input to PROC MEANS.
If your database is indexed, you can use a SAS BY statement for the indexed field so that data is returned as if it is sorted. Database access methods HIDAM, HISAM, and SHISAM are indexed. If your database is not indexed, you need to sort the IMS data before using the MEANS procedure with a BY statement. Because you cannot sort data in an IMS database, you must use the OUT= option to extract data from the database so that you can pass it to the MEANS procedure.
Note: You can store the sorted data in a temporary data set if space is a concern.
Note: If the view descriptor describes a path of data that includes segments from multiple hierarchical levels, the parent segment information is repeated for each SAS observation. This can cause misleading statistical results. To avoid misleading results, perform mathematical operations using only the data in the segment at the lowest hierarchical level. You can also avoid misleading results by creating a view descriptor that describes only the data in the segment at the lowest hierarchical level.
The following output shows the output for this example.
Results of Calculating Statistics Using the MEANS Procedure
                                 The SAS System                                
                            Mean Debit Amount Per Customer

   -------------------------- SOC_SEC_NUMBER=156-45-5672--------------------------

                              The MEANS Procedure

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                             27             110     4       0
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=178-42-6534--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                             26              26     1       0
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=234-74-4612--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                              .               .     0       1
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=434-62-1224--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                            162            1620    10       0
                   ------------------------------------------


                                 The SAS System                                
                            Mean Debit Amount Per Customer

   -------------------------- SOC_SEC_NUMBER=434-62-1234--------------------------

                              The MEANS Procedure

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                              .               .     0       1
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=436-42-6394--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                              .               .     0       1
                   ------------------------------------------

   -------------------------- SOC_SEC_NUMBER=456-45-3462--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                             66             263     4       0
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=657-34-3245--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                              .               .     0       1
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=667-73-8275--------------------------

                              The MEANS Procedure

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                            355            1065     3       2
                   ------------------------------------------


   -------------------------- SOC_SEC_NUMBER=667-82-8275--------------------------

            Analysis Variable : CHECK_DEBIT_AMOUNT CHECK_DEBIT_AMOUNT
 
                                                            N
                           Mean             Sum     N    Miss
                   ------------------------------------------
                              .               .     0       1
                   ------------------------------------------
For more information about PROC MEANS, see SAS Language Reference: Concepts and the Base SAS Procedures Guide.

Calculating Statistics Using the RANK Procedure

You can use advanced statistical procedures on IMS data that is described by a view descriptor just as you would using a SAS data file. The following example uses the RANK procedure to rank checking account deposits by amount. It also assigns the variable name CRDRANK to the new item created by the RANK procedure, extracts and sorts the data, and prints the sorted output data. The view descriptor Vlib.Credits describes the CUSTOMER, CHCKACCT, and CHCKCRDT segments in the AcctDBD database.
proc rank data=vlib.credits out=mydata.rankcred;
   var check_credit_amount;
   ranks crdrank;
run;

proc sort data=mydata.rankcred;
   by crdrank;
run;

options nodate linesize=132;

proc print data=mydata.rankcred;
   title2 'Deposits in Ascending Order';
run;
The following output shows the result of this example.
Results of Calculating Statistics Using the RANK Procedure
                                               The SAS System                   
                                      
                                          Deposits in Ascending Order

                                                     CHECK_   CHECK_   CHECK_       CHECK_
                      SOC_SEC_    CHECK_ACCOUNT_    CREDIT_  CREDIT_  CREDIT_      CREDIT_
                OBS   NUMBER          NUMBER         AMOUNT    DATE     TIME         DESC

                1   436-42-6394   345620135872        50.00  02APR95  12:16:34 ACH DEPOSIT
                2   456-45-3462   345620134522        50.00  05APR95  12:14:52 ACH DEPOSIT
                3   156-45-5672   345620123456       100.00  01APR95  12:24:34 ATM DEPOSIT
                4   667-82-8275   382957492811       100.00  16APR95  09:21:14 ACH DEPOSIT
                5   434-62-1224   345620134663       120.00  28MAR95  10:26:45 ACH DEPOSIT
                6   657-34-3245   345620131455       230.00  04APR95  14:24:11 ACH DEPOSIT
                7   434-62-1234   345620104732       400.00  02APR95  10:23:46 ACH DEPOSIT
                8   234-74-4612   345620113263       672.32  31MAR95           ATM DEPOSIT 
                9   178-42-6534   745920057114      1300.00  12JUN95  14:34:12 ACH DEPOSIT
               10   434-62-1224   345620134564      1342.42  22MAR95  23:23:52 ACH DEPOSIT
               11   667-73-8275   345620145345      1563.23  31MAR95  15:42:43 MAIN ST BRANCH DEPOSIT
               12   667-73-8275   345620154633      1563.23  31MAR95  15:42:43 BAD ACCT_NUM
For more information about PROC RANK and other advanced statistics procedures, see the Base SAS Procedures Guide.