Calculating Statistics for CA-Datacom/DB Data

Using FREQ, MEANS, and RANK Procedures

You can use statistical procedures with CA-Datacom/DB data that is described by view descriptors just as you would with SAS data files. This section shows simple examples using the FREQ and MEANS procedures and a more advanced example using the RANK procedure.

Using the FREQ Procedure

Suppose you want to find what percentage of your invoices went to each country so that you can decide where to increase your overseas marketing. The following example calculates the percentage of invoices for each country appearing in the CA-Datacom/DB table Invoice using the view descriptor Vlib.Inv:
proc freq data=vlib.inv;
   tables country;
   title 'Data Described by VLIB.INV';
run;
The following output shows the one-way frequency table this example generates.
Results of Using the FREQ Procedure
                           Data Described by VLIB.INV                          1
                                    COUNTRY

                                                 Cumulative  Cumulative
         COUNTRY            Frequency   Percent   Frequency    Percent
         --------------------------------------------------------------
         Argentina                 2      11.8           2       11.8
         Australia                 1       5.9           3       17.6
         Brazil                    4      23.5           7       41.2
         USA                      10      58.8          17      100.0
For more information about the FREQ procedure, see the Base SAS Procedures Guide.

Using the MEANS Procedure

Suppose you want to determine some statistics for each of your USA customers. The view descriptor Vlib.UsaORdr accesses records from the Order table that have a SHIPTO value beginning with a 1, indicating a USA customer.
The following example generates the mean and sum of the length of material ordered and the fabric charges for each USA customer. Also included are the number of rows (N) and the number of missing values (NMISS).
proc means data=vlib.usaordr mean sum n nmiss maxdec=0;
   by shipto;
   var length fabricch;
   title 'Data Described by VLIB.USAORDR';
run;
The BY statement causes the interface view engine to generate ordering criteria so that the data is sorted. The following output shows some of the information produced by this example.
Results of Using the MEANS Procedure
                         Data Described by VLIB.USAORDR                        1
-------------------------------- SHIPTO=14324742 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0          1095          4380
         FABRICCH  FABRICCHARGES   2      2       1934460       3868920
         --------------------------------------------------------------

-------------------------------- SHIPTO=14898029 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          2      0          2500          5000
         FABRICCH  FABRICCHARGES   2      0       1400825       2801650
         --------------------------------------------------------------

-------------------------------- SHIPTO=15432147 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0           725          2900
         FABRICCH  FABRICCHARGES   2      2        252149        504297
         --------------------------------------------------------------

-------------------------------- SHIPTO=18543489 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          6      0           303          1820
         FABRICCH  FABRICCHARGES   4      2      11063836      44255344
         --------------------------------------------------------------

-------------------------------- SHIPTO=19783482 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0           450          1800
         FABRICCH  FABRICCHARGES   4      0        252149       1008594
         --------------------------------------------------------------

-------------------------------- SHIPTO=19876078 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          2      0           690          1380
         FABRICCH  FABRICCHARGES   0      2             .             .
         --------------------------------------------------------------
For more information about the MEANS procedure, see the Base SAS Procedures Guide.

Using the RANK Procedure

You can use advanced statistics procedures with CA-Datacom/DB data that is described by a view descriptor. The following example uses the RANK procedure with data that is described by the view descriptor Vlib.Emps to calculate the order of birthdays for a set of employees. This example creates a SAS data file MyData.RankEx from the view descriptor Vlib.Emps. It assigns the column name DATERANK to the new field created by the procedure. (The Vlib.Emps view descriptor includes a WHERE clause to select only the employees whose job code is 602.)
proc rank data=vlib.emps out=vlib.rankexam;
   var birthdat;
   ranks daterank;
run;
proc print data=vlib.rankexam;
   title 'Order of Employee Birthdays';
run;
Vlib.Emps is based on the CA-Datacom/DB table Employees. The following output shows the result of this example.
Results of Using the RANK Procedure
                          Order of Employee Birthdays                          1
     OBS     EMPID    JOBCODE    BIRTHDAT    LASTNAME              DATERANK

       1    456910       602     24SEP53     ARDIS                     5
       2    237642       602     13MAR54     BATTERSBY                 6
       3    239185       602     28AUG59     DOS REMEDIOS              7
       4    321783       602     03JUN35     GONZALES                  2
       5    120591       602     12FEB46     HAMMERSTEIN               4
       6    135673       602     21MAR61     HEMESLY                   8
       7    456921       602     12MAY62     KRAUSE                    9
       8    457232       602     15OCT63     LOVELL                   11
       9    423286       602     31OCT64     MIFUNE                   12
      10    216382       602     24JUL63     PURINTON                 10
      11    234967       602     21DEC67     SMITH                    13
      12    212916       602     29MAY28     WACHBERGER                1
      13    119012       602     05JAN46     WOLF-PROVENZA             3
For more information about the RANK procedure and other advanced statistics procedures, see the Base SAS Procedures Guide.