SYSTEM 2000 Data in SAS Programs

Calculating Statistics

The statistical procedures FREQ, MEANS, and RANK can be used with SYSTEM 2000 data.

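The following program uses PROC FREQ to calculate the percentage of employees who hold each of the degrees recorded in the EMPLOYEE database. The program reads the data through the view descriptor VLIB.EMPEDUC. One-Way Frequency Table for Item DEGREE in View Descriptor VLIB.EMPEDUC shows the results. The percentages are based on the nonmissing values of DEGREE; the number of missing values is reported separately at the bottom of the table.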

    proc freq data=vlib.empeduc;
       tables degree;
       title2 'Data Described by VLIB.EMPEDUC';
    run;

One-Way Frequency Table for Item DEGREE in View Descriptor VLIB.EMPEDUC

                       Data Described by VLIB.EMPEDUC                   1
 
                             DEGREE/CERTIFICATE
 
                                          Cumulative  Cumulative
           DEGREE    Frequency   Percent   Frequency    Percent
           -----------------------------------------------------
           AA               5       7.9           5        7.9
           BA              12      19.0          17       27.0
           BS              23      36.5          40       63.5
           HIGH SC          6       9.5          46       73.0
           MA               3       4.8          49       77.8
           MBA              1       1.6          50       79.4
           MS               9      14.3          59       93.7
           PHD              4       6.3          63      100.0
 
                           Frequency Missing = 12
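
The table above leaves the 12 missing DEGREE values out of the percentages. If missing values should instead count as a category of their own, the MISSING option in the TABLES statement is one way to do that. The following is a minimal sketch that assumes the same view descriptor, VLIB.EMPEDUC; the title text is illustrative only.

    proc freq data=vlib.empeduc;
       /* MISSING treats missing DEGREE values as a valid level, */
       /* so they are included in the frequencies and percents.  */
       tables degree / missing;
       title2 'Degrees, Including Missing Values';
    run;

With this option, the percentages would be based on all 75 observations (63 nonmissing plus 12 missing) rather than on the 63 nonmissing values only.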

For more information about the FREQ procedure, see the Base SAS Procedures Guide.

To analyze employee background further, suppose you also want statistics on the employees' skill types and years of experience. The view descriptor VLIB.EMPSKIL accesses these values from the EMPLOYEE database, and the following program uses PROC MEANS to compute the mean and the sum of the years of experience for each skill type. The number of observations (N) and the number of missing values (NMISS) are also reported.

Notice that the BY statement causes the interface view engine to generate a SYSTEM 2000 ordering-clause so that the data is sorted by skill type. Statistics for Skill Type and Years of Experience shows some of the results produced by this program.

    proc means data=vlib.empskil mean sum n nmiss maxdec=0;
       by skilltyp;
       var years;
       title2 'Data Described by VLIB.EMPSKIL';
    run;

Statistics for Skill Type and Years of Experience

                      Data Described by VLIB.EMPSKIL                      1
 
                   Analysis Variable : YEARS YEARS OF EXPERIENCE
 
 
--------------------------- SKILL TYPE=  -----------------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                     0      6             .             .
                   --------------------------------------
 
 
--------------------------- SKILL TYPE=ACCOUNTING --------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                     6      0             8            47
                   --------------------------------------
 
 
---------------------------- SKILL TYPE=ASSEMBLER --------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                    14      0            10           141
                   --------------------------------------
 
--------------------------- SKILL TYPE=CARTOON ART -------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                     1      0             1             1
                   --------------------------------------
 
 
----------------------------- SKILL TYPE=CHINESE ---------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                     1      0             8             8
                   --------------------------------------
 
------------------------------ SKILL TYPE=COBOL ----------------------------
 
 
                     N  Nmiss          Mean           Sum
                   --------------------------------------
                    12      0             7            88
                   --------------------------------------
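
The BY statement needs the observations to arrive ordered by SKILLTYP, which is why the ordering-clause is generated. As a sketch of an alternative, the CLASS statement in PROC MEANS groups observations without requiring sorted input and reports all groups in a single table. This example assumes the same view descriptor, VLIB.EMPSKIL. By default, PROC MEANS drops observations whose class variable is missing, so the MISSING option is added here to keep the group with a blank skill type.

    proc means data=vlib.empskil mean sum n nmiss maxdec=0 missing;
       /* CLASS groups by skill type; the input does not have to be sorted */
       class skilltyp;
       var years;
       title2 'Data Described by VLIB.EMPSKIL';
    run;

Whether BY or CLASS suits a given job better depends on the size of the data and the output layout you want; this sketch shows only that both statements can produce the same per-group statistics.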

For more information about the MEANS procedure, see the Base SAS Procedures Guide.

You can also use more advanced statistics procedures with SYSTEM 2000 data. The following program uses PROC RANK with data described by the view descriptor VLIB.EMPBD to rank the birthdays of a group of employees from earliest to latest, and it assigns the name DATERNK to the new variable that PROC RANK creates. (The VLIB.EMPBD view descriptor includes a SYSTEM 2000 where-clause that selects only the employees in the Marketing Department.) Ranking of Employee Birthdays shows some of the results from this program.

    proc rank data=vlib.empbd out=mydata.rankexm;
       var birthday;
       ranks daternk;
    run;

    proc print data=mydata.rankexm;
       title2 'Order of Marketing Employee Birthdays';
    run;

Ranking of Employee Birthdays

                 Order of Marketing Employee Birthdays                     1
 
           OBS    LASTNAME      FIRSTNME       BIRTHDAY    DATERNK
 
             1    AMEER         DAVID          10OCT51       14.0
             2    BROOKS        RUBEN R.       25FEB52       15.0
             3    BROWN         VIRGINA P.     24MAY46        9.0
             4    CHAN          TAI            04JUL46       10.0
             5    GARRETT       OLAN M.        23JAN35        2.0
             6    GIBSON        GEORGE J.      23APR46        8.0
             7    GOODSON       ALAN F.        21JUN50       13.0
             8    JUAREZ        ARMANDO        28MAY47       11.0
             9    LITTLEJOHN    FANNIE         17MAY54       17.0
            10    RICHARDSON    TRAVIS Z.      30NOV37        4.0
            11    RODRIGUEZ     ROMUALDO R     09FEB29        1.0
            12    SCHOLL        MADISON A.     19MAR45        7.0
            13    SHROPSHIRE    LELAND G.      04SEP49       12.0
            14    SMITH         JERRY LEE      13SEP42        5.5
            15    VAN HOTTEN    GWENDOLYN      13SEP42        5.5
            16    WAGGONNER     MERRILEE D     27APR36        3.0
            17    WILLIAMSON    JANICE L.      19MAY52       16.0
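
In the output above, SMITH and VAN HOTTEN share a birthday, and each receives the rank 5.5, the mean of ranks 5 and 6. This is the default treatment of ties in PROC RANK. The sketch below assumes that you would rather give tied values the lower rank and rank the most recent birthday first. TIES= and DESCENDING are PROC RANK options; the output data set name MYDATA.RANKDSC and the title text are hypothetical.

    /* DESCENDING ranks the latest birthday as 1;           */
    /* TIES=LOW gives tied values the lower of their ranks. */
    proc rank data=vlib.empbd out=mydata.rankdsc ties=low descending;
       var birthday;
       ranks daternk;
    run;

    /* Sort by the new rank so that the listing reads in rank order */
    proc sort data=mydata.rankdsc;
       by daternk;
    run;

    proc print data=mydata.rankdsc;
       title2 'Marketing Employee Birthdays, Most Recent First';
    run;

Because PROC RANK writes its results to a SAS data file, the PROC SORT step operates on that file and not on the SYSTEM 2000 database.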

For more information about the RANK procedure and other advanced statistics procedures, see the Base SAS Procedures Guide.
