IMSTAT Procedure (Analytics)

Example 4: Retrieving Distinct Value Counts and Grouping

Details

The DISTINCT statement calculates the count of unique raw values of variables. The following example shows how the DISTINCT statement is used with the GROUPBY= option to count the unique values within groups. The results are stored in a temporary table and then the FETCH statement is used to order the results and view them.

Program

libname example sasiola host="grid001.example.com" port=10010 tag='hps';

data example.cars;
    set sashelp.cars;
run;

proc imstat;
    table example.cars;
    distinct / groupby=(origin type) temptable; 1
run;

    table example.&_templast_;
    /* columninfo; */  2
    fetch origin type _Column_ _N_ _NMiss_ /
          orderby=(origin type _N_) desc=(_N_) format to=20;  3
quit;

Program Description

  1. The DISTINCT statement calculates the unique values of all variables in the table (numeric and character) and then groups them by the unique combinations of the Origin and Type columns. The results are stored in a temporary table.
  2. The results of the COLUMNINFO statement are not shown, but in practice, listing the columns in the temporary table is helpful to understand the column names that are created.
  3. The FETCH statement retrieves the first 20 rows from the temporary table. The results are sorted by ascending values of Origin and Type, and descending values of the distinct count, _N_.

Output

In the following output, the rows are sorted lexically on values of Origin and Type, and then by the column with the greatest number of distinct values.
DISTINCT Statement Results After Grouping and Sorting
DISTINCT statement results after grouping and sorting