The TOPK statement calculates and selects the top-k and bottom-k distinct values of a variable based on a user-specified ranking order. The distinct values can be reported as raw or formatted values. The ranking can be based on the raw value, the formatted value, the frequency count, or a score calculated from the values of a weight variable. You can also specify an aggregate function to roll up multiple weight values into a single score for each distinct value.
specifies one or more numeric variables. If you do not specify this option, then all numeric variables in the table are used.
specifies the aggregation method by which the WEIGHT= variable values are rolled up into a rank order score for each distinct value. If no WEIGHT= variable is specified, then this option is ignored.
MAX | specifies to use the maximum value of the weight values |
MEAN | specifies to use the arithmetic mean of the weight values |
MIN | specifies to use the minimum value of the weight values |
SUM | specifies to use the sum of the weight values |
Alias | AGG= |
Default | SUM |
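As an illustration of the aggregation option, the following sketch ranks the distinct values of x1 by the mean of a weight variable instead of the default sum. The table lasr1.table1 and the variable x1 are taken from the example in this section; the weight variable w is a hypothetical name, and the step assumes that the lasr1 libref already points to a running server.

```sas
proc imstat data=lasr1.table1;
   /* w is a hypothetical numeric weight variable.            */
   /* The score for each distinct value of x1 is the mean of  */
   /* the w values observed with that distinct value.         */
   topk x1 / weight=w agg=mean;
quit;
```

Replacing MEAN with MAX, MIN, or SUM selects one of the other aggregation methods listed above.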
specifies the formats for the variables. If you do not specify the FORMATS= option, or if you omit the entry for a variable, the default format is applied for that variable.
Example | proc imstat data=lasr1.table1;
topk x1 x2 / formats=("10.2", "10.2");
quit;
|
specifies the numeric frequency variable to use for calculating the rank order score for distinct values. This option is valid only when ORDER=FREQ or when AGGREGATE= is N, SUM, or MEAN.
specifies the maximum number of distinct values to include in the top-k list.
Alias | TOPK= |
Default | 1 |
Range | 1 to 1000 |
specifies the maximum number of distinct values to include in the bottom-k list.
Alias | BOTTOMK= |
Default | 1 |
Range | 1 to 1000 |
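The two list sizes can be requested together. The following is a minimal sketch that uses the aliases documented above and reuses the table and variable names from the FORMATS= example, so those names are placeholders rather than a real data set.

```sas
proc imstat data=lasr1.table1;
   /* Request up to 5 top and up to 5 bottom distinct values  */
   /* for each of x1 and x2. No ORDER= or WEIGHT= is given,   */
   /* so the default rank order applies.                      */
   topk x1 x2 / topk=5 bottomk=5;
quit;
```

Both values must fall within the documented range of 1 to 1000. If a variable has fewer distinct values than requested, the lists are presumably shorter, because the options specify a maximum.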
specifies that the levels of the GROUPBY variables are to be arranged in descending order.
Alias | DESC |
specifies the rank ordering to apply to the distinct values when no WEIGHT= variable is specified. The following rank orders are valid in the TOPK request.
FREQ | specifies to order by frequency count |
VALUE | specifies to order by raw or formatted values of the variable |
WEIGHT | specifies to order by the aggregate values of the WEIGHT= variable |
Default | FREQ |
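To rank by the values themselves rather than by how often they occur, the rank order can be switched from its default. This sketch again uses the placeholder table and variable names from the FORMATS= example.

```sas
proc imstat data=lasr1.table1;
   /* Order the distinct values of x1 by value instead of by  */
   /* frequency count (the default, ORDER=FREQ).              */
   topk x1 / order=value;
quit;
```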
specifies the numeric weight variable to use for calculating the rank order score. If you specify both ORDER= and WEIGHT=, then the WEIGHT= variable takes priority over the ORDER= option.
saves the result table so that you can use it in other IMSTAT procedure statements like STORE, REPLAY, and FREE. The value for table-name must be unique within the scope of the procedure execution. The name of a table that has been freed with the FREE statement can be used again in subsequent SAVE= options.
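A saved result might be used as sketched below. The table name tk1 is hypothetical, and the sketch assumes that REPLAY and FREE each accept the saved table name as their only argument. Check the documentation for those statements before relying on this form.

```sas
proc imstat data=lasr1.table1;
   /* Save the TOPK results under the hypothetical name tk1.  */
   topk x1 / save=tk1;
   /* Display the saved results again (assumed syntax).       */
   replay tk1;
   /* Release the saved results. The name tk1 can then be     */
   /* used again in a later SAVE= option.                     */
   free tk1;
quit;
```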
specifies either a quoted string that contains the SAS expression that defines the temporary variables or a file reference to an external file with the SAS statements.
Alias | TE= |
specifies the list of temporary variables for the request. Each temporary variable must be defined through SAS statements that you supply with the TEMPEXPRESS= option.
Alias | TN= |
ODS Table Name | Description | Option |
---|---|---|
TOPK | Top/Bottom K Distinct Values | Default |
BTMK | Top/Bottom K Distinct Values | Default |
TOPKMISC | Misc. Info for Top/Bottom K Distinct Values | Default |