Rank Data Task

About the Rank Data Task

The Rank Data task computes ranks for one or more numeric variables across the rows in a table and includes the ranks in an output table.
For example, you might want to rank the sales for each product that your company sells. In this case, the ranking variable would show the order of product sales. The product with the highest number of sales would be ranked first.

Example: Ranking Students by Age and Height

In this example, you want to rank the students in your class by age and height.
To create this example:
  1. In the Tasks section, expand the Data folder and double-click Rank Data. The user interface for the Rank Data task opens.
  2. On the Data tab, select the SASHELP.CLASS data set.
  3. Assign columns to these roles:
    Role               Column Name
    Columns to rank    Height
    Rank by            Age
  4. To run the task, click Submit SAS code.
The Rank Data task creates an output data set. In SAS Studio, this data set opens on the WORK.Rank tab. The data set contains an additional rank_Height column, which shows where each student ranks by height within his or her age group. For example, in the 11-year-old age group, Joyce is ranked number 1. In the 12-year-old age group, Louise is ranked number 1.
Figure: Output Data Set Created by the Rank Data Task
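The grouping behavior in this example can be sketched outside SAS. The following Python snippet is only an illustration of ranking Height within each Age group. It is not the code that the task generates. The rows are assumed to match the 11-year-old and 12-year-old students in SASHELP.CLASS, and the function name rank_within_groups is hypothetical. The sketch ranks from smallest to largest and does not handle ties or missing values.

```python
from collections import defaultdict

# (name, age, height) rows; assumed to mirror part of SASHELP.CLASS
students = [
    ("Joyce", 11, 51.3),
    ("Thomas", 11, 57.5),
    ("James", 12, 57.3),
    ("Jane", 12, 59.8),
    ("John", 12, 59.0),
    ("Louise", 12, 56.3),
    ("Robert", 12, 64.8),
]


def rank_within_groups(rows):
    """Return {name: rank of height within the age group}; smallest height is rank 1."""
    groups = defaultdict(list)
    for name, age, height in rows:
        groups[age].append((height, name))

    ranks = {}
    for members in groups.values():
        # Sort each age group by height and number the students from 1.
        for position, (_, name) in enumerate(sorted(members), start=1):
            ranks[name] = position
    return ranks


rank_height = rank_within_groups(students)
for name, age, height in students:
    print(age, name, height, rank_height[name])
```

In this sketch, Joyce and Louise each receive rank 1 in their age groups because they have the smallest Height values in those groups.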

Assigning Data to Roles

To run the Rank Data task, you must assign a column to the Columns to rank role.
Role
Description
Columns to rank
Each column that is assigned to this role is ranked. You must assign at least one column to this role. By default, the ranking column is given the name rank_column-name, where column-name is the name of the original column.
Rank by
When you assign one or more columns to this role, the input table is sorted by the selected column or columns and rankings are calculated within each group.

Setting Options

You must select at least one output option.
Option Name
Description
Options
Ranking method
specifies the method to use when ranking the data. Here are the valid values:
None
does not use a method to rank the data.
Percentile ranks
partitions the original values into 100 groups, in which the smallest values receive a percentile value of 0 and the largest values receive a percentile value of 99.
Deciles
partitions the original values into 10 groups, in which the smallest values receive a decile value of 0 and the largest values receive a decile value of 9.
Quartiles
partitions the original values into four groups, in which the smallest values receive a quartile value of 0 and the largest values receive a quartile value of 3.
Group = n (NTILES)
partitions the original values into n groups, in which the smallest values receive a value of 0 and the largest values receive a value of n–1. Specify the value of n in the Number of groups box.
Fractional ranks with denominator = n
computes fractional ranks by dividing each rank by the denominator n, where n is the number of observations that have nonmissing values of the ranking variable.
Fractional ranks with denominator = n+1
computes fractional ranks by dividing each rank by the denominator n+1, where n is the number of observations that have nonmissing values of the ranking variable.
Percents
divides each rank by the number of observations that have nonmissing values of the variable and multiplies the result by 100 to get a percentage.
Normal scores (Blom formula), Normal scores (Tukey formula), Normal scores (van der Waerden formula)
computes normal scores from the ranks. The resulting variables appear normally distributed. Here are the formulas:
Blom formula
$y_i = \Phi^{-1}\left(\frac{r_i - 3/8}{n + 1/4}\right)$
Tukey formula
$y_i = \Phi^{-1}\left(\frac{r_i - 1/3}{n + 1/3}\right)$
van der Waerden formula
$y_i = \Phi^{-1}\left(\frac{r_i}{n + 1}\right)$
In these formulas, $\Phi^{-1}$ is the inverse cumulative normal (PROBIT) function, $r_i$ is the rank of the ith observation, and $n$ is the number of nonmissing observations for the ranking variable.
Note: If you set the "If values tie, use" option, the Rank Data task computes the normal score from the ranks based on non-tied values and applies the ties specification to the resulting score.
Savage scores (exponential)
computes Savage (or exponential) scores from the ranks.
Note: If you set the "If values tie, use" option, the Rank Data task computes the Savage score from the ranks based on non-tied values and applies the ties specification to the resulting score.
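The arithmetic behind several of these ranking methods can be sketched in Python. This is a minimal illustration, not the task's implementation. The fractional, percent, and normal score functions follow the descriptions above. The grouping rule floor(rank * groups / (n + 1)) and the Savage score formula are assumptions based on the commonly documented definitions, because this section does not state them. All function names are hypothetical. Each function takes a rank from 1 to n, where n is the number of nonmissing observations.

```python
import math
from statistics import NormalDist


def group_value(rank, n, groups):
    # Assumed grouping rule: results run from 0 up to groups - 1.
    return math.floor(rank * groups / (n + 1))


def fraction_n(rank, n):
    # Fractional rank with denominator n.
    return rank / n


def fraction_n_plus_1(rank, n):
    # Fractional rank with denominator n + 1.
    return rank / (n + 1)


def percent(rank, n):
    # Rank divided by n, expressed as a percentage.
    return 100 * rank / n


def normal_score(rank, n, formula="blom"):
    # Apply the inverse cumulative normal function to an adjusted rank fraction.
    if formula == "blom":
        p = (rank - 3 / 8) / (n + 1 / 4)
    elif formula == "tukey":
        p = (rank - 1 / 3) / (n + 1 / 3)
    elif formula == "vw":  # van der Waerden
        p = rank / (n + 1)
    else:
        raise ValueError(f"unknown formula: {formula}")
    return NormalDist().inv_cdf(p)


def savage_score(rank, n):
    # Assumed definition: sum of 1/j for j = n - rank + 1 .. n, minus 1.
    return sum(1 / j for j in range(n - rank + 1, n + 1)) - 1


n = 10
for rank in range(1, n + 1):
    print(
        rank,
        group_value(rank, n, 4),           # quartile group
        round(fraction_n_plus_1(rank, n), 3),
        round(percent(rank, n), 1),
        round(normal_score(rank, n), 3),   # Blom formula
        round(savage_score(rank, n), 3),
    )
```

With 10 observations and four groups, the smallest rank falls in group 0 and the largest rank falls in group 3, which matches the Quartiles description. Under the assumed Savage definition, the scores sum to zero across all ranks.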
If values tie, use:
specifies how to compute normal scores or ranks for tied data values.
Mean (Midrank)
assigns the mean of the corresponding ranks or normal scores.
High rank
assigns the largest of the corresponding ranks or normal scores.
Low rank
assigns the smallest of the corresponding ranks or normal scores.
Dense rank
computes scores and ranks by treating tied values as a single-order statistic. For the default method, ranks are consecutive integers that begin with the number one and end with the number of unique, nonmissing values of the variable that is being ranked. Tied values are assigned the same rank.
Rank order
specifies whether to list the values from smallest to largest or from largest to smallest.
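The difference between the tie-handling choices is easiest to see on a small list that contains a tie. The following Python sketch is an illustration under stated assumptions, not the task's implementation. The function name rank_with_ties and its parameters are hypothetical, missing values are not handled, and the descending flag stands in for the Rank order option.

```python
def rank_with_ties(values, ties="mean", descending=False):
    """Rank values, resolving ties by the mean, high, low, or dense rule."""
    distinct = sorted(set(values), reverse=descending)
    rank_of = {}
    already_ranked = 0  # observations that precede the current tied group

    for dense_rank, value in enumerate(distinct, start=1):
        count = values.count(value)
        low = already_ranked + 1       # smallest rank in the tied group
        high = already_ranked + count  # largest rank in the tied group
        if ties == "mean":
            rank_of[value] = (low + high) / 2
        elif ties == "high":
            rank_of[value] = high
        elif ties == "low":
            rank_of[value] = low
        elif ties == "dense":
            rank_of[value] = dense_rank
        else:
            raise ValueError(f"unknown ties rule: {ties}")
        already_ranked += count

    return [rank_of[value] for value in values]


data = [10, 20, 20, 30]
print(rank_with_ties(data, "mean"))   # [1.0, 2.5, 2.5, 4.0]
print(rank_with_ties(data, "high"))   # [1, 3, 3, 4]
print(rank_with_ties(data, "low"))    # [1, 2, 2, 4]
print(rank_with_ties(data, "dense"))  # [1, 2, 2, 3]
print(rank_with_ties(data, "mean", descending=True))  # [4.0, 2.5, 2.5, 1.0]
```

Only the dense rule ends at the number of unique values (3 in this list). The other rules end at the number of observations (4).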
Results
Location to save output data
specifies the location of the output table. By default, the table is saved in the temporary Work library.
Include ranked columns
specifies that the output table contains the original columns as well as the ranked columns. If you want to replace the original columns with the ranked columns, deselect the Include ranked columns check box.
By default, the ranked column is given the name rank_column-name, where column-name is the name of the original column.