Split Columns Task

About the Split Columns Task

The Split Columns task creates an output data set by splitting the unique combination of values of the selected columns in the input data set into multiple columns.
This task is useful when you have a data set in which one column contains multiple observations for different subgroups and you want to split the subgroup measures into separate columns. For example, you could split a column that contains the monthly temperature readings for various locations across a geographic region. The output data set would contain the monthly temperature readings for each location in a column for each month.

Example: Splitting the Height Column in the CLASSFIT Data Set

  1. In the Tasks section, expand the Data folder and double-click Stack/Split Columns. The user interface for the Stack/Split Columns task opens.
  2. On the Data tab, select SASHELP.CLASSFIT as the input data set.
    Here are the Name, Sex, Age, Height, and Weight columns in the Sashelp.Classfit data set.
    Subset of Sashelp.Classfit Data Set
  3. From the Method drop-down list, select Split a column.
  4. Assign columns to these roles:
    Role
    Column Name
    Column to split
    Age
    Case Identifier
    Height
    Level Identifier
    Sex
  5. Click the Output tab.
    • Clear the Use prefix check box.
    • Select the Select a variable that contains the names for the new variables check box, and assign Sex to the New column names role.
  6. To run the task, click Submit SAS Code.
The resulting output data set contains three columns: Height, F, and M. From this output, you can see that one 11-year-old female has a height of 51.3 inches. No males are 51.3 inches. A 15-year-old female and a 15-year-old male are both 66.5 inches.
Resulting Work.Split Data Set

Split a Column

To split a column:
  1. After you select your input data source, select Split a column from the Method drop-down list.
  2. Assign variables to these roles:
    Role
    Description
    Roles
    Column to split
    specifies the variable that contains the values that you want to split into multiple columns.
    Case identifier
    identifies the values that belong to a particular case.
    Level identifier
    identifies the levels of the column to split. Each new variable contains the values of one level of the level identifier.
    Additional Roles
    Group analysis by
    specifies the variable to use to form BY groups.
  3. On the Output tab, set these options:
    Option Name
    Description
    Construct New Variable Names
    Use prefix
    You can specify a prefix to use in constructing the names for the transposed variables in the output data set. When you use a prefix, the variable name begins with the prefix value and is followed by the number 1, 2, and so on. To create a variable name with the prefix and the value of the selected variable, select Select a variable that contains the names for the new variables.
    Select a variable that contains the names of the new variables
    The variable that you assign to the New column names role is used to name the new columns in the output data set.
    Show Output Data
    Show output data
    specifies whether to include the output data in the results that appear on the Results tab. You can include all or a subset of the output data. The task always creates the output data set that appears on the Output Data tab. This data set is also saved to the specified location.