Prepare Data

To prepare data:
  1. Select a source table from the navigation pane, right-click, and select Prepare Data.
    If the table is a SAS data set, you can click Preview Data. The preview lets you confirm that you have Read access to the data, and lets you filter and sort the data, before you begin the data preparation.
  2. On the Source Columns page, deselect the check boxes for any columns that you do not want to include in the prepared data.
  3. On the Joined Tables page, click Add to select a table to join. For more information about how to use this page, see Joining Tables.
  4. On the Calculated Columns page, click Add to add new columns to the prepared data. For more information about how to use this page, see Adding Calculated Columns.
  5. On the Output Columns page, remove, reorder, or edit the output column information. You can specify the output column name, description, format, and length.
  6. On the Row Filters page, add filters to subset the input data. Click Add to add a new filter. Select the column name to filter on, select the filter criteria, and enter the filter value. If more than one filter is added, the filters are applied using AND logic.
  7. On the Sort Order page, select the column that you want to sort by. The default sort order is ascending. Use the menu beside the selected column name to set the sort order to descending.
  8. On the Data Output page, set the following parameters:
    | Parameter | Sample Value | Description |
    |---|---|---|
    | Output table | tablename | The output table name is populated automatically with the source table name. Click Browse to select a different table or type the table name that you want to use. |
    | Location | /SharedData | Click Browse to specify a folder for the output table. The button becomes active when the Library value is changed from WORK to another library. |
    | Library | WORK | Click Browse to specify a library for the output table. |
    | Type of output | Table | If you select Table, then the size of the data is calculated and distributed evenly in HDFS with an optimal block size. If you select View, then the calculation for an optimal block size and even distribution is not possible. |
    | HDFS output path | /user/ | Enter the fully qualified path in HDFS to use for storing the prepared data. |
    | HDFS filename | tablename | The filename for the table is automatically populated. It must match the name of the output table. |
    | Description | Prepared data for tablename | Specify a description to associate with the prepared data. The description is displayed beside the table name in the explorer interface. |
  9. Click Submit.
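
The effect of the column selection, row filter, and sort order settings above can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's implementation: the names `prepare`, `check_output_name`, and `CRITERIA`, the set of filter criteria, and the order in which the operations run are all assumptions made for the example.

```python
from operator import eq, ne, gt, ge, lt, le

# Hypothetical mapping from filter criteria to comparison functions.
CRITERIA = {"=": eq, "!=": ne, ">": gt, ">=": ge, "<": lt, "<=": le}


def prepare(rows, columns, filters=(), sort_by=None, descending=False):
    """Filter rows (AND logic), sort them, then keep only the selected columns."""
    # A row is kept only if it satisfies every filter (AND logic).
    kept = [
        row for row in rows
        if all(CRITERIA[op](row[col], value) for col, op, value in filters)
    ]
    # Ascending is the default sort order; descending must be requested.
    if sort_by is not None:
        kept.sort(key=lambda row: row[sort_by], reverse=descending)
    # Drop the columns that were deselected (assumed to happen last).
    return [{col: row[col] for col in columns} for row in kept]


def check_output_name(output_table, hdfs_filename):
    """Return True if the HDFS filename matches the output table name."""
    return hdfs_filename == output_table


# Made-up sample data for the illustration.
rows = [
    {"name": "A", "region": "East", "sales": 120},
    {"name": "B", "region": "West", "sales": 300},
    {"name": "C", "region": "East", "sales": 450},
]

result = prepare(
    rows,
    columns=["name", "sales"],
    filters=[("region", "=", "East"), ("sales", ">", 100)],
    sort_by="sales",
    descending=True,
)
print(result)
```

With two filters, only the rows that satisfy both conditions remain, which mirrors the AND logic described for the Row Filters page. The sort falls back to ascending unless descending is requested, as on the Sort Order page.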