Develop Expressions for Directives

Introduction

Many directives provide tasks that incorporate the results of user-written expressions. For example, in the Transform Data directive, the Filter task enables you to specify a user-written expression that excludes source rows from the target. Use this section to help you write your own expressions.

About Implicit Assignment

In all expressions in SAS Data Loader for Hadoop, the return value of the expression is always implicitly assigned to the currently processed row of the specified column. The expression does not use the usual format variable=[expression]. Instead, the value of the first clause in the expression is written into the specified target column.
If the expression contains only one clause, then the returned value is obvious, as shown in the following example:
UPCASE(customer_first_name);
If the expression contains more than one clause, then the first clause needs to be a placeholder value, as shown in the following example:
customer_first_name;
if(LENGTH(customer_last_name) > 10) then
    customer_first_name=UPCASE(customer_first_name);
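The implicit assignment described above can be modeled outside the product. The following is a minimal Python sketch, not SAS expression syntax and not how SAS Data Loader is implemented. The helper names apply_expression and conditional_upcase and the sample rows are hypothetical. The sketch assumes that the value of the placeholder clause, after the remaining clauses run, is what gets written to the target column.

```python
# Illustrative model only: plain Python, not the SAS expression language.

def apply_expression(rows, target_column, expression):
    """Write the expression's result into target_column of every row."""
    result = []
    for row in rows:
        new_row = dict(row)                       # leave the source row untouched
        new_row[target_column] = expression(row)  # the implicit assignment
        result.append(new_row)
    return result


def conditional_upcase(row):
    """Model of the multi-clause example shown above."""
    # The placeholder clause: start from the column's current value.
    value = row["customer_first_name"]
    # The conditional clause: uppercase only when the last name is long.
    if len(row["customer_last_name"]) > 10:
        value = value.upper()
    return value


# Made-up sample rows for illustration.
rows = [
    {"customer_first_name": "Ana", "customer_last_name": "Li"},
    {"customer_first_name": "Bo", "customer_last_name": "Featherstonehaugh"},
]

updated = apply_expression(rows, "customer_first_name", conditional_upcase)
```

In this model, only the second row changes, because only its last name is longer than 10 characters.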

About Column Names in EEL Expressions

When Spark is selected as the run-time target, user-written DataFlux Expression Engine Language expressions (EEL expressions) can be applied in the Manage Columns transformations. The Manage Columns transformations are available in the Cleanse Data directive and the Transform Data directive.
In the Manage Columns transformations, EEL expressions can modify data in existing target columns, or they can generate new data for new target columns. In both cases, all of the columns that are named in the EEL expression need to appear in the Selected columns list in the Manage Columns transformation. Column names from the Available columns list cannot be named in EEL expressions.
Figure: DataFlux EEL Expressions in the Manage Columns Transformation
A second important consideration is that any column that is named in an expression has to be listed in Selected columns above the column that contains the expression. This ordering ensures that all of the variables that appear in the expression are defined before they are referenced.
In the preceding example, a new column is positioned at the bottom of the Selected columns list, as the last column in the target table. In this position, the expression can reference any of the other columns.
The preceding example also shows an expression that modifies values in the cust_type column. In that position, the expression could not reference the gender column.
If you need to reorder your columns to accommodate your expressions, then you can create a second Manage Columns transformation. In that second transformation, you can move any column into any position, as needed to meet the requirements of the target table.
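The two column rules in this section (every referenced column must be in Selected columns, and it must be listed above the column that contains the expression) can be checked before a job is run. The following is a minimal Python sketch of such a check, not part of SAS Data Loader. The function find_ordering_problems, its inputs, and the column names cust_id and full_label are hypothetical. The sketch assumes that you have already listed by hand which columns each expression references, and that a column's expression can reference the column's own values.

```python
# Illustrative check only; not a SAS Data Loader API.

def find_ordering_problems(selected_columns, references):
    """Return (expression_column, referenced_column, reason) tuples.

    selected_columns: column names in the order of the Selected columns list.
    references: maps each column that has an expression to the column
                names that the expression refers to.
    """
    position = {name: index for index, name in enumerate(selected_columns)}
    problems = []
    for column, referenced in references.items():
        if column not in position:
            problems.append((column, column, "not in Selected columns"))
            continue
        for name in referenced:
            if name == column:
                # Assumption: an expression can read its own column.
                continue
            if name not in position:
                problems.append((column, name, "not in Selected columns"))
            elif position[name] > position[column]:
                problems.append((column, name, "listed below the expression column"))
    return problems


# Hypothetical layout: gender sits below cust_type, and a new column is last.
selected = ["cust_id", "cust_type", "gender", "full_label"]
references = {
    "cust_type": ["gender"],                # breaks the ordering rule
    "full_label": ["cust_type", "gender"],  # last column, so both are defined
}

problems = find_ordering_problems(selected, references)
```

Here the check reports only the cust_type expression, because gender is listed below it. The expression in the last column passes, because every column that it references is listed above it.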