### Example: Define a Custom Transformation

This example illustrates how to define a custom transformation by using the Variable Transformation Wizard.

Note: This example is intended for SAS programmers who are comfortable writing DATA step statements.

Kimball and Mulekar (2004) analyze the intensification tendency of Atlantic cyclones. This example is based on their analysis and graphics.

In this example, you use the Variable Transformation Wizard to write DATA step statements that create a character variable, `Tendency`, that encodes whether a storm is strengthening or weakening. The `Tendency` variable is computed by transforming a numeric variable for wind speed. For each observation of each storm, the `Tendency` variable has the value Intensifying when the wind speed is greater than it was for the previous observation, Steady when the wind speed stays the same, and Weakening when the wind speed is less than it was for the previous observation.

To transform a variable with a DATA step:

1. Open the Hurricanes data set.

The wind speed is contained in the `wind_kts` variable. Note that the values of the `wind_kts` variable are rounded to the nearest 5 knots. The name of each storm is contained in the `name` variable.

The data are grouped according to storm name, so an algorithm for creating the `Tendency` variable is as follows.

```
For each named storm:

   Compute the difference between the current wind speed and the
   previous wind speed by using the DIF function in Base SAS software.

   Specify a value for the tendency variable according to whether
   the difference in wind speed is less than zero, exactly
   zero, or greater than zero.
```

If you were to write a DATA step to create the `Tendency` variable in a data set, you might write statements like the following. The DATA step creates two new variables: a numeric variable called `dif_wind_kts` and a character variable of length 12 called `Tendency`. The BY statement is used to loop through the names of cyclones; the NOTSORTED option specifies that the observations are grouped by the `name` variable but that the groups do not appear in alphabetical order.

```
data WindTendency;
   set Hurricanes;
   by name notsorted;
   length Tendency $12;
   /* call DIF on every observation so that its queue stays current */
   dif_wind_kts = dif(wind_kts);
   if first.name then do;
      /* first observation of a storm: previous wind speed is unknown */
      Tendency = "Intensifying";
      dif_wind_kts = .;
   end;
   else do;
      if dif_wind_kts < 0 then
         Tendency = "Weakening";
      else if dif_wind_kts > 0 then
         Tendency = "Intensifying";
      else
         Tendency = "Steady";
   end;
run;
```

The `Tendency` variable is assigned the value Intensifying for the first observation of each storm because the storm system was weaker six hours earlier. The `dif_wind_kts` variable is assigned a missing value for the first observation of each storm because the previous wind speed is unknown.

For subsequent storm observations, the `dif_wind_kts` variable is assigned the results of the DIF function, which computes the difference between the current and previous values of `wind_kts`.
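The interaction between the DIF function and the `FIRST.name` reset can be checked on a small data set. The following is a minimal sketch, not part of the original example: the `ToyStorms` and `ToyDif` data sets, the storm names, the wind speeds, and the `raw_dif` variable are all hypothetical and exist only for illustration.

```sas
/* Hypothetical toy data: two storms, wind speeds in knots */
data ToyStorms;
   input name $ wind_kts;
   datalines;
Alpha 30
Alpha 35
Alpha 35
Beta 50
Beta 45
;
run;

data ToyDif;
   set ToyStorms;
   by name notsorted;
   /* DIF returns the current value minus the value from the previous
      call, so it does not know where one storm ends and the next begins */
   raw_dif = dif(wind_kts);
   dif_wind_kts = raw_dif;
   /* discard the difference that spans two different storms */
   if first.name then dif_wind_kts = .;
run;

proc print data=ToyDif noobs;
run;
```

In this sketch, `raw_dif` for the first Beta observation is 15 (50 minus the last Alpha value of 35), which compares two different storms. Setting `dif_wind_kts` to missing when `FIRST.name` is true removes that spurious value, while calling DIF on every observation keeps the remaining differences correct.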

Submitting this DATA step in the Variable Transformation Wizard is easy. No changes are required.

2. Select Analysis ► Variable Transformation from the main menu.

3. Select Custom from the Family list on the left side of the page, as shown in Figure 32.21.

4. Click Next.

The wizard displays the page shown in Figure 32.22.

5. Type the DATA step into the Variable Transformation Wizard, as shown in Figure 32.23.

Figure 32.23: A Custom Transformation

6. Click Finish.

SAS/IML Studio scans the contents of the window and determines that the `name` and `wind_kts` variables are needed by the DATA step. The input data set, `Hurricanes`, is created in the `WORK` library. The input data set contains the `name` and `wind_kts` variables.

Next, the DATA step executes on the SAS server. The DATA step creates the output data set, `WindTendency`, which contains the `dif_wind_kts` and `Tendency` variables. The `dif_wind_kts` and `Tendency` variables are copied from the output data set to the SAS/IML Studio data table.

7. Scroll the data table to the extreme right to see the newly created variables.

You can now investigate the relationship between the `Tendency` variable and other variables of interest.

8. Create a box plot of `latitude` versus `Tendency`.

The box plot in Figure 32.24 shows the distribution of latitudes for intensifying, steady, and weakening storms. Intensifying storms tend to occur at more southerly latitudes, whereas weakening storms tend to occur at more northerly latitudes.

Figure 32.24: Latitude Stratified by Intensification Tendency
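A comparable box plot can also be produced outside SAS/IML Studio. The following is a minimal sketch that assumes the standalone DATA step shown earlier has been run, so that a `WindTendency` data set containing both `latitude` and `Tendency` exists in the `WORK` library. It uses the SGPLOT procedure rather than the SAS/IML Studio graphics that generated Figure 32.24, so the appearance of the plot will differ.

```sas
/* Assumes WORK.WindTendency was created by the DATA step in this example */
proc sgplot data=WindTendency;
   /* one box of latitude values for each level of Tendency */
   vbox latitude / category=Tendency;
run;
```

The CATEGORY= option draws one box for each value of `Tendency`, which makes it possible to compare the latitude distributions of intensifying, steady, and weakening observations side by side.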