Frequently Asked Questions

This section introduces frequently asked questions along with exercises. These topics include examples that illustrate specific concepts related to the Expression Engine Language (EEL).

Testing and Evaluating

In order to test an expression before running a data job, you must create sample rows.

Exercise 1: How do I test an expression without using a table to create rows?

In the Expression Properties dialog box, select Generate rows when no parent is specified.
This creates sample empty rows in the Preview tab.
Note: If you do not select Generate rows when no parent is specified, and you do not have output specified in the post-processing step, no data is output.

Exercise 2: Is it possible to create test rows with content rather than empty rows?

This involves creating extra rows with the PUSHROW() function in the Pre-expression section.
Note: To use the PUSHROW() function, do not select Generate rows when no parent is specified.
Consider the code example below:
// Pre-Expression 
string name // the name of the person 
string address // the address of the person 
integer age // the age of the person 

// Content for the first row 
name="Bob" 
address="106 NorthWoods Village Dr" 
age=30

 // Create an extra row for the 
// fields defined above 
pushrow() 

// The content for the extra row 
name="Adam" 
address="100 RhineStone Circle" 
age=32 

// Create an extra row for the 
// fields defined above 
pushrow() 

// The content for extra row 
name="Mary" 
address="105 Liles Rd" 
age=28 

// Create an extra row for the 
// fields defined above 
pushrow()
The PUSHROW() function creates the rows.

Selecting Output Fields

Some fields are used for calculation or to contain intermediate values, but are not meaningful in the output. As you test or build scripts, you might need to exclude fields from the output.

Exercise: How do I exclude some fields in the expression from being listed in the output?

To accomplish this, use the keyword hidden before declaring a variable.
Consider the following example:
// Pre-Expression 
// This declares a string 
// type that will be hidden 
hidden string noDisplay 

// Expression 
// Assigns any value to the string type 
noDisplay='Hello World But Hidden' 
The noDisplay string field is not output to the Preview tab.
To verify this, remove the parameter hidden from the string noDisplay declaration. Observe that noDisplay is output.

Working with Subsets

When you work with large record sets in the Data Job Editor, it can be time-consuming to test new jobs. You can shorten this time when you build your expression and test your logic against a subset of large record sets.

Exercise 1: Apply Your Expression to a Subset of Your Data By Controlling the Number of Records Processed.

Consider the following example:
// Pre-Expression 

// We make this variable hidden so it is 
// not output to the screen 
hidden integer count 

count=0 
hidden integer subset_num 

// the size of the subnet 
subset_num=100 

// This function estimates and sets the # of 
// records that this step will report 
rowestimate(subset_num) 

// Expression 
if(count==subset_num) 
     seteof() 
else 
     count=count + 1
Keep track of the number of records output with the integer variable count. Once count matches the size of the subset, use the SETEOF() function to prevent any more rows from being created.
The exact syntax for SETEOF() function is:
boolean seteof(boolean) 
When SETEOF() is called, the node does not read any more rows from the parent node. If Generate rows when no parent is specified is checked, the node stops generating rows. Furthermore, if any rows have been pushed using PUSHROW(), they are discarded, and further calls to PUSHROW() have no effect. The exception to this is if SETEOF(true) is called. In this case, any pushed rows (whether pushed before or after the call to SETEOF()) are still returned to the node below. Notably, if further PUSHROW() calls occur after SETEOF(true) is called, these rows are returned as well. Also note that after SETEOF() is called, the post-group expression and the post expression are still executed.
The ROWESTIMATE() function is used by data jobs to estimate the number of records that will be output from this step.
If you remove the hidden parameter from the integer count declaration, integers 1–100 are output.
Another approach to solving this problem is shown in the following example:

Exercise 2: Apply Your Expression to a Subset of Your Code By Filtering Out Rows of Data.

By setting the return value to true or false, you can use this approach as a filter to select which rows go to the next step.
Note: If you always return false, you get no output and your expression enters an infinite loop.
// Pre-Expression 
integer counter 
counter=0 
integer subset_num 

subset_num=50 

// Expression 
if counter < subset_num 
     begin 
         counter=counter + 1 
     end 
else 
     return true 

Initializing and Declaring Variables

As an expression is evaluated, each row updates with the values of the fields in the expression. This can lead to re-initialization of certain variables in the expression. You might want to initialize a variable only once and then use its value for the rest of the expression script.

Exercise: How do I initialize a variable just one time and not with each iteration of a loop?

Declare the variable in the pre-expression step, and it is initialized only once before the expression process takes over.

Saving Expressions

The following exercise explains the steps required to save your expressions.

Exercise: How do I save my expressions?

You can save an expression without saving the entire data job. Click Save. Your expression is saved in an .exp text file format that you can load using Load.

Counting Records

The following exercises explain expressions used for record count.

Exercise 1: How do I count the number of records in a table using the EEL?

In this example, a connection is made to the Contacts table in the DataFlux sample database, and output to an HTML report. For more information about connecting to a data source and specifying data outputs, refer to the DataFlux Data Management Studio online Help.
Define an integer type in the pre-expression step that contains the count.
// Pre-Expression 
// Declare and initialize an integer 
// variable for the record count 
integer recordCount 
recordCount=0 

// Expression 
// Increment recordCount by one 
recordCount=recordCount+1 
The value of RECORDCOUNT increases in increments of one until the final count is reached. If you want to increase the count for only those values that do not have a null value, enter the following in the expression:
// Check if the value is null 
if(NOT isnull(`address`) ) then 
     recordCount=recordCount+1 
In this example, the value RECORDCOUNT is updated after each row iteration.
Note: Field names must be enclosed in grave accents (ASCII `) rather than apostrophes (ASCII ').

Exercise 2: How do I see the final count of the records in a table instead of seeing it incremented by one on every row?

Declare a count variable as hidden. In the post-expression step, assign the value of count to another field that you want to display in the output (FINALCOUNT). Using PUSHROW(), add an extra row to the output to display FINALCOUNT. Add the final row in the post-processing step, so that FINALCOUNT is assigned only after all of the rows are processed in the expression step.
Here is the EEL code:
// Preprocessing hidden integer count count=0 

// Expression 
if(NOT isnull(`address`) ) then 
     count=count+1 

// Post Processing 
// Create a variable that will contain 
// the final value and assign it a value 
integer finalCount 

finalCount=count 

// Add an extra row to the output 
pushrow() 
When you enter this code and then run the code, the last row should display the total number of records in the table that are not null.

Exercise 3: How do I get just one row in the end with the final count instead of browsing through a number of rows until I come to the last one?

A simple way to do this is to return false from the main expression. The only row that is output is the one that was created with PUSHROW().
Or you can devise a way to indicate that a row is being pushed. The final row displayed is an extra pushed row on top of the stack of rows that is displayed. Therefore, you can filter all the other rows from your view except the pushed row.
To indicate that a row is pushed on your expression step, select Pushed status field and enter a new name for the field.
Once you indicate with a Boolean field whether a row is pushed or not, add another expression step that filters rows that are not pushed:
// Preprocessing 
hidden integer count 
count=0

// Add a boolean field to indicate 
// if the row is pushed 
boolean pushed  

// Expression 
if(NOT isnull(`address`) ) then 
     count=count+1 

// Name the pushed status field "pushed" 
if (pushed) then  
     return true 
else
     return false 

// Post Processing 
integer finalCount 
finalCount=count  

pushrow()

Debugging and Printing Error Messages

The following exercise explains how to print error messages and find debugging information.

Exercise: Is there a way to print error messages or to get debugging information?

You can use the PRINT() function that is available to print messages. When previewing output, these messages print to the Log tab.
In a previous example of calculating the number of records in a table, in the end, you can output the final count to the statistics file. In the post-processing section, you would use the following.
// Post Processing

// Integer to have the final count
integer finalCount

finalCount=count

// Add one extra row for post processing
pushrow()

// Print result to file
print('The final value for count is: '& finalCount)

Creating Groups

Expressions provide the ability to organize content into groups. The EEL has built-in grouping functionality that contains this logic. Once data is grouped, you can use other functions to perform actions on the grouped data. The use of grouping in EEL is similar to the use of the Group By clause in SQL.

Exercise 1: Can EEL group my data and then count the number of times each different entry occurs?

Yes. For example, you can count the number of different states that contacts are coming from, using the contacts table from a DataFlux sample database.

Exercise 2: How can I count each state in the input so that "NC", "North Carolina", and "N Carolina" are grouped together?

A convenient way to accomplish this is to add an expression node or a standardization node in the Data Job Editor, where you can standardize all entries prior to grouping.
Building on the previous example, add a Standardization step:
  1. In Data Job Editor, click Quality.
  2. Double-click the Standardization node.
  3. In the Standardization Properties dialog box, select State and specify the State/Province (Abbreviation) definition. This creates a new field called STATE_Stnd.
  4. Click Additional Outputs and select All.
  5. Click OK.
  6. In the Standardization Properties dialog box, click OK.
  7. In the Expression Properties dialog box, click Grouping. Make sure that grouping is now by STATE_Stnd and not STATE.
  8. Click OK.
The STATECOUNT now increments by each standardized state name rather than by each permutation of state and province names.

Exercise 3: How do I group my data and find averages for each group?

To illustrate how this can be done, use sample data.
  1. Connect to a Data Source.
    1. Connect to the Purchase table in the DataFlux sample database.
    2. In the Data Source Properties dialog box, click Add All.
    3. Find the Field Name for ITEM AMOUNT. Change the Output Name to ITEM_AMOUNT.
  2. Sort the Data. Now that you have connected to the Purchase table, sort on the data field that you use for grouping. In this case, sort by DEPARTMENT.
    1. In the Data Job Editor, click Utilities.
    2. Double-click Data Sorting. This adds a Data Sorting node.
    3. In the Data Sorting Properties dialog box, select DEPARTMENT and set the Sort Order to Ascending.
    4. Click OK.
  3. Create Groups. To create groups out of the incoming data, add another Expression node to the job after the sorting step.
    1. In the Expression Properties dialog box, click Grouping. The following three tabs are displayed: Group Fields, Group Pre-Expression, and Group Post-Expression.
    2. On the Group Fields tab, select DEPARTMENT.
    3. On the Group Pre-Expression tab, declare the following fields, and then click OK:
      // Group Pre-Expression
      // This variable will contain the total
      // sales per department
      real total
      total=0
      
      // This variable will keep track of the
      // number of records for each department
      integer count
      count=0
      
      // This variable will contain the
      // running average total
      real average
      average=0
    4. On the Expression tab, update the variables with each upcoming new row, and then click OK:
      // Expression
      // increase the total sales
      total=total+ITEM_AMOUNT
      
      // increase the number of entries
      count=count+1
      
      // error checking that the count of entries is not 0
      if count !=0 then
           begin
                average=total/count
                average=round(average,2)
           end
When you preview the Expression node, you should see the following in the last four columns:
Department
Total
Count
Average
1
3791.7
1
3791.7
1
6025.4
2
3012.7
1
7294.5
3
2431.5
1
11155.2
4
2788.8
...
...
...
...

Retrieving and Converting Binary Data

EEL provides the ability to retrieve data in binary format. This section describes how to retrieve and convert binary data in big-endian or little-endian formats, as well as mainframe and packed data formats.

Exercise 1: How do I retrieve binary data in either big-endian or little-endian format?

To retrieve binary data, use the IB() function. It also determines the byte order based on your host or native system. The syntax is:
real = ib(string, format_str)
where:
  • string: The octet array containing binary data to convert.
  • format_str: The string containing the format of the data, expressed as w.d. The width (w) must be between 1 and 8, inclusive, with the default as 4. The optional decimal (d) must be between 0 and 10, inclusive.
The w.d formats and informats specify the width of the data in bytes. The optional decimal portion specifies an integer which represents the power of ten by which to divide (when reading) or multiply (when formatting) the data. For example:
//Expression
//File handler to open the binary file
file input_file
//The binary value to be retrieved
real value
//The number of bytes that were read
integer bytes_read
//4-byte string buffer
string(4) buffer
input_file.open("C:\binary_file", "r")
//This reads the 4 byte string buffer
bytes_read=input_file.readbytes(4, buffer)
//The width (4) specifies 4 bytes read
//The decimal (0) specifies that the data is not divided by any power of ten
value = ib(buffer,"4.0")

Exercise 2: How do I force my system to read big-endian data regardless of its endianness?

To force your system to read big-endian data, use the S370FIB() function. The syntax is:
real = s370fib(string, format_str)
where:
  • string: The octet array containing IBM mainframe binary data to convert.
  • format_str: The string containing the w.d format of the data.
Use this function just like the IB() function. This function always reads binary data in big-endian format. The S370FIB() function is incorporated for reading IBM mainframe binary data.

Exercise 3: How do I read little-endian data regardless of the endianness of my system?

Currently, there are no functions available for this purpose.

Exercise 4: How do I read IBM mainframe binary data?

To read IBM mainframe binary data, use the S370FIB() function, described in Exercise 2.

Exercise 5: How do I read binary data on other non-IBM mainframes?

Currently, there are no functions available for this purpose.

Exercise 6: Is there support for reading binary packed data on IBM mainframes?

To read binary packed data on IBM mainframes, use the function S370FPD(). The syntax is:
real = s370fpd(string, format_str)
where:
  • string: The octet array containing IBM mainframe packed decimal data to convert.
  • format_str: The string containing the w.d format of the data.
This function retrieves IBM mainframe-packed decimal values. The width (w) must be between 1 and 16, inclusive, with the default as 1. The optional decimal (d) must be between 0 and 10, inclusive. This function treats your data in big-endian format.

Exercise 7: How do I read non-IBM mainframe packed data?

To read non-IBM mainframe packed data, use the function PD(). The syntax is:
real = pd(string, format_str) 
where:
  • string: The octet array containing IBM mainframe binary data to convert.
  • format_str: The string containing the w.d format of the data.

Converting Binary Data to a Certain Format

Just as it is possible to retrieve data in a special binary format, it is also possible to format data to a special binary format.

Exercise 8: How do I format binary data to the native endianness of my system?

To format binary data, use the FORMATIB() function. The syntax is:
integer = formatib(real, format_str, string)
where:
  • real: The numeric to convert to a native endian binary value.
  • format_str: The string containing the w.d format of the data.
  • string: The octet array in which to place the formatted native endian binary data. returns:
  • integer: The byte length of formatted binary data.
This function produces native endian integer binary values. The width (w) must be between 1 and 8, inclusive, with the default as 4. For example:
//Expression
//The byte size of the buffer that contains the content
real format_size
//The real type number
real number
//The real number that is retrieved
real fib_format
number=10.125
//The buffer that contains the formatted data
string(4) buffer
format_size= formatib(number, "4.3", buffer
//4.3 is to specify 4 bytes to read the entire
//data and 3 to multiply it by 1000
//The reason to multiply it by a 1000 is to divide it later by 1000
//To restore it back to a real number
fib_format= ib(buffer, "4.3")
//Verify that the formatting worked
//Fib_format should be 10.125

Exercise 9: How do I change to other formats?

To change to other formats, use the following functions:
Non-IBM mainframe packed data
integer = formatpd(real, format_str, string)
where:
  • real: The numeric to convert to a native-packed decimal value.
  • format_str: The string containing the w.d format of the data.
  • string: The octet array in which to place the formatted native-packed decimal data.
returns:
  • integer: The byte length of formatted packed decimal data.
IBM mainframe binary data
integer = formats370fib(real, format_str, string)
where:
  • real: The numeric to convert to an IBM Mainframe binary value.
  • format_str: The string containing the w.d format of the data.
  • string: The octet array in which to place the formatted IBM mainframe binary data.
returns:
  • integer: The byte length of formatted binary data
IBM mainframe packed decimal data
integer = formats370fpd(real, format_str, string)
where:
  • real: The numeric to convert to an IBM mainframe-packed decimal value.
  • format_str: The string containing the w.d format of the data.
  • string: The octet array in which to place the formatted IBM mainframe-packed decimal data.
returns:
  • integer: The byte length of formatted packed decimal data.

Supporting COBOL

Using expressions, it is possible to read binary data in specified COBOL COMP, COMP-3, and COMP-5 data formats. The following examples demonstrate how to do this.

Exercise 1: How do I read native endian binary data for COBOL?

To read native endian binary data, use the PICCOMP() function. The syntax is:
real piccomp(string, format_str)
where:
  • string: The octet array containing COBOL formatted packed decimal data to convert.
  • format_str: The string containing the PIC 9 format of the data.
The PICCOMP() function determines the number of bytes (2, 4, or 8) to consume by comparing the sum of the 9s in the integer and fraction portions to fixed ranges. If the sum is less than 5, then 2 bytes are consumed. If the sum is greater than 4 and less than 10, then 4 bytes are consumed. If the sum is greater than 9 and less than 19, then 8 bytes are consumed. For example:
//Expression
//file handler to open files
File pd
integer rc
string(4) buffer
real comp
if (pd.open("binary_input.out", "r")) begin
rc = pd.readbytes(4, buffer)
if (4 == rc) then
comp = piccomp(buffer, "S9(8)")
pd.close()
end
In the preceding example, because of the format of the string is S9(8), 4 bytes were consumed. Notice that all of the COBOL data functions support a PIC designator of the long form:
[S][9+][V9+] (ex: S99999, 99999V99, S999999V99, SV99)
Or of the shortened count form:
[S][9(count)][V9(count)] (ex: S9(5), 9(5)v99, S9(6)v9(2), sv9(2))

Exercise 2: How do I read packed decimal numbers?

To read packed decimal numbers, use the PICCOMP3() function. The syntax is:
real piccomp3(string, format_str)
where:
  • string: The octet array containing COBOL-formatted packed decimal data to convert.
  • format_str: The string containing the PIC 9 format of the data.
The PICCOMP3() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions and adding 1. If the new value is odd, 1 is added to make it even. The result is then divided by 2. As such, S9(7) would mean there are 4 bytes to consume. The packed data is always in big-endian form.
The PICCOMP3() function is used the same as the PICCOMP() function. For an example of the PICCOMP3() function, see Exercise 1.

Exercise 3: How do I read signed decimal numbers in COBOL format?

To read signed decimal numbers, use the PICSIGNDEC() function. The syntax is:
real picsigndec(string buffer, string format_str, boolean ebcdic, boolean trailing) 
where:
  • string buffer: The octet array containing a COBOL-formatted signed decimal number to convert.
  • string format_str: The string containing the PIC 9 format of the data. The default format_str is S9(4).
  • Boolean EBCDIC: The Boolean when set to nonzero indicates the string is EBCDIC. The default EBCDIC setting is false.
  • Boolean trailing: The Boolean when set to nonzero indicates the sign is trailing. The default trailing setting is true.
The PICSIGNDEC() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions of format_str. For example:
//Expression
//file handler to open files
file pd
integer rc
string(6) buffer
real comp
if (pd.open("binary_input.out", "r")) begin
rc = pd.readbytes(6, buffer)
if (4 == rc) then
comp = picsigndec(buffer, "S9(4)V99",1,1) pd.close()
end

Formatting

It is also possible to format data to a specific COBOL format, as demonstrated by the following exercises:

Exercise 4: How do I format from a real to COBOL format?

To format from a real to a COBOL format, use the FORMATPICCOMP() function. The syntax is:
integer = formatpiccomp(Real number,string format_str, string result)
where:
  • real number: The numeric to convert to a COBOL native endian binary value.
  • string format_str: The string containing the PIC 9 format of the data.
  • string result: The octet array in which to place the COBOL-formatted native endian binary data.
returns:
  • integer: The byte length of formatted binary data.
The FORMATPICCOMP() function does the reverse of PICCOMP(). As with the PICSIGNDEC() function, the FORMATPICSIGNDEC() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions. For example:
//Expression
real comp
comp = 10.125
integer rc
rc = formatpiccomp(comp, "s99V999", buffer)
//The string buffer will contain the real value comp formatted to
platform COBOL COMP native endian format. ??///

Exercise 5: What is the list of functions available for COBOL formatting?

The syntax for a COBOL-packed decimal value is:
integer = formatpiccomp3(Real number, string format_str, string result)
where:
  • real number: The numeric to convert to a COBOL packed decimal value.
  • string format_str: The string containing the PIC 9 format of the data.
  • string result: The octet array in which to place the COBOL formatted packed decimal data.
returns:
  • integer: The byte length of formatted packed decimal data.
The syntax for a COBOL-signed decimal value is:
integer = formatpicsigndec(real number, string format_str, string buffer, 
boolean ebcdic, boolean trailing)
where:
  • real number: The numeric to convert to a COBOL-signed decimal value.
  • string format_str: The string containing the PIC 9 format of the data.
  • string buffer: The octet array in which to place the COBOL-formatted packed decimal data.
  • Boolean EBCDIC: The Boolean when nonzero indicates to format in EBCDIC.
  • Boolean trailing: The Boolean when nonzero indicates to set the sign on the trailing byte.
returns:
  • integer: The byte length of the formatted signed decimal.
The COBOL-format functions are used the same as the FORMATPICCOMP() function. For an example of the COBOL-format functions, see Exercise 4.

Using Array Functions

This section contains additional information about arrays, including:

Creating an Array

The following exercise explains how to create an array.

Exercise: How do I create an array and provide values for the items in the array?

To declare an array, use the reserved key word array.
string array variable_name
integer array variable_name
boolean array variable_name
date array variable_name
real array variable_name
For example:
// declare an array of integer types
integer array integer_list

// set the size of the array to 5 integer_list.dim(5)

// the index that will go through the array
integer index
index=0

// Set the values of the items inside the
// array to their index number
for index=1 to 5
    begin
        integer_list.set(index, index);
    end

Retrieving Elements from an Array

This exercise demonstrates how to retrieve elements from an array.

Exercise: How do I retrieve elements from an array?

To retrieve elements from an array, use the following example; it builds on the previous example:
integer first
integer last

// Getting the first item from integer array
first=integer_list.get(1);
// Getting the last item from integer array
last=integer_list.get(5)

Changing an Array Size

The following exercise explains how to change the size of an array.

Exercise: How do I change the size of an array?

To change the size of an array, use the DIM() function. For example:
// array is originally initialized to 5
string array string_container
string_container.dim(5)
...
...
// the array is sized now to 10
string_container.dim(10)

Determining an Array's Size

The following exercise shows how to determine the size of an array.

Exercise: How do I determine the size of an array?

To determine the size of an array, use the DIM() function. Remember that the DIM() function is also used to set the size of an array. If no parameter is specified, the array size does not change.
For example:
// Expression
integer array_size string array array_lister
...
...

// after performing some operations on the array
// array_size will then contain
// the size of the array
array_size=array_lister.dim()

Finding Common Values between Columns Using Arrays

The next exercise shows how to find common values between columns.

Exercise: How do I find out if entries in one column occur in another column regardless of row position and number of times they occur?

One way to address this problem is to create two arrays for storing two columns. Then, check if the values in one array exist in the other array. Find those values that match and store them in a third array for output.
Create a Data Input node as Text File Input, and set the text file to C:\arrayTextDocument.txt in Data Jobs. Begin with the following text in the file:
c:\arrayTextDocument.txt
A ID
B ID
0
1
1
2
3
4
5
6
6
0
Create an Expression node, and declare the following variables in the pre-expression step:
// Pre-Expression
// This is where we declare and
// initialize our variables.
hidden string array column_A
hidden string array column_B
hidden string array column

hidden integer column_A_size
column_A_size=1
column_A.dim(column_A_size)

hidden integer column_B_size
column_B_size=1
column_B.dim(column_B_size)

hidden integer commun_size
commun_size=1
commun.dim(commun_size)
All the variables are hidden and are not displayed on the output. All the arrays are defined in the beginning to be of size 1. Later, these arrays will be expanded to accommodate the number of rows that are added.
// Expression

// Name your First_Column field as you need
column_A.set(column_A_size, `A_ID`)

column_A_size=column_A_size+1
column_A.dim(column_A_size)

// Name the Second_Column field as you need
column_B.set(column_B_size, `B_ID`)

column_B_size=column_B_size+1
column_B.dim(column_B_size)
In this step, we retrieve input into the arrays and expand the size of the arrays as necessary. The size of the array might become quite large depending on the size of the column, so it is recommended you use this example with small tables.
// Post Expression
// This is the step where most of the
// logic will be implemented

// index to iterate through column_A 
hidden integer index_column_A

// index to iterate through column_B
hidden integer index_column_B

// index to iterate through commun array
hidden integer index_commun

// index to display the commun values that were found
hidden integer commun_display_index

// string that will contain the items
// from column A when retrieving hidden string a

// string that will contain the items
// from column B when retrieving hidden string b

// String that will contain the contents of the
// commun array when retrieving
hidden string commun_content

// This boolean variable
// is to check if a commun entry has already
// been found. If so, don't display it again
hidden boolean commun_found

// This is the variable
// that will display the common entries in the end
string commun_display

// Retrieves the entries in column A
for index_column_A=1 to column_A_size Step 1
    begin
        a=column_A.get(index_column_A)
        for index_column_B=1 to column_B_size Step 1
            begin

                b=column_B.get(index_column_B)

                // Compare the entries from column A with
                // the entries from column B
                if(compare(a,b)==0)
                     begin
                // Check if this entry was already found once
                commun_found=false
                   for index_commun=1 to commun_size Step 1
                       begin
                            commun_content=commun.get(index_commun)
                            if(compare(commun_content,a)==0) then
                                 commun_found=true
                       end

                   // It is a new entry. Add it to the
                   // commun array and increment its size
                   if(commun_found==false)
                        begin
                             commun.set(commun_size,a)
                             commun_size=commun_size+1
                             commun.dim(commun_size)
                        end
                end
end
    end

// Display the contents of the commun array
// to the screen output
for commun_display_index=1 to commun_size Step 1
     begin
          pushrow()
          commun_display=commun.get(commun_display_index)
     end
If you want to see the output limited to the common values, add another Expression node and the following filtering code:
// Expression
if(isnull(`commun_display`)) then
     return false
else
     return true

Using Blue Fusion Functions

Once a Blue Fusion object is defined and initialized, the function methods listed can be used within the Expression node. The following exercises demonstrate how the Blue Fusion object methods can be used in the Expression node.

Exercise 1: How do I start a Blue Fusion instance and load a QKB?

To start a Blue Fusion instance and load a QKB, add the following in the Pre-Processing tab:
// Pre-processing

// defines a bluefusion object called bf
bluefusion bf;

// initializes the bluefusion object bf
bf = bluefusion_initialize()

// loads the English USA Locale
bf.loadqkb("ENUSA");
To load other QKBs, refer to their abbreviation. Go to the DataFlux Data Management Studio Administration riser bar and click Quality Knowledge Base to see which QKBs are available for your system.

Exercise 2: How do I create match codes?

To create match codes, after you initialize the Blue Fusion object with a QKB in the Pre-Processing tab, enter the following expressions:
// Expression

// define mc as the return string that contains the match code
string mc

// define the return code ret as an integer
integer ret

// define a string to hold any error message that is returned,
string error_message
// generate a match code for the string Washington D.C.,
// using the City definition at a sensitivity of 85, and
// put the result in mc
ret = bf.match code("city", 85, "Washington DC", mc);

// if an error occurs, display it; otherwise return a success message
if ret == 0 then
  error_message = bf.getlasterror()
else
  error_message = 'Successful'

Exercise 3: How do I use Blue Fusion standardize?

To use Blue Fusion standardize, enter the following expressions after you initialize the Blue Fusion object in the Pre-Processing tab:
// Expression

// define stdn as the return string that contains the standardization string stdn

// define the return code ret as an integer integer ret

// define a string to hold any error message that is returned string error_message

// standardize the phone number 9195550673,
// and put the result in stnd
ret = bf.standardize("phone", "9195550673", stdn);

//if an error occurs display it; otherwise return a success message, if ret == 0 then
  error_message = bf.getlasterror()
else
  error_message = 'Successful'

Exercise 4: How do I use Blue Fusion identify?

To use Blue Fusion identity, after you initialize the Blue Fusion object in the Pre-Processing tab, enter the following expressions:
// Expression

// define iden as the return string that contains the identification
string iden

// define the return code ret as an integer
integer ret

// define a string to hold any error message that is returned
string error_message
// generate an Ind/Org identification for IBM and put
// the result in iden
ret = bf.identify("Individual/Organization", "IBM", iden);

//if an error occurs display it; otherwise return a success message,
if ret == 0 then
  error_message = bf.getlasterror()
else
  error_message = 'Successful'

Exercise 5: How can I perform gender analysis?

To perform gender analysis, after you initialize the Blue Fusion object in the Pre-Processing tab, enter the following expressions:
// Expression

// define gend as the return string that contains the gender
string gend

// define the return code ret as an integer
integer ret

// define a string to hold any error message that is returned
string error_message
// generate a gender identification for Michael Smith,
// and put the result in gend
ret = bf.gender("name","Michael Smith",gend);

// if an error occurs display it; otherwise return a success message,
if ret == 0 then
  error_message = bf.getlasterror()
else
  error_message = 'Successful'

Exercise 6: How can I do string casing?

To perform string casing after you initialize the Blue Fusion object in the Pre-Processing tab, enter the following expressions:
// Expression

// define case as the return string that contains the case
string case

// define the return code ret as an integer integer ret

// define a string to hold any error message that is returned
string error_message

// convert the upper case NEW YORK to propercase
ret = bf.case("Proper", 3, "NEW YORK",case);

// if an error occurs display it; otherwise return a success message,
if ret == 0 then
  error_message = bf.getlasterror()
else 
  error_message = 'Successful'

Exercise 7: How can I do pattern analysis?

To perform pattern analysis after you initialize the Blue Fusion object in the Pre-Processing tab, enter the following expressions:
//Expression 

//define pattern as the return string 
string pattern 

//define the return code ret as an integer 
integer ret 

// define a string to hold any error message that is returned 
string error_message 

// analyze the pattern 919-447-3000 and output the result 
// as pattern 
ret = bf.pattern("character", "919-447-3000", pattern); 

// if an error occurs display it; otherwise return a success message, 
if ret == 0 then 
  error_message = bf.getlasterror() 
else 
  error_message = 'Successful' 

Using Date and Time Functions

In this section, you will find additional information about date and time functions, including:

Finding Today's Date

The following exercise describes out to find the values for today’s date.

Exercise: How do I find the year, month, and day values for today's date?

To determine the parts of the current date, use the TODAY() function.
date today()
The following function returns the current date and time:
// Expression 
date localtime 
localtime=today() 

Formatting a Date

The following exercises show now to format dates.

Exercise 1: What formats can a date have?

Dates should be in the format specified by ISO 8601 (YYYY-MM-DD hh:mm:ss) to avoid ambiguity. Remember that date types must start with and end with the # sign. For example:
Date only:
// Expression date dt 
dt=#2007-01-10#  
//Jan 10 2007 
Date with time:
// Expression date dt 
dt=#2007-01-10 12:27:00#  
//Jan 10 2007 at 12:27:00 

Exercise 2: How do I format the date?

To specify a format for the date in EEL, use the FORMATDATE() function:
string formatdate(date, string) 
The FORMATDATE() function returns a date formatted as a string. For example:
// Expression 
// all have the same output until formatted explicitly 
date dt  
dt=#2007-01-13# 

string formata 
string formatb 
string formatc 
formata=formatdate(dt, "MM/DD/YY") // outputs 01/13/07 
formatb=formatdate(dt, "DD MMMM YYYY") // outputs 13 January 2007 
formatc=formatdate(dt, "MMM DD YYYY") // outputs Jan 13 2007 

Extracting Parts from a Date

To extract parts of a date, use the following exercise.

Exercise: How do I get individual components out of a date?

To extract parts of a date, use the FORMATDATE() function. For example:
// Expression 
date dt 
dt=#10 January 2003# 

string year 
string month 
string day 

// year should be 03 
year=formatdate(dt, "YY") 
// month should be January 
month=formatdate(dt, "MMMM") 
// day should be 10 
day=formatdate(dt, "DD") 
Note that if the date format is ambiguous, EEL will parse the date as MDY.

Adding or Subtracting from a Date

The following exercise explains how to add or subtract days from an existing date.

Exercise: Can I do arithmetic with dates?

EEL offers the ability to add or subtract days from an existing date. For example:
// Expression 
date dt // variable that will contain the date 
dt=#10 January 2003# 
date later 
date earlier 

// add three days to the original date 
later=dt+3 
// subtract three days from the original date 
earlier=dt-3 

Comparing Dates

To compare dates, use the FORMATDATE() function.

Exercise: How do I check if two dates match and are the same?

Convert the date to a string type using FORMATDATE() function and then check for the value of the string. For example:
date dt 

// the variable that will contain the date 
// that we want to compare against 
dt=#1/1/2007# 

// The string variable that will contain the 
// dt date in a string format 
string dt_string 

// The variable that will convert the 
// incoming date fields to string 
dt_string=formatdate(dt, "MM/DD/YY") 
string Date_string 

// Notice that `DATE` is the incoming field 
// from the data source It is written between `` so 
// it does not conflict with the date data type 
Date_string=formatdate(`DATE`, "MM/DD/YY") 

// boolean variable to check if the dates matched 
boolean date_match 

// Initialize the variable to false 
date_match=false 

if(compare(dt_string, Date_string)==0)then 
date_match=true 

Using Database Functions

This section explains using database functions with EEL.

Connecting to a Database

The following exercise explains how to connect to a database.

Exercise: How do I connect to a database?

To connect to a database, use the DBCONNECT() function. This function returns a dbconnection object. The syntax is:
dbconnection test_database
For example:
// Set connection object to desired data source
// Saved DataFlux connections can also be used
test_database=dbconnect("DSN=DataFlux Sample")

Listing Data Sources

Exercise 1: How do I return a list of data sources?

To return a list of data sources as a dbcursor, use the DBDATASOURCES() function.
The following example works with the Contacts table in the DataFlux sample database. Make sure you have some match codes in that table in a field called CONTACT_MATCHCODE. In the step before your expression step, use a match code generation node and have match codes created for some sample names in a text file. This text file is your job input step. Call this new field "Name_MatchCode." This example queries the Contacts table in the DataFlux sample database to see whether there are any names that match the names that you provided in your text file input.
Pre-processing window
// Declare Database Connection Object 
dbconnection db_obj 

// Declare Database Statement Object 
dbstatement db_stmt
// Set connection object to desired data source 
// Saved DataFlux connections can also be used 
db_obj=dbconnect("DSN=DataFlux Sample") 

// Prepare the SQL statement and define parameters 
// to be used for the database lookup 
db_stmt=db_obj.prepare("Select * from Contacts where Contact = ?") 
db_stmt.setparaminfo(0,"string",30)
Expression window
// Declare Database Cursor and define fields returned from table 
dbcursor db_curs 
string Database_ID 
string COMPANY 
string CONTACT 
string ADDRESS 

// Set parameter values and execute the statement 
db_stmt.setparameter(0,Name) 
db_curs=db_stmt.select() 

// Move through the result set adding rows to output 
while db_curs.next() 
begin 
     Database_ID=db_curs.valuestring(0) 
     COMPANY=db_curs.valuestring(1) 
     CONTACT=db_curs.valuestring(2) 
     ADDRESS=db_curs.valuestring(3) 
     pushrow() 
end 
db_curs.release() 

// Prevent the last row from occurring twice 
return false

Using Encode and Decode Functions

Exercise 1: How do I transcode a given expression string from its native encoding into the specified encoding?

To transcode an expression, use the encode function and decode function. For example:
//Expression 
string expression_string 
expression_string="Hello World" 
string decode_string 
string encode_string 
integer decode_return 
integer encode_return  
decode_return = decode("IBM1047", expression_string, decode_string) 
//Decode to IBM1047 EBCDIC 
encode_return = encode("IBM1047",decode_string,encode_string)  
//Encode string should be "Hello World" 

Exercise 2: What are the available encodings?

Refer to Appendix B: Encoding for a list of available encodings.

Using File Functions

File Operations

This section explains file operations in the EEL.

Exercise 1: How do I open a file?

To open a file in the EEL, use this expression:
// Expression 
File f 
open("C:\filename.txt","r")
The second parameter to the file object indicates the mode for opening the file (read, write, or read/write).

Exercise 2: How do I read lines from a file, treating each line as a single row from a data source?

After opening a file, use the following code to read a string line of input:
// Pre-Expression 
File f 
string input 
f open("C:\filename.txt", "rw")

// Expression 
input=f.readline() // Post Expression 
f close()
Make sure that you select Generate rows when no parent is specified. The file cursor advances one line in the text file for each row of input from the data source.

Exercise 3: How do I read lines from a text file, and create one output line for each line in the text file?

Write a WHILE loop that iterates through each line in the file with every row. For example, consider the following text files:
Example text files
Use the following expression:
// Expression 
File f 
File g 
string input 
input='hello' 
f open("C:\filename.txt")
g open("C:\filepet.txt")

while (NOT isnull(input)) 
     begin 
input=f.readline() 
print('The value of input is ' & input) 
input=g.readline() 
print('The value of input is ' & input) 
end  

seteof() 

// Post Expression 
f close()
This prints the contents of the two files to the log. If you preview the job, you see null for the input string since, the input string has a null value at the completion of the loop.
A good way to see how this example works in your job is to add an expression step that sets the end of file:
// Expression 
seteof() 
The preview pane shows the value of input as null, but the log pane shows each of the possible values listed in the filename.txt and filepet.txt files.

Exercise 4: How do I write to a file?

To write to a file, use the WRITELINE() function in the file object. For example:
// Expression File f 
f open("C:\filename.txt", "w")
f writeline("Hello World ")

// Post Expression
f close()
Note: This function overwrites the current contents of your text file.

Exercise 5: How do I move from one position to another in a file?

To move from one position in a file to another, there are three available functions: SEEKBEGIN(), SEEKCURRENT(), and SEEKEND().
The SEEKBEGIN() function sets the file pointer to a position starting at the beginning of the file. It returns true on success. Otherwise, it returns false. The parameter specifies the position:
seekbegin([position])
The SEEKCURRENT() function sets the file pointer to a position starting at the current position. It returns true on success. Otherwise, it returns false. The parameter specifies the number of bytes from the current position:
seekcurrent([position])
The SEEKEND() function sets the file pointer to a position starting at the end of the file. It returns true on success. Otherwise, it returns false. The parameter specifies the position from the end of the file:
seekend([position]) 
All of these functions receive as a parameter the number of bytes to move from the current position in the file. Specify 0 in the SEEKBEGIN() or the SEEKEND() functions to go directly to the beginning or the end of the file. For example: In order to append to the end of a file that you would select Generate rows when no parent is specified, and enter:
// Expression File f
f open("C:\Text_File\file_content.txt", "rw")
f seekend(0)
f writeline("This is the end ")
seteof()
This example adds the text "This is the end" to the end of the file. If you move to the beginning of the file, use the WRITELINE() function to overwrite existing content.
Close the file with F.CLOSE() in the post-processing step:
// Post Processing 
f close()

Exercise 6: How do I copy the contents of a file to another file?

To copy the contents of one file to another, use the Boolean function, COPYFILE(). This function takes the originating filename as the first parameter and the destination filename as the second parameter. The destination file can be created or amended by this function. For example:
// Expression 
string names 
string pets 
names="C:\filename.txt" pets="C:\filecopy.txt" 

copyfile(names, pets) 

seteof() 

Exercise 7: How do I read or write a certain number of bytes from a text file?

To read a specified number of bytes from a text file, use the READBYTES() function:
string input File a 
a.open("C:\filename.txt", "r")
a.readbytes(10, input)
To write a specified number of bytes to a text file, use the WRITEBYTES() function:
string input
input="This string is longer than it needs to be." File b
b.open("C:\filename.txt", "rw")
b.writebytes(10, input)
By overwriting existing data, this expression produces the following:
Example text file

Manipulating Files

Exercise 1: How do I retrieve information about the file?

To determine whether a file exists, use the FILEEXISTS() function:
boolean fileexists(string)
The FILEEXISTS() function returns true if the specified file exists. The string parameter is the path to the file.
To find the dates a file was created and modified, use the FILEDATE() function:
date filedate (string, boolean)
The FILEDATE() function returns the date on which a file was created. If the second parameter is true, it returns the modified date.
For example:
// Expression 
boolean file_test 
date created 
date modified  

file_test=fileexists("C:\filename.txt") 
created=filedate("C:\filename.txt", false) 
modified=filedate("C:\filename.txt", true) 

seteof() 
Note: If the FILEDATE() function returns a null value but the FILEEXISTS() function returns true, you most likely entered the file path incorrectly.
To get the size of a file, you can use the OPEN(), SEEKEND(), and POSITION() functions. The size of the file is returned in bytes. For example:
// Expression
File f
integer byte_size

f.open("C:\filename.txt", "rw")
f.seekend(0)

// The integer variable byte_size will have
// the size of the file in bytes
byte_size=f.position()

Exercise 2: Is it possible to perform operations such as renaming, copying, or deleting a file?

Yes. To delete a file, use the DELETEFILE() function:
boolean deletefile(string)
This action deletes a file from the disk. The string parameter is the path to the file.
Note: Use care when using this function. Once you delete a file, it is gone.
To move or rename a file, use the MOVEFILE() function:
boolean movefile(string, string)
For example, the following code moves filename.txt from the root to the Names folder.
boolean newLocation
newLocation=movefile("C:\filename.txt","C:\Names\filename.txt")
seteof()
Note: The directory structure must already be in place for the function to move the file to its new location.

Using Integer and Real Functions

Integers and real types are basic data types in EEL. An integer can be converted to a real type, and a real type value can be converted to an integer. This section focuses on available functions in EEL that work on integers and real types.

Determining Type

Determine the type of a variable before doing calculations.

Exercise: How do I determine whether the variable has a numeric value?

The ISNUMBER() built-in function can be used to determine whether a variable has a numeric value. It takes a variable as a parameter and returns true if the expression is a number. For example:
// Expression
string str
string input
input=8 // although a string, input is coerced into an integer

if(isnumber(`Input`))
     str="this is a number" // input is a number
else 
     str="this is a string"

Assigning Values

The following exercise provides information about assigning values.

Exercise: Can integers and real types have negative values?

Yes, integers and real types are not limited to positive values. Add a negative sign in front of the value to make it negative. For example:
// Expression
integer positive
integer negative
positive=1
negative=-1 // negative is equal to -1

Casting

The need to coerce from one type to another can happen frequently with integers, real data types, and string types. The user does not have to perform any task; EEL handles the casting automatically.

Exercise 1: Can I assign the value of a real data type to an integer? What about assigning an integer to a real data type?

Yes, integers and real types can be changed from one type to the other. To change the type, assign one type to the other.

Exercise 2: Is it possible to combine integers and real data types with strings?

Yes, a string type can be changed to an integer or a real type. For example:
integer x 

// string is converted to value 10 
// x will have the value 15 
x=5 + "10" 

Exercise 3: Is it possible to assign the integer value zero to a Boolean to represent false?

In EEL, Boolean values can have an integer value of zero, which is interpreted as false. Any nonzero integer value is interpreted as true.

Range and Precision

When working with scientific data with either very small or very large values, the range and precision of the integer and real types might be important.

Exercise: What is the range or precision for real and integer values?

Integer types are stored as 64-bit signed quantities with a range of –2 * 10^63 to 2
Real types are stored as high precision values with an approximate precision of 44 digits and a range of 5.0 * 10^-324 to 1.7 * 10^308. Real types are based on the IEEE 754 definition.

List of Operations

In EEL, the following operations can be performed on real and integer types.

Exercise: What operations can I do on real and integer types?

The list of operations for real and integer types includes:
  • Multiplication (*)
  • Division (/)
  • Modulo (%)
  • Addition (+)
  • Subtraction (–)
Currently, it is not possible to perform trigonometric or logarithmic calculations. You can perform exponential calculations using the POW() function:
real pow(real,real)
The POW() function returns a number raised to the power of another number.
// Expression
real exponential

// exponential is 8
exponential=pow(2,3)

Rounding

Integers and real values in EEL can be rounded using the round() function. The second parameter is an integer value that determines how many significant digits to use for the output. A positive value is used to round to the right of the decimal point. A negative value is used to the left of the decimal point.

Exercise: Can integer and real types be rounded?

Yes, by using the ROUND() function. Consider the following code example:
// Expressions
integer integer_value
integer_value=1274
real real_value
real_value=10.126

integer ten
integer hundred
integer thousand

// the value for ten will be 1270
ten=round(integer_value,-1)

// the value for hundred will be 1300
hundred=round(integer_value,-2)

// the value for thousand will be 1000
thousand=round(integer_value,-3)

real real_ten real
real_hundred

// the value for real_ten will be 10.1
real_ten= round(real_value, 1)

// the value for real_hundred will be 10.13
real_hundred=round(real_value, 2)

Using Regular Expression Functions

Using Regular Expressions

For a regular expression (regex) to work, you must first compile. In the Data Job Editor, this is best done in the pre-processing step. Here are some examples.

Exercise 1: How do I find matches within a string?

To find the first match in the string, use the FINDFIRST() function. To find subsequent matches in the string, use FINDNEXT(). For example:
regex r
r.compile("a.c")
if r.findfirst("abcdef")
     print("Found match starting at " & r.matchstart() & " length " &
r.matchlength())

Exercise 2: How do I know whether my regex pattern matches part of my input?

To see whether your regex pattern finds a match in the input string, follow this example:
regex a
boolean myresult

a.compile("a","ISO-8859-7")
myresult=a.findfirst("abc")

Exercise 3: How do I find the regex pattern I want to match?

To find the first instance of the regex pattern that you want to match, follow this example:
integer startingPosition regex r
r.compile("a.c")
if r.findfirst("abcdef")
     startingPosition=r.matchstart()

Exercise 4: How do I replace a string within my regex?

To replace a string, compile the regex and use the replace function as follows:
regex r
r.compile("xyz")
r.replace("abc","def")
This exercise replaces "abc" with "def" within the compiled "xyz."

Using String Functions

Determining Type

The following exercises demonstrate how to determine the data type of a string.

Exercise 1: How do I determine whether an entry is a string?

To determine whether the string is a string type, use the TYPEOF() function:
string typeof(any)
The TYPEOF() function returns the type of data that the expression converts to. For example:
// Expression 
string hello 
hello="hello" 

boolean error 
error=false 

// variable that will contain the type 
string type  
type=typeof(hello) 

// type should be string 
if(type<>"string") then 
    error=true 

Exercise 2: How do I determine whether a string consists of alphabetic characters?

To determine whether a string is made up entirely of alphabetic character, use the ISALPHA() function:
boolean isalpha(any)
The ISALPHA() function returns a value of true if the string is made up entirely of alphabetic characters. For example:
// Expression
string letters
letters="lmnop"
string mixed
mixed="1a2b3c"

string alphatype
alphatype=isalpha(letters) // true
string mixedtype 
mixedtype=isalpha(mixed) // false

Exercise 3: How can I retrieve all values that are either not equal to X or null values?

To retrieve the above stated values, use the ISNULL() function:
boolean isnull(any)
For example:
// Expression 
if State <> "NC" OR isnull(State) 
     return true 
else 
     return false 

Extracting Substrings

Exercise: How do I get substrings from an existing string?

To get substrings, there are three available functions: LEFT(), RIGHT(), and MID().
To return the leftmost characters of a string, use the LEFT() function:
string left(string, integer)
To return the rightmost characters of a string, use the RIGHT() function:
string right(string, integer)
For example:
// Expression 
string greeting 
greeting="Hello Josh and John" 

string hello 
string John 
string inbetween 

hello=left(greeting,5) // "Hello"
John=right(greeting,4) // "John"
inbetween=left(greeting, 10) // "Hello Josh"
inbetween=right(inbetween, 4) // "Josh"
Another approach is to use the MID() function:
string mid(string, integer p, integer n)
The MID() function returns a substring starting at position p for n characters. For example:
string substring
// substring will be the string "Josh"
substring=mid(greeting, 7, 4);

Parsing

Exercise: How do I parse an existing string into smaller strings?

To parse a sting, use the APARSE() function:
integer aparse(string, string, array)
The APARSE() function parses a string into a string array. The number of elements in the array is returned. For example:
// Expression 
string dataflux 
dataflux="Dataflux:dfPower:Architect" 

// An array type to contain the parsed words 
string array words 

// integer to count the number of words 
integer count 

// count will have a value of 3 
count=aparse(dataflux, ":", words) 

string first_word 
first_word=words.get(1) // First word will be "DataFlux" 

string second_word 
second_word=words.get(2) // Second word will be "Data Management"

string third_word 
third_word=words.get(3) // Third Word will be "Studio" 

string last_entry // This will have the last entry. 
last_entry=words.get(count)  
The APARSE() function is useful if you want to retrieve the last entry after a given separator.
Similar to the APARSE() function is the PARSE() function. The syntax for the PARSE() function is:
integer parse(string, string, ...) 
The PARSE() function parses a string using another string as a delimiter. The results are stored starting from the third parameter. It returns the total number of parameters.
You would use the PARSE() function in the following situation:
// Expression 
integer count 

string first 
string second 
string third 

// first contains "DataFlux" 
// second contains "Data Management" 
// third contains "Studio" 
count=parse("DataFlux:Data Management:Studio", ":", first, second, 
third); 
The main difference between the two functions is that APARSE() is suitable for arrays while PARSE() is useful for returning individual strings.

Converting ASCII Characters

EEL has the ability to convert characters to their ASCII values, and to convert ASCII values to characters.

Exercise: Is it possible to convert between ASCII characters and values?

Yes. To convert between ASCII characters and values, use the CHR() and ASC() functions. For example:
// Expression 
integer ascii_value 
string character_content 

ascii_value=asc("a"); // ascii_value is 97  
character_content=chr(97) // returns the letter "a" 
For a complete list of ASCII values, see Appendix A: ASCII Printable Characters.

Manipulating Strings

Frequently, when working with strings, you might want to perform manipulations such as adjusting the case, removing spaces, concatenating strings, and getting the length of a string. EEL has built-in functions to perform these actions.

Exercise 1: How do I concatenate strings?

To concatenate a string, use the "&" symbol. For example:
// Expression 
string Hello 
Hello="Hello "

string World 
World=" World"  

string Hello_World 
Hello_World=Hello & World // outputs "Hello World" 

Exercise 2: How do I get the length of a string and remove spaces?

To get the length of a string, use the LEN() function, and then to remove the spaces, use the TRIM() function.
The LEN() function returns the length of a string:
integer len(string) 
The TRIM() function returns the string with the leading and trailing whitespace removed:
string trim(string)
For example:
// Expression 
string content 
content="       spaces         "

integer content_length 

content=trim(content) // Remove spaces 

// returns 6 
content_length=len(content) 

Exercise 3: How do I convert a string type to lowercase or uppercase?

To convert a string to lowercase or uppercase, use the LOWER() and UPPER() functions.
The LOWER() function returns the string in lowercase:
string lower(string) 
The UPPER() function returns the string in uppercase:
string upper(string) 
This function returns the edit distance between two strings. Specifically, this function returns the number of corrections that would need to be applied to turn one string into the other.
The following examples use these functions:
// Expression 
integer difference 
integer comparison 

string hello 
hello="hello" 

string hey 
hey="hey" 

// comparison is -1 because hello comes before 
hey comparison = compare(hello, hey, true); 

// difference is 3 because there are three different letters 
difference = edit_distance(hello, hey); 

Exercise 2: How do I check if a string matches, or if it is a substring inside another string?

The following built-in EEL functions handle this situation.
The INSTR() function returns the location of one string within another string, stating the occurrence of the string.
boolean instr(string, string, integer) 
The MATCH_STRING() function determines whether the first string matches the second string, which might contain wildcards.
boolean match_string(string, string) 
Search strings can include wildcards in the leading (*ABC) and trailing (ABC*) position, or a combination of the two (*ABC*). Wildcards within a string are invalid (A*BC). Question marks can be used as a wildcard, but can be matched only to a character. For example, AB? matches ABC, not AB. To execute a search for a character that is used as a wildcard, precede the character with a backslash. This denotes that the character should be used literally and not as a wildcard. Valid search strings include: *BCD*, *B?D*, *BCDE, *BC?E, *BCD?, ABCD*, AB?D*, ?BCD*, *B??*, *B\?\\* (will match the literal string AB?\E). An invalid example is: AB*DE. For more complex searches, use regular expressions instead of the MATCH_STRING() function.
Consider the following code example with these functions:
// Expression 
string content 
content="Monday is sunny, Tuesday is rainy & Wednesday is windy" 

string search 
search="*Wednesday is windy" // note the * wildcard 

integer found_first 
integer found_next 

boolean match 

// Check if the search string is in the content match=match_string(content, search) 

if (match) then 
     begin 
          // Will find the first occurrence of day 
          found_first=instr(content, "day", 1) 

          // Will find the second occurrence of day 
          found_next=instr(content, "day", 2)  
       end 

Exercise 3: How do I know when the correct Surviving Record is selected as a survivor?

When comparing integers in the EEL, it is important to use the correct variable type. When using the variable type "string", a string value of "5" is actually greater than the string value of "10". If the values need to be compared as integers, the values must be converted into variables of integer type first. To accomplish this conversion use the TOINTEGER() function.

Replacing Strings

The REPLACE() function replaces the first occurrence of one string with another string and returns the string with the replacement made.
string replace(string, string, string, integer) 
If the fourth parameter is omitted or set to 0, all occurrences are replaced in the string. If the fourth parameter is set to another number, that many replacements are made.
Consider the following example:
// Expression 
string starter 
string replace 
string replaceWith 
string final  

starter="It's a first! This is the first time I came in first place!" 
replace="first" 
replaceWith="second"  

final =replace(starter, replace, replaceWith, 2) 

seteof() 
This example produces the following results:
starter
replace
replaceWith
final
It’s a first! This is the first time I came in first place!
first
second
It's a second! This is the second time I came in first place!

Finding Patterns

It is possible to extract patterns out of strings using EEL. EEL identifies the following as part of a string's pattern: 9 = numbers a = lowercase letters A = uppercase letters

Exercise: How do I get a string pattern?

To determine the string pattern, use the PATTERN() function:
string pattern(string) 
The PATTERN() function indicates whether a string has numbers or uppercase and lowercase characters. It generates a pattern from the input string. For example:
// Expression
string result;
string pattern_string;
pattern_string="abcdeABCDE98765";

// The result will be aaaaaAAAAA99999
result=pattern(pattern_string);

Identifying Control Characters

EEL can identify control characters such as a horizontal tab and line feed.

Exercise: How can I detect control characters in a string?

To detect control characters, use the HAS_CONTROL_CHARS() function.
boolean has_control_chars(string)
The HAS_CONTROL_CHARS() function determines whether the string contains control characters. For a list of control characters, see Appendix A: ASCII Control Characters.

Evaluating Strings

EEL enables you to dynamically select the value of a field.

Exercise: How can I convert field names into values?

To convert field names into values, use the VAREVAL() function.
string vareval(string)
The VAREVAL() function evaluates a string as if it were a variable.
Note: Since it has to look up the field name each time it is called, VAREVAL() is a slow function and should be used sparingly.
In the following example, you have incoming data from three fields: field1, field2, and field3, as shown in the following table.
C:\varevalExample.txt
field_1
field_2
field_3
field_4
field_5
1
Bob Brauer
123 Main St.
Cary
NC
2
Don Williams
4 Clover Ave.
Raleigh
NC
3
Mr. Jim Smith
44 E. Market Street
Wilmington
NC
4
Ms. Amber Jones
300 Chatham Dr.
Durham
NC
5
I Alden
99 A Dogwood Ave.
Apex
NC
You can write a for loop that builds the string ("field" and n), and uses the VAREVAL() function to get the value of the field. For example:
// Pre-expression
string field_number
string field_value

// Expression
hidden integer n
for n=1 to 5
begin
         field_number='field_' & n
         field_value=vareval(field_number)
         n=n+1
         pushrow()
end

// this next statement prevents the last row from showing up twice
return false