Understanding Flow Patterns for Orchestration Jobs

Identify the Jobs to Be Orchestrated

A single orchestration job can run one or more jobs, such as SAS Data Integration Studio jobs, DataFlux Data Management Studio jobs, SAS code files, third-party programs, scripts, and web services. Identify the jobs that you want to combine into an orchestration job. Document the associated paths and other information that you need to access and run these files.

Patterns Overview

Some of the patterns covered in this document are supported, and other patterns are unsupported. The unsupported patterns are included so that you can avoid them.

Supported Patterns

Sequence Pattern

Sequence is a basic usage pattern. The first process executes. When it completes, the second process executes. Then, the job is complete.
This pattern is shown in the following diagram:
Sequence Process Flow
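Orchestration jobs are assembled from nodes rather than written as code, so the following Python sketch is only an analogy for the control flow. The function names process_1, process_2, and run_sequence are hypothetical stand-ins, not part of any product API.

```python
def process_1():
    """Hypothetical upstream process: produces data."""
    return [3, 1, 2]


def process_2(rows):
    """Hypothetical downstream process: consumes what process_1 produced."""
    return sorted(rows)


def run_sequence():
    rows = process_1()      # the first process runs to completion
    return process_2(rows)  # the second process starts only afterwards
```

Because process_2 takes the output of process_1 as its input, the dependency forces the order in which the two run.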
Typically, you sequence processes when one process has a dependency on another. For example, one of the processes could consume data that the other produces. In the diagram, the lower process has a dependency on the upper process.

Decision Pattern

In the decision pattern, Process 1 executes. Then, a decision occurs (possibly related to the outcome of Process 1) and either Process 2 or Process 3 executes before the job completes.
This pattern is shown in the following diagram:
Decision Process Flow
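As a rough illustration only, the decision pattern resembles a simple branch. In this Python sketch, the function run_decision and its parameters are hypothetical, and the decision is assumed to test the outcome of Process 1, which is just one of the possibilities.

```python
def run_decision(process_1, process_2, process_3):
    """Run process_1, then exactly one of process_2 or process_3."""
    outcome = process_1()
    # Assumption: the decision tests the outcome of process_1.
    # A real decision could evaluate some other condition.
    if outcome:
        return process_2()
    return process_3()
```

For example, run_decision(lambda: True, lambda: "p2", lambda: "p3") takes the Process 2 branch.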

Repetition Pattern

In the repetition pattern, multiple processes are connected in a loop. In the first repetition pattern, Process 1 executes, followed by Process 2. Execution then returns to Process 1 and continues indefinitely.
The first repetition pattern is shown in the following display:
Repetition Pattern One
In the second repetition pattern, Process 1 executes, followed by Process 2. After a decision is made, execution either returns to Process 1 or continues to Process 3. In this case, the job ends after Process 3 finishes.
The second repetition pattern is shown in the following display:
Repetition Pattern Two
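The second repetition pattern behaves much like a loop whose exit test runs after the loop body. The Python sketch below is an analogy under that assumption. The names run_repetition and should_repeat are hypothetical.

```python
def run_repetition(process_1, process_2, should_repeat, process_3):
    """Repeat process_1 then process_2 until the decision says to move on."""
    while True:
        process_1()
        process_2()
        if not should_repeat():  # decision node: loop back or continue
            break
    return process_3()           # the job ends after process_3 finishes
```

Removing the decision (so that the loop never breaks) gives the first repetition pattern, which runs until the job is terminated.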

Multiple Dependencies Pattern

Processes that execute sequentially might have dependencies, as mentioned in Sequence Pattern. In this pattern, Process 3 depends on both Process 1 and Process 2, so it is sequenced after them. The dotted lines in the diagram indicate a dependency. Process 1 and Process 2 must both be in a complete state before Process 3 can execute successfully. Ensure that these dependencies are correct so that you can handle looping situations in which dependencies cannot be determined.
The following display shows the multiple dependencies pattern:
Multiple Dependencies Pattern
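One way to check a set of dependencies before building the job is to model them as a graph and compute an order that respects them. The following Python sketch uses the standard library graphlib module (Python 3.9 or later). It is a planning aid under stated assumptions, not a description of how the orchestration engine resolves dependencies. The function name execution_order is hypothetical.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return an order in which each process follows everything it depends on.

    `dependencies` maps a process name to the set of processes it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Process 3 depends on both Process 1 and Process 2.
order = execution_order({"Process 3": {"Process 1", "Process 2"}})
```

In the resulting order, Process 3 always comes last. The relative order of Process 1 and Process 2 is not significant, because neither depends on the other.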

Iterate n Times Pattern

In this case, the iterator is set to iterate n times after Process 1 executes. Then, Process 2 is entered. After it completes, the iterator is incremented and checked against the threshold. If the threshold is not reached, Process 2 executes again. If the threshold is reached, Process 3 is entered.
The following display shows the iterate n times pattern:
Iterate n Times
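The increment-then-check behavior described above can be sketched in Python as follows. This is an analogy only. The names run_iterate_n and count are hypothetical, and the sketch assumes that the threshold check happens after Process 2 completes, as the description states.

```python
def run_iterate_n(process_1, process_2, process_3, n):
    """Run process_1 once, process_2 n times, then process_3."""
    process_1()
    count = 0
    while True:
        process_2()
        count += 1      # the iterator is incremented after process_2 completes
        if count >= n:  # threshold reached: leave the loop
            break
    return process_3()
```

Because the check follows the body, process_2 runs at least once in this sketch, even when n is 0 or 1.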

Iterate over a Collection Pattern

In this pattern, Process 1 generates a set of data rows. The iterator depends on this data set and therefore on Process 1. When the iterator is entered, it fetches the first row and stores the row's value as its own output value. Process 2 then executes. Typically, Process 2 references this output value and uses it as a parameter for some action. When Process 2 has finished, the iterator is entered again, and another row is fetched. If no more rows exist, Process 3 is entered.
The following display shows the iterate over a collection pattern:
Iterate over a Collection Pattern
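In general-purpose code, this pattern corresponds to a loop over the rows that an earlier step produced. The Python sketch below is an analogy only. The name run_iterate_collection is hypothetical, and the loop variable row plays the role of the iterator's output value.

```python
def run_iterate_collection(process_1, process_2, process_3):
    """Run process_2 once per row produced by process_1, then process_3."""
    rows = process_1()    # the iterator depends on this data set
    for row in rows:      # each pass fetches the next row
        process_2(row)    # process_2 uses the row as a parameter
    return process_3()    # entered when no more rows exist
```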

Embedded Iteration Pattern

This pattern is similar to the basic iteration pattern. After Process 2 completes, a new iterator is entered, which executes Process 3 n times. When the new iterator completes, it exits, and control returns to the first iterator, which continues. When the first iterator completes, Process 4 executes.
The following display shows the embedded iteration pattern:
Embedded Iteration Pattern
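Embedded iteration corresponds to a nested loop. In the Python sketch below, which is an analogy only, the names run_embedded, outer_count, and inner_count are hypothetical. The sketch assumes that both iterators count to a fixed number.

```python
def run_embedded(process_2, process_3, process_4, outer_count, inner_count):
    """Nest an inner iterator (process_3) inside an outer one (process_2)."""
    for _ in range(outer_count):      # first iterator
        process_2()
        for _ in range(inner_count):  # new iterator entered after process_2
            process_3()
    return process_4()                # runs after the first iterator completes
```

With outer_count set to 2 and inner_count set to 3, process_2 runs 2 times and process_3 runs 6 times.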

Multiple Paths to Same Target Pattern

In this pattern, either Process 2 or Process 3 is entered after a decision is made. Following that, Process 4 is always entered and the job completes.
The following display shows the multiple paths to the same target pattern:
Multiple Paths to Same Target Pattern

Wait for Event Pattern

Process 1 and “wait for event” both start executing (see Implicit Fork or Join Pattern). The wait for event node does not exit until it receives an event, which could come from inside the job or from an external source. When the event is received, the node exits, and Process 3 runs. The job is complete when both Process 2 and Process 3 are complete or when the job receives a terminate event.
The following display shows the wait for event pattern:
Wait for Event
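The two concurrent branches can be approximated in Python with threads and a threading.Event. This sketch is an analogy under several assumptions: Process 2 is taken to follow Process 1 on the first branch, the terminate event is not modeled, and the names run_wait_for_event and timeout are hypothetical.

```python
import threading


def run_wait_for_event(process_1, process_2, process_3, event, timeout=None):
    """Run two branches at once; the second blocks until `event` is set."""

    def branch_a():
        process_1()
        process_2()  # assumption: process_2 follows process_1

    def branch_b():
        # Block until the event arrives (or the optional timeout expires).
        if event.wait(timeout):
            process_3()

    threads = [threading.Thread(target=branch_a),
               threading.Thread(target=branch_b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # the job is complete when both branches finish
```

The event can be set from inside the job (for example, by one of the processes) or by code outside it that holds a reference to the same Event object.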

Wait for Event Indefinitely Pattern

In this pattern, a job waits for some event such as a file appearing in a directory. Then, the job executes Process 3, which might take some action on that file. After Process 3 runs, control loops back to wait for event. This sequence happens indefinitely until the job is terminated.
The following display shows the wait for event indefinitely pattern:
Wait for Event Indefinitely Pattern
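A loop that blocks on a queue gives a rough picture of this pattern. In the Python sketch below, which is an analogy only, each queued item stands for one event, such as the name of a file that has appeared. The names run_wait_indefinitely and stop are hypothetical, and the stop sentinel stands in for terminating the job.

```python
import queue


def run_wait_indefinitely(events, process_3, stop):
    """Handle each event with process_3 until the stop sentinel arrives."""
    handled = []
    while True:
        item = events.get()  # wait for event: blocks until one arrives
        if item is stop:     # stands in for the job being terminated
            return handled
        handled.append(process_3(item))  # act on the event, then loop back
```

A caller would create a queue.Queue, have some other thread put events on it, and put the sentinel object on the queue to end the loop.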

Fork and Join Pattern

In this pattern, Process 1 executes. Two execution contexts now exist at the fork, and Process 2 and Process 3 run simultaneously (as long as multiple threads are allocated for the job). When join is entered, it waits for both execution contexts to enter. Process 4 is then entered. Fork and join are implemented as a single fork node that has child nodes (Process 2 and Process 3). There is no join node in the implementation.
The following display shows the fork and join pattern:
Fork and Join Pattern
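The fork and join behavior is similar to submitting two tasks to a thread pool and waiting for both results. The following Python sketch uses the standard library concurrent.futures module as an analogy. The names run_fork_join and max_workers are hypothetical, and max_workers stands in for the number of threads allocated for the job.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fork_join(process_1, process_2, process_3, process_4, max_workers=2):
    """Run process_1, fork process_2 and process_3, join, then run process_4."""
    process_1()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_2 = pool.submit(process_2)  # fork: first execution context
        future_3 = pool.submit(process_3)  # fork: second execution context
        # Join: result() blocks until each forked process has finished.
        results = (future_2.result(), future_3.result())
    return process_4(results)
```

With max_workers set to 1, the two forked processes still both run, but one after the other rather than simultaneously.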

Fork Loop and Join Pattern

In this pattern, Process 1 executes and creates a data set that indicates partitions, such as a list of country codes. The fork loop has a parameter that indicates the number of threads. When the fork loop is entered, it creates n instances of its child (in this case, Process 2). It then begins iterating over the data set, handing each entry as a parameter to the next available thread. The thread executes Process 2. When finished, the thread returns to the fork loop for the next row in the data set (as a parameter). In this case, n instances of Process 2 execute simultaneously. When the data set is exhausted and all the threads are done, Process 4 is entered. Fork loop and join are implemented as a single fork loop node (with no join node). The fork loop node’s child in this case would be Process 2.
The following display shows the fork loop and join pattern:
Fork Loop and Join Pattern
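A fixed-size thread pool that maps one function over a list of partitions behaves in a broadly similar way. The Python sketch below is an analogy only. The names run_fork_loop and threads are hypothetical, and threads corresponds to the fork loop's thread-count parameter.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fork_loop(process_1, process_2, process_4, threads):
    """Run process_2 once per partition on up to `threads` workers."""
    partitions = process_1()  # for example, a list of country codes
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each free worker takes the next partition as its parameter.
        # list() waits until every partition has been processed (the join).
        results = list(pool.map(process_2, partitions))
    return process_4(results)
```

For example, with process_1 returning ["US", "FR", "DE"] and threads set to 2, at most two instances of process_2 run at the same time, and process_4 starts only after all three partitions are done.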

Dependency for Forked Process within Fork and over Fork Pattern

In this pattern, Process 3 is allowed to depend on Process 1. For example, Process 1 could produce data that Process 3 consumes. Also, Process 3 could depend on Process 2 (Process 2 produces data that Process 3 consumes). Similarly, Process 2b could depend on Process 1, and Process 4 could depend on Process 1. Finally, Process 4 could also depend on Process 2, Process 2b, and Process 3.
The following display shows the dependency for forked process within fork and over fork pattern:
Dependency for Forked Process within Fork and over Fork

Implicit Fork or Join Pattern

In this pattern, an implicit fork is created above Process 1 and Process 2. An implicit join is created below Process 1b and Process 2. All fork and join rules apply.
The following display shows the implicit fork or join pattern:
Implicit Fork or Join Pattern

Circular Dependency Pattern

In this pattern, Process 1 can depend on Process 1b and the reverse can also be true. You must ensure that the job runs as desired. If a dependency on another node is not found, you can reconfigure in a number of ways.
The following display shows the circular dependency pattern:
Circular Dependency Pattern

Unsupported Patterns

Dependency between Forks Pattern

In this unsupported pattern, Process 2b cannot depend on Process 2 because the two processes are in the branches of different forks.
The following display shows the unsupported dependency between forks pattern:
Dependency between Forks Pattern

Dependency into Fork Loop or Join Pattern

In this unsupported pattern, Process 4 cannot depend on Process 3 because multiple instances of Process 3 would have run, and there is no way to reference a specific instance. However, Process 4 could depend on Process 1.
The following display shows the unsupported dependency into fork loop or join pattern:
Dependency into Fork Loop or Join Pattern

Sequence between Forks Pattern

In this unsupported pattern, Process 2 cannot enter Process 3 because Process 3 is on a different fork. (It follows Process 2.) Also, Process 2 cannot enter Process 4 because it is outside the fork or join.
The following display shows the sequence between forks pattern:
Sequence between Forks Pattern