
DataFlux Data Management Studio 2.6: User Guide

Running Jobs from the Command Line

Command Line Syntax

You can use the dmpexec command to execute profiles, data jobs, or process jobs from the command line. The executable file for this command is installed with both DataFlux Data Management Studio and DataFlux Data Management Server. The most commonly used options are described below. You can use the -i, -b, and -o options multiple times to set multiple values.

-j file
    Executes the job in the specified file.
    Example: -j "C:\Program Files\DataFlux\DMServer\[instance_name]\var\batch_jobs\TestJob.ddf"

-l file
    Writes the log to the specified file.
    Example: -l "C:\LogLocation\Log.txt"

-c file
    Reads configuration settings from the specified file.
    Example: -c "C:\ConfigLocation\FileName.cfg"

-i key=value
    Specifies job input variables.
    Example: -i "PATH_OUT=C:\TEMP\" -i "FILE_OUT=out.txt"

-o key=value
    Overrides settings in the configuration file.
    Example: -o "MACROVAR=X"
    Example: -o "BASE/LOGCONFIG_PATH=<path>", where <path> is the dmserver etc directory that contains the batch.log.xml file

-b key=value
    Specifies job options for the job being run. Typically used for the limited set of options shown below. Should not be used for options that can be set with the -o option.
    Example: -b "REPOSITORY=TestingRepos", where TestingRepos is the DataFlux Data Management Studio repository in which the job should run
    Example: -b "STATE_MODE=FULL". Possible states are FULL (full state file with worktables), LAST (latest execution of each node with worktables), and STATS (full state file without worktables).
    Example: -b "STATE_DIR=C:\TEMP", where C:\TEMP is the specified directory

-a (no value)
    Attempts to authenticate with the Authentication Server that is specified in the BASE/AUTH_SERVER_LOC option. Can be used alone or combined with the -c option.
    Example: -a
    Example: -a -c "C:\Program Files\DataFlux\DMServer\[instance_name]\etc\batch.cfg"

Command Line Usage Notes

Using the dmpexec Command for DataFlux Data Management Studio

DataFlux jobs are stored in repositories. When you work with jobs interactively in DataFlux Data Management Studio, the parent repository for a selected job is always known to the application. However, when you execute jobs in batch mode, the parent repository for the job is not known to the batch command. Accordingly, in most cases you will specify the parent repository in the batch command that is used to execute a job.

You must specify the repository if the job depends on other items in the repository or uses domain-enabled data connections, as described below.

The repository to use is specified with the BASE/REPOS_FILE_ROOT configuration option. It points to the directory that you normally specify as the File Storage folder when you define the repository in DataFlux Data Management Studio. Here is an example of executing a job that does not have dependencies on a specific repository:

"C:\Program Files (x86)\DataFlux\DMStudio\[instance_name]\bin\dmpexec"
-j "C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"

If the job has dependencies on other items stored in the repository, the invocation would be as follows:

"C:\Program Files (x86)\DataFlux\DMStudio\[instance_name]\bin\dmpexec"
-j "C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"
-o "BASE/REPOS_FILE_ROOT= C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage"

Jobs with Domain-Enabled Data Connections. If the job uses data connections, and the data connections specify domains (you are using Domain-Enabled connections), you will need to specify options so dmpexec knows where to find the authentication provider to use for authentication. The data connection will then request outbound logins from the authentication provider. You can specify the authentication credentials on the command line directly, or you can store them in a separate text file and use the -c option on the command line to point to it. The following example stores the credentials in a separate file called C:\my_auth_credentials.txt that contains the following statements:

BASE/AUTH_SERVER_USER = <userid>
BASE/AUTH_SERVER_PASS = <password>

BASE/AUTH_SERVER_USER and BASE/AUTH_SERVER_PASS specify the user name and password to present when connecting to the authenticating server that is specified by the BASE/AUTH_SERVER_LOC option. Typically, BASE/AUTH_SERVER_LOC will be specified at installation time in the app.cfg file. Note that the BASE/AUTH_SERVER_LOC option can point to a DataFlux Authentication Server or to a SAS Metadata Server. Here is an example of how to run the job using these credentials.

"C:\Program Files (x86)\DataFlux\DMStudio\[instance_name]\bin\dmpexec"
-j "C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"
-o "BASE/REPOS_FILE_ROOT= C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage"
-c "C:\my_auth_credentials.txt"

Using the dmpexec Command for DataFlux Data Management Server

By default, the dmpexec command for DataFlux Data Management Server is installed in the [install]\[instance_name]\bin folder on the server machine. You can use this command to execute jobs with the DataFlux Data Management Server software. Note that the dmpexec command does not communicate with a running DataFlux Data Management Server. It starts its own process using the software that has been installed for DataFlux Data Management Server. If you want to execute jobs on a running server, use the SOAP interface, which is described in the DataFlux Data Management Server: Administrator's Guide.

When using dmpexec on a DataFlux Data Management Server, keep in mind that the dmserver.cfg file will not be used by default. If your job requires any of the configuration options that are specified in this file, you will need to make the needed options available to dmpexec. This can be done by adding the following option to your dmpexec invocation:

-c "<full-path-to-file>\dmserver.cfg"

When using dmpexec, you must specify the location of the repository if the job depends on items in that repository or references other jobs, as described below.

The repository to use is specified with the BASE/REPOS_FILE_ROOT configuration option. In a default installation, this should point to the DataFlux Data Management Server var directory. Here is a sample invocation:

"C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\bin\dmpexec"
-j "C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\var\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"
-o "BASE/REPOS_FILE_ROOT=C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\var"

BASE/REPOS_FILE_ROOT configuration option. When dmpexec runs a job on a DataFlux Data Management Server that references other jobs using relative paths, such as "dfr//…", you must set the BASE/REPOS_FILE_ROOT configuration option and point it to the var directory. In a default deployment, you would specify it as follows:

BASE/REPOS_FILE_ROOT = C:\Program Files\DataFlux\DMServer\[instance_name]\var

Secured DataFlux Data Management Server. When using dmpexec to run jobs on a secured DataFlux Data Management Server, you must use the -a option. The command will not honor the DMSERVER/SECURE=YES configuration option.

The following options enable you to specify authentication credentials.  

BASE/AUTH_SERVER_USER = <userid>
BASE/AUTH_SERVER_PASS = <password>

BASE/AUTH_SERVER_USER and BASE/AUTH_SERVER_PASS specify the user name and password to present when connecting to the authenticating server that is specified by the BASE/AUTH_SERVER_LOC option. Typically, BASE/AUTH_SERVER_LOC will be specified at installation time in the app.cfg file. Note that the BASE/AUTH_SERVER_LOC option can point to a DataFlux Authentication Server or to a SAS Metadata Server.

The BASE/AUTH_SERVER_USER and BASE/AUTH_SERVER_PASS options can be specified in a configuration file and then used as part of a dmpexec command using the -c option, as shown in previous sections. It is also possible to specify them directly on the dmpexec command by using the -b option.
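
For example, here is a sketch that passes the credentials directly with the -b option; the paths, user ID, and password are placeholders:

"C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\bin\dmpexec"
-j "C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\var\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"
-a
-b "BASE/AUTH_SERVER_USER=myuserid"
-b "BASE/AUTH_SERVER_PASS=mypassword"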

Jobs with Domain-Enabled Data Connections. If the job uses data connections, and the data connections specify domains (you are using Domain-Enabled connections), you will need to specify options so that dmpexec knows where to find the authentication provider to use for authentication. The data connection will then request outbound logins from the authentication provider. As described above, you can specify the BASE/AUTH_SERVER_USER and BASE/AUTH_SERVER_PASS credentials directly on the command line, or you can store them in a separate text file and use the -c option to point to that file. Here is an example that stores the credentials in a separate file called C:\my_auth_credentials.txt and then runs the job using those credentials:

"C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\bin\dmpexec"
-j " C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name]\var\batch_jobs\my_data_job.ddf"
-l "C:\TEMP\my_data_job.log"
-o "BASE/REPOS_FILE_ROOT= C:\Program Files\SASHome\DataFluxDataManagementServer\[instance_name\var"
-a
-c "C:\my_auth_credentials.txt"

Run a Job That References Other Jobs, etc.

If you use a dmpexec command to execute a job that references other jobs, profiles, rules, tasks, custom metrics, sources, or fields in a DataFlux repository, then both the job and these objects must reside in the same repository.

Run a Job Whose Run-Time Statistics Will Be Displayed in SAS Job Monitor

SAS® Job Monitor is an optional component in SAS® Environment Manager. It reads job logs at specified locations and displays run-time statistics from the logs. You can use a dmpexec command to execute a job whose run-time statistics will be displayed in SAS Job Monitor; special considerations apply when you do so.

Identify Paths to Jobs

If you will be submitting jobs through the command line on a regular basis, you might want to document the physical paths to the data jobs and process jobs that you work with. The interface displays the paths to these jobs, but only in an abbreviated form. You can perform the following steps to identify the paths to data jobs and process jobs (a command-line alternative is sketched after these steps):

  1. Go to the Administration riser and select the repository that contains the job that you want to run from the command line.
  2. Note the path that is displayed in the File storage field for the repository.
  3. Open Windows Explorer and navigate to the final directory in the path. Then navigate further until you find the folder that contains the job that you need. For example, the fully qualified path to the dfsample_concatenate data job might be similar to the following:

    C:\Documents and Settings\All Users\Application Data\DataFlux\DataManagement\[instance_name]\Repository\Sample\FileStorage\sample\Data Jobs\dfsample_concatenate.ddf.

    Note that the file has a .ddf file extension and is found in the Data Jobs directory. However, a process job from the same repository could have a path similar to the following:

    C:\Documents and Settings\All Users\Application Data\DataFlux\DataManagement\[instance_name]\Repository\Sample\FileStorage\sample\Process Jobs\dfsample_echo.djf.

    This job has a .djf extension and is found in the Process Jobs directory.
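
As an alternative to browsing in Windows Explorer, you can list job files from a command prompt with the standard Windows dir command. In this sketch, the file storage path is a placeholder for the path that you noted in step 2:

dir /s /b "C:\Documents and Settings\All Users\Application Data\DataFlux\DataManagement\[instance_name]\Repository\Sample\FileStorage\*.ddf"
dir /s /b "C:\Documents and Settings\All Users\Application Data\DataFlux\DataManagement\[instance_name]\Repository\Sample\FileStorage\*.djf"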

Note that process jobs that contain referenced jobs in their flows can sometimes fail to execute in batch mode. You can use the -o option to explicitly specify the repository for referenced jobs, as shown in the sketch below. Then you can execute these jobs in batch mode.
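
For example, a sketch that runs a process job with referenced jobs might look like the following; the paths are hypothetical and mirror the Studio examples earlier in this topic:

"C:\Program Files (x86)\DataFlux\DMStudio\[instance_name]\bin\dmpexec"
-j "C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage\sample\Process Jobs\dfsample_echo.djf"
-l "C:\TEMP\dfsample_echo.log"
-o "BASE/REPOS_FILE_ROOT=C:\ProgramData\DataFlux\DMStudio\[instance_name]\Repository\Sample\FileStorage"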

Run a Job from a Command File

A typical approach to running jobs from the command line is to create a .cmd file and add one or more dmpexec commands to that file. For example, you could create a file called runjob.cmd that contains the following syntax:

call dmpexec command1

call dmpexec command2

etc.

To run the commands in the runjob.cmd file, you would enter runjob at the command line. For example, the file to run a data job named dfsample_concatenate.ddf and create a log file would contain the following command:

call dmpexec -l "mylog.txt" -j "Fully_Qualified_Path\dfsample_concatenate.ddf"

By default, the fully qualified path to dmpexec is similar to drive:\Program Files\DataFlux\DMStudio\[instance_name]\bin. Information about finding the fully qualified path to your jobs is available in Identify Paths to Jobs.

Running a process job is similar. You can run a process job called dfsample_echo.djf and create a log file with a .cmd file that contains the following command:

call dmpexec -l "mylog.txt" -j "Fully_Qualified_Path\dmsample_echo.djf"

Run a Profile from a Command File

The command used to run a profile is somewhat different from the command for data jobs and process jobs. An intermediate process job (ProfileExec.djf) is used to run the profile, and the profile is specified by its Batch Run ID.

Profiles are not stored as files. Instead, they are stored as metadata. Accordingly, to run a profile from the command line, you must specify a Batch Run ID for the profile instead of a file path. To find this Batch Run ID, navigate to the Folders riser and select the profile that you need to run. The Batch Run ID is displayed in the Details section of the information pane for the profile.

Here is an example command that could be used in a .cmd file:

call dmpexec -j "install dir\DMStudio\[instance_name]\etc\repositories\ProfileExec.djf" -i "REPOS_NAME=Repository_Name" -i "JOB_ID=Batch Run ID"

Set the Maximum Pooled Process Option

When processes are reused too often, performance can degrade. To control this, you can set the POOLING/MAXIMUM_USE option in the app.cfg file for DataFlux Data Management Studio, which specifies the maximum number of times a pooled process can be used. After the pooled process has been used the specified number of times, it is terminated. For information about the app.cfg file, see "Configuration Files" in the DataFlux Data Management Studio Installation and Configuration Guide.
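
For example, a single line in the app.cfg file such as the following sketch would terminate each pooled process after ten uses. The value 10 is only an illustration; choose a value that suits your environment:

POOLING/MAXIMUM_USE = 10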

Examine Return Codes from Command Line Jobs

You can review the return code from a job that you run from the command line. This code can be useful when you need to troubleshoot a failed job. The return codes are listed in the following table.

Return Code   Description
0             Success
1             Job initialization failure
2             Job was canceled
3             Job failed during execution
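
In a .cmd file, you can test this return code with the standard Windows ERRORLEVEL mechanism. Here is a minimal sketch; the job and log paths are placeholders:

call dmpexec -l "mylog.txt" -j "Fully_Qualified_Path\dfsample_concatenate.ddf"
if %ERRORLEVEL% NEQ 0 (
    echo Job failed with return code %ERRORLEVEL%
) else (
    echo Job completed successfully
)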
