About Scheduling Data Queries

How Does the Scheduling Feature Work?

When you have a data query open in the workspace, you can click the Schedule button to schedule the data query. When you schedule a data query, the data builder performs the following operations:
  1. Creates a job that performs the data query operations.
  2. Creates a deployed job from the job.
  3. Places the job into a new deployed flow.
  4. Schedules the flow on a scheduling server.
You can reschedule the data query based on specified conditions (for example, run immediately or run whenever a trigger condition is met).
The job, deployed job, and deployed flow are metadata objects. The data builder stores them in the same metadata folder as the data query. The metadata objects are named according to the following pattern:
vdb_name_timestamp
Note: Up to 42 characters from the data query name are used for the name portion of the pattern.
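The naming rule can be illustrated with a short Python sketch. It is not the data builder's implementation: `build_object_name` is a hypothetical helper, and both the timestamp format and the choice to keep the leading 42 characters are assumptions made for illustration. The documentation states only the pattern and the 42-character limit.

```python
from datetime import datetime

MAX_QUERY_NAME_CHARS = 42  # limit stated in the note above


def build_object_name(query_name: str, when: datetime) -> str:
    """Return a name shaped like vdb_name_timestamp (hypothetical helper)."""
    # Assumption: the leading 42 characters of the query name are kept.
    short_name = query_name[:MAX_QUERY_NAME_CHARS]
    # Assumption: the timestamp is rendered as YYYYMMDDhhmmss.
    # The real format is not documented in this topic.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"vdb_{short_name}_{timestamp}"
```

For example, under these assumptions `build_object_name("SalesQuery", datetime(2024, 1, 2, 3, 4, 5))` yields `vdb_SalesQuery_20240102030405`.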
When the specified conditions are met, the data query is run with the user ID that scheduled it. This is the behavior for the Operating System Services Scheduler.

Edit Data Queries That Are Already Scheduled

If you edit a data query that is already scheduled, you must click the Schedule button again so that the SAS statements for the data query are regenerated and saved.

Caution about Scheduling Data Queries to Run Now

When you schedule a data query, one of the options is to run it immediately. Select Run now in the Schedule window.
Performing the following steps results in an error condition:
  1. Use a SAS data set for the output table of the data query.
  2. Run the data query.
  3. Click the Results tab to look at the output.
  4. Schedule the data query by selecting Run now.
These steps result in an error condition because SAS locks a SAS data set when it is opened for reading. When step 3 is performed, the output table is locked, and no other process can overwrite the output table. The following message is included in the SAS log:
Locked Error Message
ERROR: A lock is not available for OUTPUTTABLE.
ERROR: Lock held by process xxxx.
You can avoid this error condition. If you want to schedule the data query to run now, close it, open it again, and then schedule it to run now. Alternatively, you can schedule the data query to run in the future, and then close the data query.
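If you review SAS logs from scheduled runs, a small script can flag this condition. The following Python sketch assumes that the log is available as plain text and that the message has the form shown above. `find_locked_tables` is a hypothetical helper, not part of any SAS tool.

```python
import re

# Matches the first message shown above and captures the table name.
LOCK_ERROR = re.compile(
    r"^ERROR: A lock is not available for (?P<table>\S+?)\.?\s*$"
)


def find_locked_tables(log_text: str) -> list[str]:
    """Return the table names reported in lock-not-available errors."""
    tables = []
    for line in log_text.splitlines():
        match = LOCK_ERROR.match(line.strip())
        if match:
            tables.append(match.group("table"))
    return tables
```

An empty result means that no lock error of this form was found in the log text.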

Scheduling Preferences

Default Scheduling Server

By default, your deployment includes a server that is named Operating System Services – hostname.example.com. This server is used as the default scheduling server.
Use the Server Manager plug-in to SAS Management Console to identify the scheduling servers that are included in your deployment. You can specify a different scheduling server in your application preferences. Any data queries that you schedule after you specify a different scheduling server will use the new scheduling server.
Some deployments include the Platform Suite for SAS server. To use this server, specify it as the scheduling server in your preferences. Its default name is Platform Process Manager.
In all cases, when you schedule a new data query, the data builder retrieves your default scheduling server and uses that value to look up the scheduling server in SAS metadata. The data builder uses the first server that matches the value in SAS metadata. Including the host name, such as Operating System Services – hostname.example.com, ensures that the data builder uses the server that you specify.
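The effect of the first-match rule can be sketched in Python. The matching rule used here (a registered name matches when it starts with the preference value) is an assumption, because this topic does not state how names are compared. The function name and the host names are hypothetical.

```python
from typing import Optional


def first_matching_server(preference: str, registered: list[str]) -> Optional[str]:
    """Return the first registered server that matches the preference."""
    for name in registered:
        # Assumption: a prefix comparison stands in for the real matching rule.
        if name.startswith(preference):
            return name
    return None


# Hypothetical deployment with two Operating System Services servers.
servers = [
    "Operating System Services - host1.example.com",
    "Operating System Services - host2.example.com",
]
```

Under this assumption, the preference value `Operating System Services` selects the host1 entry only because it is listed first, whereas the full name that includes host2.example.com selects exactly that server.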

Default Batch Server

By default, your deployment includes a server that is named SASApp – SAS DATA Step Batch Server. This server is used as the default batch server.
You can specify a different batch server in your application preferences. Consider the following before you change the default batch server:
  • The batch server must be registered in metadata as a component of a SAS Application Server that you can access.
  • You must specify the same SAS Application Server as your default application server in your preferences.
As with the default scheduling server, the data builder retrieves your default batch server, and uses that value to look up the batch server in SAS metadata the first time you schedule the data query. The data builder uses the first server that matches the value in SAS metadata.

Default Deployment Directory

A deployment directory is a SAS metadata object that represents the following items:
  • the name of the SAS Application Server with which the deployment directory is associated (the default value is SASApp)
  • a name for the deployment directory (the default value is Batch Jobs)
  • the path to the deployment directory (the default value is SAS-config-dir/Lev1/SASApp/SASEnvironment/SASCode/Jobs)
When you schedule a data query, the SAS statements for the data query are saved in a file. The file is saved in the path that is associated with the deployment directory. The file is named based on the same pattern that is described in "How Does the Scheduling Feature Work?"
The data builder looks up the SAS Application Server in the SAS Metadata Server using your default application server preference setting. The initial value is SASApp. If a matching server name is not found, then the data builder uses the first application server that is returned. After the server is determined, the data builder looks up the deployment directory in that server context that matches your deployment directory preference setting. If a matching deployment directory is not found, then the data builder uses the first deployment directory that is returned.
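The two-stage lookup with its fallback to the first returned item can be sketched in Python as follows. The function names and the dictionary that stands in for SAS metadata are hypothetical, and exact name comparison is an assumption.

```python
def resolve_with_fallback(preferred: str, returned: list[str]) -> str:
    """Return the preferred name if present, else the first name returned."""
    if not returned:
        raise LookupError("no candidates were returned")
    return preferred if preferred in returned else returned[0]


def resolve_deployment_directory(
    server_pref: str,
    directory_pref: str,
    metadata: dict[str, list[str]],
) -> tuple[str, str]:
    """Resolve the application server first, then a directory within it.

    `metadata` maps each application server name to its deployment
    directory names, in the order in which they are returned.
    """
    server = resolve_with_fallback(server_pref, list(metadata))
    directory = resolve_with_fallback(directory_pref, metadata[server])
    return server, directory
```

With `{"SASApp": ["Batch Jobs", "Nightly"]}` as the stand-in metadata, a server preference that matches nothing still resolves to SASApp, and a directory preference that matches nothing resolves to Batch Jobs.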
You can specify a different name for the default deployment directory. For more information about deployment directories and using the Server Manager plug-in to SAS Management Console, see Scheduling in SAS.

When Are the Scheduling Preferences Used?

Any of the preferences that you change are used the next time you create a data query and schedule it. If you edit an existing data query that is already scheduled, the existing settings for the scheduling server, batch server, and deployment directory are not updated with the changes. To change the settings for existing data queries that are already scheduled, use SAS Management Console to redeploy the deployed job for the data query.