Processing Multiple Files of Data

Overview of Handling Large Files of Data

If you have multiple input files to stage each day, it can be time-consuming to run them all in sequence, one after the other. It is also laborious to create and maintain unique staging jobs for them. In some circumstances, it might be more efficient to stage the multiple files at the same time using the same staging job. To do so, you need to create mirror image copies of the staging library (and spin library, if required) that is associated with the staging job. You can then run the staging job multiple times, overriding the location of the libraries with paths to the locations of the copies that you made. This enables you to stage multiple input files at the same time.
Note: This feature does not include explicit grid enablement or the use of MPConnect.
The target staged tables can be combined into a single table or view for subsequent processing. This table or view can then be input to the aggregation job. The following flowchart shows this process:
[Flowchart: Large Data Process]
The following steps describe how to process multiple files of data. Detailed information about these processes is provided for each adapter following the general overviews.
Overview of Staging the Data
  1. Set up a staging job, the target staged table, and the appropriate staging library. Deploy the job.
  2. For every one of the multiple input files that you want to run at the same time, make a copy of the staging library. For example, if you want to process ten files of raw data at the same time, create ten mirror copies of the staging library. You can use the %RMMKLIKE macro to copy the staging libraries. For information about this macro, see %RMMKLIKE.
    Note: In situations where you have a large number of input files, you might want to consider grouping them into batches before creating the staging libraries. For example, if you have 100 files, you could separate them into batches of ten files each. Then, you would need to create only ten sets of libraries.
    If the original staging job generated a spin library, make copies of that library too. If your staging job writes target staged tables to multiple SAS libraries, you can use the same technique to override one or all of the SAS libraries at execution time.
    Note: The additional staging libraries can be located wherever you choose. However, make sure that you have Write access to the location and that there is sufficient disk space available at that location. (These objects should not be represented in the metadata library.)
  3. When the original job is executed, it ordinarily uses the source and target specifications, such as file locations and SAS libraries, that are defined in the metadata. You can override either the input or the output specification, but ordinarily you would override both. By overriding both locations, you can have parallel alternative locations for input and output. The examples here show how to override both input and output specifications in parallel.
    For each of the multiple input files that you want to run, you need to redirect the source specification to your chosen input file. You also need to redirect the target staged table to the paths where the mirror copies of your staged libraries are located. For each of the multiple runs of the job, precede the deployed job code with SAS statements that redirect both input and output.
    You can use any of the following methods to override the paths to the source and target specification:
    • Write a SAS program that assigns the libref and then uses %INCLUDE to include the deployed code.
    • Run the deployed code with an autoexec file that contains the FILENAME or LIBNAME statements that will redirect the source and target locations.
    • Run the deployed code with SAS invocation-time options that assign the filerefs and librefs before running the deployed code.
    Note: In the following sections that provide detailed instructions for each of the adapters, only the first method is shown.
    Do not override any pre-assigned SAS libraries.
  4. Run all of the staging jobs (ten, in this example) at the same time.
  5. After all the staging jobs have completed successfully, the target staged tables can be input to the aggregation jobs.
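The override described in step 3 can be parameterized so that a single wrapper program serves every parallel run. The following is a minimal sketch under stated assumptions, not the product's prescribed method: the batch identifier is passed with the SAS SYSPARM system option and read through the automatic macro variable SYSPARM; the directory layout, the RAWDATA fileref, and the deployed job filename are illustrative; and STGnnnn stands for whatever libref your deployed job uses.

```sas
/* Wrapper for one parallel staging run (sketch; all paths are hypothetical). */
/* Start each run with a different value, for example: -sysparm 01            */
%let batch = &sysparm;

/* Redirect the input to the raw data for this batch. */
FILENAME RAWDATA "C:\Input\batch&batch";

/* Redirect the output to this run's mirror copy of the staging library. */
LIBNAME STGnnnn "C:\Staging\copy&batch";

/* Run the deployed staging job unchanged. */
%INCLUDE 'Original_CSV_Staging.sas';
```

Each SAS session started this way reads from and writes to its own pair of locations, so the sessions do not contend for the same staging library.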
Overview of Aggregating the Data
  1. Combine the multiple staged tables into a single view or table, which can be input to the aggregation job. You can use the %RMCMB macro to combine the staged tables into a view or single table. For more information, see %RMCMB.
  2. If the combined table (or view) resides in the same physical location that the deployed aggregation job uses as input, then you need to run only the deployed aggregation job. If you want the aggregation job to point to an alternate location for input data, then you can override that location at run time just as you can for staging jobs.
    For more information, see Details for Aggregation.
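To illustrate what combining the staged tables involves, here is a generic SAS sketch that builds a DATA step view over two mirror copies of a staging library. It does not use %RMCMB, whose parameters are documented separately. The librefs STG01 and STG02, the paths, and the table name STAGED_TABLE are hypothetical.

```sas
/* Hypothetical librefs for two mirror copies of the staging library. */
LIBNAME STG01 "C:\Staging\copy01";
LIBNAME STG02 "C:\Staging\copy02";

/* DATA step view: rows are read from both copies each time the view is used. */
DATA WORK.COMBINED / VIEW=WORK.COMBINED;
   SET STG01.STAGED_TABLE STG02.STAGED_TABLE;
RUN;
```

A view avoids copying the staged data, but the underlying librefs must be assigned whenever the view is read. A view in WORK lasts only for the current session, so a permanent library would be needed if the aggregation job runs in a separate session.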

Details for Amazon CloudWatch

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_ACW_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path, and the resulting staged tables are written to the alternate output location.
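Before the deployed code is included, it can be useful to confirm that the overriding fileref and libref were actually assigned. The macro below is a hypothetical helper (the name check_overrides and its parameters are not part of the product) that relies on the standard FILEREF and LIBREF functions. The same check could be applied to the other adapters that use a fileref for input.

```sas
/* Hypothetical guard: stop before %INCLUDE if an override is missing. */
%macro check_overrides(fref=, lref=);
   /* FILEREF returns a positive value when the fileref is not assigned. */
   %if %sysfunc(fileref(&fref)) > 0 %then %do;
      %put ERROR: Fileref &fref is not assigned.;
      %abort cancel;
   %end;
   /* LIBREF returns a nonzero value when the libref is not assigned. */
   %if %sysfunc(libref(&lref)) ne 0 %then %do;
      %put ERROR: Libref &lref is not assigned.;
      %abort cancel;
   %end;
%mend check_overrides;
```

To use it, call %check_overrides(fref=RAWDATA, lref=STGnnnn) after the FILENAME and LIBNAME statements and before the %INCLUDE statement. In batch mode, %ABORT CANCEL ends the SAS program, so the deployed job is not run with a missing override.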

Details for ASG TMON2CIC

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref MONICICS. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME MONICICS "C:\Some\Other\Input\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_ASG_TMONCICS_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for ASG TMONDB2

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref TMDBIN. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME TMDBIN "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_ASG_TMONDB2_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for ASG TMONDB2 V5

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref TMD2IN. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME TMD2IN "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_ASG_TMONDB2_V5_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for BMC Mainview IMS

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref IMSLOG. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME IMSLOG "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_BMC_Mainview_IMS_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for BMC Perf Mgr

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_BMC_Perf_Mgr_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for CA TMS

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref TMC. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME TMC "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_CA_TMS_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for Comma Separated Values (CSV)

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_CSV_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for DT Perf Sentry

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_DT_Perf_Sentry_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for DT Perf Sentry with MXG

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref NTSMF. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME NTSMF "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_DT_Perf_Sentry_with_MXG_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for Ganglia

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_Ganglia_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for HP Perf Agent

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_HP_Perf_Agent_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for HP Reporter

Overriding the Source Library at Execution Time

Create a SAS program that redirects the input to either the path of a SAS library or a database library using the libref that was specified in the deployed code. In the example here, the job originally used an input library with the libref HPOVREP. (The libref must match the one used when the job was defined and deployed.) It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
LIBNAME HPOVREP ORACLE PATH=XXX SCHEMA=XXX AUTHDOMAIN="OracleAuth";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_HP_Reporter_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM AS400

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_AS400_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM DCOLLECT

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref DCOLLECT. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME DCOLLECT "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_DCOLLECT_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM EREP

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref EREP. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME EREP "C:\Some\Other\Path" RECFM=S370VB LRECL=16384;
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_EREP_Staging.sas';
Note: On Windows and UNIX, the FILENAME statement for EREP should also specify these SAS options: RECFM=S370VB and LRECL=16384.
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM IMS

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_IMS_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM SMF

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref SMF. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME SMF "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_SMF_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM TPF

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref TPFIN. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME TPFIN "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_TPF_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for IBM VMMON

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref MWINPUT. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME MWINPUT "C:\Some\Other\Path" RECFM=F LRECL=4096;
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_IBM_VMMON_Staging.sas';
Note: On Windows and UNIX, the FILENAME statement for MWINPUT should also specify these SAS options: RECFM=F and LRECL=4096.
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for MS SCOM

Overriding the Source Library at Execution Time

For the MS SCOM adapter, the connection information is overridden. (The SAS library is not overridden.) The input library for MS SCOM has connection information that describes how to connect to the SCOM database. The connection is accomplished by using SQL pass-through.
To change where the input comes from, do not specify a new SAS LIBNAME statement before executing the deployed code. Instead, define the SAS macro variable RM_SCOMConnection with the new connection information. For example, the original (as deployed) SAS library might have been defined as follows:
LIBNAME Srvr2008 ODBC NOPROMPT="dsn=ISD_DWMG02;uid=itmRO;pwd=Original;"
AUTHDOMAIN="DefaultAuth"; 
To override the location of the input data, in this case to specify a different DSN, User ID, and password, define RM_SCOMConnection with this revised connection information:
%LET RM_SCOMConnection=NOPROMPT="dsn=ISD_DWMG04;uid=itmAlt;pwd=Revised;" 
AUTHDOMAIN="DefaultAuth";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_SCOM_Staging.sas';
When this is executed, the data is read from the database using the connection information provided in the RM_SCOMConnection macro variable. The resulting staged tables are written to the alternate output location.

Details for RRDtool

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_RRDtool_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for SAP ERP

Overriding the Source Library at Execution Time

Create a SAS program that redirects the input library to another path using the libref SAP. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
LIBNAME SAP "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_SAP_ERP_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for SAR

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_SAR_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for SNMP

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_SNMP_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for VMware vCenter

Overriding the Source Library at Execution Time

Create a SAS program that redirects the input to either the path of a SAS library or a database library using the libref that was specified in the deployed code. In the example here, the job was originally defined using an input SAS library with the libref VMWARE. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
LIBNAME vmware ODBC DATASRC=VMware_XPDesktop SCHEMA=dbo AUTHDOMAIN="DefaultAuth";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_VMware_vCenter_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for VMware Data Acquisition

Overriding the Source Library at Execution Time

Create a SAS program that redirects the input to either the path of a SAS library or a database library using the libref that was specified in the deployed code. In the example here, the job was originally defined using an input SAS library with the libref VMWARE. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
LIBNAME vmware ODBC DATASRC=VMware_XPDesktop SCHEMA=dbo AUTHDOMAIN="DefaultAuth";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_VMware_Data_Acquisition_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for Web Log

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_Web_Log_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for User-Written Staging Code

Overriding Input Filenames and Output Libraries at Execution Time

Create a SAS program that redirects the input file location to another path using the fileref RAWDATA. It also redirects the output (staged table) library to another path, using the SAS library libref that was used in the original deployed job:
FILENAME RAWDATA "C:\Some\Other\Path";
LIBNAME STGnnnn "C:\Some\Other\Staging\Library";
%INCLUDE 'Original_User_Written_Staging.sas';
When this is executed, the data is read from the user-supplied alternative path and the resulting staged tables are written to the alternate output location.

Details for Aggregation

Overriding Input and Output Libraries at Execution Time

Create a SAS program that redirects the input (staged table) library to another path and the output (aggregation table) library to another path. Both of these are done by using a SAS LIBNAME statement. Each contains the same SAS library libref that was used in the original deployed job:
LIBNAME STGnnnA "C:\Some\Other\Staging\Library";
LIBNAME AGGnnnB "C:\Some\Other\Aggregation\Library";
%INCLUDE 'Original_System_Aggregation.sas';
When this is executed, the data is read from staged tables found in the user-supplied alternative path. The resulting aggregation tables are written to the alternate output location.