Transformations in a SAS Data Integration Studio
job can produce the following types of intermediate files:
- procedure utility files that are
created by the SORT and SUMMARY procedures when these procedures are
used in the transformation
- transformation temporary files
that are created by the transformation as it is working
- transformation output tables that
are created by the transformation when it produces its result; the
output for a transformation becomes the input to the next transformation
in the flow
By default, procedure
utility files, transformation temporary files, and transformation
output tables are created in the WORK library. You can use the -WORK
invocation option to force all intermediate files to a specified location.
You can use the -UTILLOC invocation option to force only utility files
to a separate location.
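As a rough sketch of how these options might be used: the invocation shown in the comment uses hypothetical directory paths, and the two %PUT statements simply report the values that are in effect in the current session.

```sas
/* Hypothetical invocation (command line or configuration file).
   The paths /saswork and /sasutil are placeholders, not defaults:
      sas -work /saswork -utilloc /sasutil                          */

/* Report where intermediate files are written in this session. */
%put NOTE: WORK    = %sysfunc(getoption(work));
%put NOTE: UTILLOC = %sysfunc(getoption(utilloc));
```

Both options take effect only when SAS starts, so a change to either one requires a new server session.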
Knowledge of intermediate
files helps you to perform the following tasks:
- View or analyze the output tables
for a transformation and verify that the output is correct.
- Estimate the disk space that is
needed for intermediate files.
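One way to approach the disk space estimate is to query DICTIONARY.TABLES while the job's output tables still exist. The following is a minimal sketch, not a complete sizing method: it covers only data sets that are in the WORK library at the moment the query runs (procedure utility files are not listed), and the macro variable name work_bytes is an assumption made for this example.

```sas
proc sql;
   /* Size of each data set in WORK, largest first. */
   select memname, nobs, filesize format=sizekmg10.1
      from dictionary.tables
      where libname = "WORK" and memtype = "DATA"
      order by filesize desc;

   /* Total size in bytes, saved in a hypothetical macro variable. */
   select sum(filesize) into :work_bytes trimmed
      from dictionary.tables
      where libname = "WORK" and memtype = "DATA";
quit;

%put NOTE: Data sets in WORK currently use &work_bytes bytes.;
```

Running the query after each major step of a job gives a rough picture of how the space requirement grows through the flow.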
These intermediate files
are usually deleted after they have served their purpose. However,
it is possible that some intermediate files might be retained longer
than desired in a particular process flow. For example, some user-written
transformations might not delete the temporary files that they create.
Utility files are deleted by the SAS procedure that created them. Transformation temporary
files are deleted by the transformation that created them. When a SAS Data Integration
Studio job is executed in batch, transformation output tables are deleted when the
process flow
ends or the current server session ends.
When a job is executed interactively in SAS Data Integration Studio, transformation
output tables
are retained until the Job Editor window is closed or the current server session is
ended in some other way (for example, by selecting
Actions > Stop from the menu). For information about how transformation
output tables can be used to debug the transformations in a job, see
Reviewing Temporary Output Tables. As
long as you keep the job open in the Job Editor window, the output
tables remain in the WORK library on the SAS Workspace Server that
executed the job. If this is not what you want, you can manually delete
the output tables, or you can close the Job Editor window and open
it again, which deletes all intermediate files.
Here is a post-processing
macro that can be incorporated into a process flow. It uses the DATASETS
procedure to delete all data sets in the WORK library, including any
intermediate files that have been saved to the WORK library.
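%macro clear_work;
   %local work_members num_members n;

   /* Collect the names of all data sets in the WORK library. */
   proc sql noprint;
      select memname
         into :work_members separated by ","
         from dictionary.tables
         where libname = "WORK" and
               memtype = "DATA";
   quit;

   /* Split the list into one macro variable per member. */
   data _null_;
      /* Without a LENGTH statement, a variable assigned from SYMGET
         defaults to 200 characters, which would truncate a long list. */
      length work_members $32767 this_member $32;
      work_members = symget("work_members");
      num_members = input(symget("sqlobs"), best.);
      do n = 1 to num_members;
         this_member = scan(work_members, n, ",");
         call symput("member"||trim(left(put(n,best.))), trim(this_member));
      end;
      call symput("num_members", trim(left(put(num_members,best.))));
   run;

   /* Generate one DELETE statement per member; each one needs its
      own terminating semicolon. */
   %if &num_members gt 0 %then %do;
      proc datasets library = work nolist;
         %do n=1 %to &num_members;
            delete &&member&n;
         %end;
      quit;
   %end;
%mend clear_work;
%clear_work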
Note: The previous macro deletes
all data sets in the WORK library.
The transformation output tables for a process flow remain until the SAS session that
is associated with the flow is terminated. Analyze the process flow and determine
whether there are output tables that are not being used (especially if these tables
are large). If so, you can add transformations to the flow that delete these output
tables and free up valuable disk space and memory. For example, you can add a
generated transformation that deletes output tables at a certain point in the flow. For details about generated
transformations, see
Creating and Using a Generated Transformation.
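The code inside such a transformation could be as simple as a PROC DATASETS step that names the tables to remove. The sketch below is illustrative only: the table names stage1_output and stage2_output are hypothetical, and the DATA step exists solely to create stand-in tables so that the example runs on its own.

```sas
/* Stand-ins for transformation output tables (hypothetical names). */
data work.stage1_output work.stage2_output;
   x = 1;
run;

/* Delete only the tables that later steps in the flow no longer need. */
proc datasets library=work nolist;
   delete stage1_output stage2_output;
quit;
```

Unlike the clear_work macro, this approach leaves every other table in the WORK library in place, so it can run in the middle of a flow.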