|
How can I use the DATA step PUT function with a user-written format in a WHERE clause to subset a data set accessed through a SAS/SHARE server? INPUT: I want to use the DATA step PUT function with a user-written
format in a WHERE clause to subset my data set. When I access
the data set through a SAS/SHARE server, I receive FORMAT NOT
FOUND error messages. I understand that this is because the
WHERE expression is evaluated by the server and the format
resides only in my user session, but I'd like to prevent the
server from doing the WHERE clause evaluation or, alternatively,
make my formats accessible to it. What are my options in this
situation?
OUTPUT: In Version 6 of SAS/SHARE software, it is not possible to
control whether the WHERE evaluation takes place in the user's
or the server's session. SAS Institute expects to add this
functionality in Version 7 of the SAS System. Having the WHERE
clause evaluation take place in the server's session optimizes
performance. When the server evaluates a WHERE clause, only the
subset of observations that meet the WHERE criteria are sent via
the REMOTE engine to the user's session. Since this is
typically an "expensive" process, minimizing the number of
observations transmitted improves performance.
It is possible to make your user-written formats available to
the server's session so that the technique of using the PUT
function in a WHERE clause will behave as expected. The rules
controlling the order in which format catalogs are searched are
the same for the server as for a typical user: by default,
WORK.FORMATS is searched first, then LIBRARY.FORMATS, assuming
some SAS library has been associated with the libref of LIBRARY.
Beginning in Release 6.07, additional format catalogs can be
specified with the FMTSEARCH= system option, documented in
Technical Report P-222, Changes and enhancements to Base SAS
Software, Release 6.07. We will examine each of these
alternatives in order.
Alternative 1: WORK.FORMATS - A user's session can access the
server's WORK library using the SLIBREF= option of the LIBNAME
statement. For example:
LIBNAME LIBRARY SERVER=serverid SLIBREF=WORK;
A subsequent:
PROC FORMAT LIBRARY=LIBRARY;
VALUE SEXFMT 1='F' 2='M';
RUN;
places the user-written format in the server's WORK library.
Naming the format in a WHERE clause referencing a data set under
the server's control (e.g. WHERE PUT(SEX,SEXFMT.) EQ 'F')
executes without errors, since the server finds the format in
the WORK.FORMATS catalog. Additionally, a reference in the
user's session to the format, (e.g. PUT SEX SEXFMT.) also
executes correctly, since the user is pointing to the server's
WORK library with the reserved libref of LIBRARY. Note that
with this technique, all users of the server would be sharing
access to the server's WORK.FORMATS catalog. Therefore, each
user should ensure that his or her format names are unique.
Using the &SYSJOBID automatic macro variable to construct the
format name would be one possible method to accomplish this.
Alternative 2: LIBRARY.FORMATS - The most important factor to
keep in mind with this alternative is how the libref LIBRARY is
defined to the server's session. Many applications take
advantage of a server's ability to automatically generate
librefs, but this alternative requires you to use a specific
reserved libref ("LIBRARY") that is already defined to the
server. This kind of library is known as an administrator-
defined library. (For more information on administrator- and
user-defined libraries, refer to SAS/SHARE Software, Usage and
Reference, Version 6, First Edition, Chapter 7, 'Server Library
Reference', and your host-specific SAS/SHARE Technical Report.)
Two common techniques for establishing an administrator-defined
server library are: (1) a LIBNAME statement in the server's
session, and (2) PROC OPERATE with an ALLOCATE LIBRARY
statement. For example:
PROC OPERATE SERVER=serverid;
ALLOC LIB LIBRARY 'data.set.name';
QUIT;
The user then specifies:
LIBNAME LIBRARY SERVER=serverid;
PROC FORMAT LIBRARY=LIBRARY;
VALUE SEXFMT 1='F' 2='M';
RUN;
In this case, both the user and the server locate the format in
the FORMATS catalog of the library allocated with the reserved
libref of LIBRARY. Note that this technique, all users of the
server would be sharing access to the server's LIBRARY.FORMATS
catalog. Therefore, each user should ensure that his or her
format names are unique, just as with Alternative 1.
Alternative 3: FMTSEARCH= - The FMTSEARCH= system option
provides the ability to add other format catalogs to the default
search order and, optionally, to place other catalogs before
WORK.FORMATS and LIBRARY.FORMATS in the search order. The
FMTSEARCH= option must be specified in the server's session
prior to the PROC SERVER statement that starts the server (e.g.
in the server's CONFIG file).
This alternative offers a partial solution to the potential
problem of format name "collisions" noted in the other two
alternatives. By using the FMTSEARCH= alternative, each user
could create and reference formats in a different catalog in a
common library, rather than all users referencing the FORMATS
catalog. However, the server will still have one list of
formats to look through, even though the list may be distributed
through a series of catalogs.
Concatenating a series of catalogs presents a variety of
opportunities for organizing formats. For example, a format
catalog could be assigned to each application, each department,
or some other logical grouping of users. A format catalog could
also be used to group formats according to the kind of data they
process of the kind of output they generate.
For example, adding:
FMTSEARCH=(TEST.USERFMT)
to the server's CONFIG file, then issuing a PROC OPERATE
statement, but for this alternative specifying:
PROC OPERATE SERVER=serverid;
ALLOC LIB TEST 'data.set.name';
to allocate the library to the server. The user can then
submit:
LIBNAME TEST SERVER=serverid;
OPTIONS FMTSEARCH=(TEST.USERFMT);
PROC FORMAT LIBRARY=TEST.USERFMT;
VALUE SEXFMT 1='F' 2='M';
RUN;
In this case, both the user's and the server's sessions locate
the format based on the modified catalog search order specified
by the FMTSEARCH system option.
Typically, using the reserved libref LIBRARY (the second
alternative) provides the most straightforward solution to your
problem, since it (1) requires the least amount of additional
code, (2) requires no changes to (or concern for) the amount of
DASD space allocated to the server's WORK library, and (3) makes
use of "standard" user-written format access rules which many
users are already familiar with.
Beginning in TSLEVEL 415 of Release 6.08 and TSLEVEL 039 of
Release 6.09, enhancements made to the REMOTE engine provide
improved performance to catalogs accessed through a server.
These changes are included with the REMOTE engine by default.
No coding changes are required to either application code or the
PROC SERVER statements to take advantage of the enhanced catalog
access routines.
|