Under certain circumstances, running the same parallel WHERE clause
subset on a table more than once can return slightly different results.
This can occur when the parallel WHERE clause code that is submitted
to SPD Server uses the SAS OBS=nnnn data set option.
The SAS OBS=nnnn data set option causes processing to end with the
specified (nth) observation in a table. Because parallel WHERE clause
processing is threaded, subsetting a table with OBS=nnnn might not
produce identical results from run to run, and different batch jobs
that use the same WHERE clause code might produce slightly different
results.
When a parallel WHERE clause evaluation is split into multiple threads,
SPD Server uses a multi-threading model that is designed to return rows
as fast as possible. Some threads might complete their row scans
slightly faster than other threads because of uneven loads across
multiple processors or system contention issues. These small timing
differences can change the order in which rows are returned, so the
same subsetting request can generate nonidentical results.
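One way to check whether a given job is affected is to run the same subset twice and compare the outputs. The following is a minimal sketch, not a prescribed procedure. It assumes that the libref spds45 is already assigned to an SPD Server domain and that the table testdata has a numeric column j; the output table names run1 and run2 are hypothetical.

```sas
/* Assumption: spds45 is an existing SPD Server libref and      */
/* testdata contains a numeric column j.                        */
/* run1 and run2 are hypothetical output table names.           */
data work.run1;
  set spds45.testdata (obs=1000);
  where j in (1,5,25);
run;

data work.run2;
  set spds45.testdata (obs=1000);
  where j in (1,5,25);
run;

/* Compare the two subsets observation by observation. */
proc compare base=work.run1 compare=work.run2;
run;
```

Whether PROC COMPARE reports any differences depends on system load and thread timing, so a clean comparison on one occasion does not guarantee identical results on the next run.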
If you have code that performs parallel WHERE clause subsetting in
conjunction with the OBS=nnnn data set option, and if it is critical
that successive WHERE clause subsets on the same data are identical,
you can eliminate this thread-related variation by setting the thread
count value for that operation to 1.
To set the SPD Server thread count value, you can use the SPDSTCNT=
macro variable:
%let SPDSTCNT=1;
The same potential for subsetting variation applies when a DATA step
uses the OBS=nnnn data set option with a parallel BY clause, such as:
data test1;
  set spds45.testdata (obs=1000);
  where j in (1,5,25);
  by i;
run;
Set the SPDSTCNT= macro variable to 1 to ensure identical results
across repeated identical table subsetting requests.
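Putting the two pieces together, the setting and the DATA step might be combined as in the following sketch. It reuses the table and column names from the example in this section and assumes that the libref spds45 is already assigned.

```sas
/* Limit SPD Server to a single thread for the steps that follow. */
%let SPDSTCNT=1;

/* Assumption: spds45 is an existing SPD Server libref. */
data test1;
  set spds45.testdata (obs=1000);
  where j in (1,5,25);
  by i;
run;
```

Because the macro variable stays set until it is changed, later steps in the same session also run with one thread. This probably costs some performance on large tables, so it might be worth confining the setting to the steps that need repeatable output.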