Under certain circumstances,
it is possible to perform parallel WHERE clause subsetting on a table
more than once and to receive slightly different results. This event
can occur when submitting parallel WHERE clause code to SPD Server
that uses the SAS
OBS=nnnn data set option.
The SAS
OBS=nnnn data set option causes processing to end with
the specified (nth) observation in a table. Because parallel WHERE
clause processing is threaded, subsetting a table and using
OBS=nnnn might not produce identical results from run
to run, or different batch jobs using the same WHERE clause code might
produce slightly different results.
When a parallel WHERE-cause
evaluation is split into multiple threads, SPD Server uses a multi-threading
model that is designed to return rows as fast as possible. Some threads
might be able to complete row scans incrementally faster than other
threads, due to uneven loads across multiple processors or system
contention issues. This inequity can create minute variances that
can generate nonidentical results to the same subsetting request.
If you have code that
performs parallel WHERE clause subsetting in conjunction with the
OBS=nnnn data processing option, and if it is critical
that successive WHERE clause subsets on the same data must be identical,
you can eliminate thread contention error by setting the thread count
value for that operation to 1.
To set the SPD Server
thread count value, you can use the SPDSTCNT= macro:
%let SPDSTCNT=1;
The same potential for
subsetting variation applies when a DATA step uses the
OBS=nnnn data processing option with a parallel by-clause,
such as:
data test1;
set spds45.testdata (obs=1000);
where j in (1,5,25);
by i;
run;
Use the SPDSTCNT= macro
solution to ensure identical results across multiple identical table
subsetting requests.