Under certain circumstances, it is possible to perform parallel
WHERE clause subsetting on a table more than once and to receive slightly different results.
This event can occur when submitting parallel WHERE clause code that uses the SAS
OBS= data set option to
SPD Server .
The SAS OBS= data set option causes processing to end with the specified (nth) row
in a table. Because parallel WHERE clause processing is threaded, subsetting a table
and using OBS= might not produce identical
results from run to run. Different batch jobs using the same WHERE clause code might
produce slightly different results.
When a parallel WHERE-clause evaluation is split into multiple threads, the server
uses a multi-
threading model that is designed to return rows as fast as possible. Some threads might be
able to complete row scans incrementally faster than other threads, due to uneven
loads across multiple processors or system contention issues. This inequity can create
minute variances that can generate nonidentical results to the same subsetting request.
If you have code that performs parallel WHERE clause subsetting in conjunction with
the OBS= data processing option, and if it is critical
that successive WHERE clause subsets on the same data must be identical, you can eliminate
thread contention error by setting the thread count value for that operation to 1.
To set the server thread count value, you can use the SPDSTCNT
macro variable:
%let SPDSTCNT=1;
The same potential for
subsetting variation applies when a DATA step uses the
OBS=nnnn data
processing option with a parallel by-clause, such as the in this example:
data test1;
set spds45.testdata (obs=1000);
where j in (1,5,25);
by i;
run;
Use the SPDSTCNT macro
solution to ensure identical results across multiple identical table
subsetting requests.