Problem Note 49352: MP CONNECT programs that use SASESOCK pipeline parallelism might not respond because of blocking
Piping has a well-known limitation: a process can write only a finite amount of data to a pipe, after which it must wait until the pipe reader consumes some of the data and makes room in the TCP/IP buffers for more data to be written. When a group of parallel processes simply stops working because of this limitation, the condition is called blocking (often referred to as hanging). For an example of such a process, with a detailed explanation, see the Full Code tab.
Whether a program appears to be blocked (or to be hung) depends on a number of factors, including the following:
- the program design
- the data and order of the data
- the TCP/IP buffer sizes
However, a change in the internal SAS® TCP/IP routines in SAS® 9.2 can prevent some programs that ran successfully prior to SAS 9.2 from running successfully in SAS 9.2 and later.
Click the Hot Fix tab in this note to access the hot fix for this issue.
Note: Even after you apply the hot fix, your programs still might experience the blocking that is inherent in the program design, causing the programs to fail.
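For contrast with the blocking case, the following is a minimal sketch of a pipe that does not block: one writer, one reader, and a reader that consumes records as soon as they arrive. The session names wtask and rtask, the libref names, the port 15010, and the record count are all assumptions made for illustration; they are not part of the original note.

```sas
/* Hypothetical session names: wtask writes the pipe, rtask reads it */
signon wtask sascmd="!sascmd";
signon rtask sascmd="!sascmd";

rsubmit wtask wait=no;
   /* Port 15010 is an assumed free port */
   libname outp sasesock ":15010" timeout=300;
   data outp.dk;
      do i=1 to 1000;
         output;
      end;
   run;
endrsubmit;

rsubmit rtask wait=no;
   libname inp sasesock ":15010" timeout=300;
   /* The reader drains the pipe continuously, so the writer never waits for long */
   data work.copy;
      set inp.dk;
   run;
endrsubmit;

waitfor _all_ wtask rtask;
signoff wtask;
signoff rtask;
```

Because the only reader starts consuming immediately, the TCP/IP buffers are emptied as fast as they are filled. Blocking becomes possible only when a reader postpones reading one pipe until another pipe has been read completely.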
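Operating System and Release Information
Product Family: SAS System | Product: SAS/CONNECT
System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
z/OS | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |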
Z64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft® Windows® for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2003 Datacenter Edition | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2003 Enterprise Edition | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2003 Standard Edition | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2003 for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2008 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows Server 2008 for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Microsoft Windows XP Professional | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Enterprise 32 bit | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Enterprise x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Home Premium 32 bit | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Home Premium x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Professional 32 bit | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Professional x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Ultimate 32 bit | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows 7 Ultimate x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows Vista | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Windows Vista for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
64-bit Enabled AIX | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
64-bit Enabled HP-UX | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
64-bit Enabled Solaris | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
HP-UX IPF | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Linux | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Linux for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
Solaris for x64 | 9.3_M1 | 9.4 | 9.3 TS1M1 | 9.4 TS1M0 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
In the following sample code, the task0 and task1 steps each write every input record simultaneously to two pipes: task0 writes to p1.dk and p2.dk, and task1 writes to m1.dk and m2.dk. Task2 and task3 each read two pipes with a SET statement: task2 reads p1.dk and m1.dk, and task3 reads p2.dk and m2.dk. The SET statement must read all the records from the first pipe before it reads any data from the second, so no data is read from m1.dk or m2.dk until all the data from p1.dk and p2.dk has been read. This is not a problem as long as task2 and task3 can keep writing their output without waiting. However, their output goes to two more pipes, p5.dk and p6.dk, which are in turn read by a SET statement in task4. Task4 does not read any data from p6.dk until it has read all the data from p5.dk. So, once the buffers for p6.dk are full, task3 has to stop writing to p6.dk while it is still reading from p2.dk and m2.dk. Task3 therefore stops reading from p2.dk and m2.dk, which in turn means that task0 and task1 have to stop writing to p2.dk and m2.dk. Because task0 and task1 write each record to both of their pipes, they also stop feeding p1.dk and m1.dk, so task2 cannot finish writing p5.dk. Everything is blocked waiting for task4 to read some data from p6.dk.
Note: As shown, the code below runs successfully. To reproduce the blocking, raise the upper limit on the two outer DO statements to a higher value. The value of 1500 that is suggested in the comments definitely shows the problem.
libname detal 'c:\temp';

/* Test data set 1: each record holds 100 one-character variables set to 'a' */
data detal.atest1;
   array a {100} $1;
   do i=1 to 50; /* set 1500 to replicate problem */
      do j=1 to 100;
         a{j}='a';
      end;
      output;
   end;
run;

/* Test data set 2: same layout, filled with 'b' */
data detal.atest2;
   array a {100} $1;
   do i=1 to 50; /* set 1500 to replicate problem */
      do j=1 to 100;
         a{j}='b';
      end;
      output;
   end;
run;
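Because the record count has to be changed in two places to switch between the working case and the blocking case, it can be convenient to drive both DATA steps from one setting. The following is a minimal sketch of that idea. The macro variable nrec and the macro make_test are hypothetical names introduced here; they do not appear in the original sample.

```sas
/* Hypothetical setting: 50 runs cleanly, 1500 reproduces the blocking */
%let nrec = 1500;

/* Hypothetical helper: builds one test data set of &nrec records, */
/* each holding 100 one-character variables set to the fill value  */
%macro make_test(dsn, fill);
   data &dsn;
      array a {100} $1;
      do i=1 to &nrec;
         do j=1 to 100;
            a{j}="&fill";
         end;
         output;
      end;
   run;
%mend make_test;

libname detal 'c:\temp';
%make_test(detal.atest1, a)
%make_test(detal.atest2, b)
```

The two macro calls produce the same data sets as the two DATA steps above, so the rest of the sample can be run unchanged against either record count.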
signon task0 sascmd="!sascmd";
signon task1 sascmd="!sascmd";
signon task2 sascmd="!sascmd";
signon task3 sascmd="!sascmd";
signon task4 sascmd="!sascmd";
rsubmit task0 wait=no;
libname detal 'c:\temp';
libname p1 sasesock ":15001" timeout=300;
libname p2 sasesock ":15002" timeout=300;
data p1.dk p2.dk;
set detal.atest1;
run;
endrsubmit;
/* task1: write every record of atest2 to both the m1 and m2 pipes */
rsubmit task1 wait=no;
   libname detal 'c:\temp';
   libname m1 sasesock ":13001" timeout=300;
   libname m2 sasesock ":13002" timeout=300;
   data m1.dk m2.dk;
      set detal.atest2;
   run;
endrsubmit;

/* task2: read all of p1, then all of m1, and write the result to the p5 pipe */
rsubmit task2 wait=no;
   libname p1 sasesock ":15001" timeout=300;
   libname m1 sasesock ":13001" timeout=300;
   libname p5 sasesock ":15005" timeout=300;
   data p5.dk;
      set p1.dk m1.dk;
      _kor ='1';
   run;
endrsubmit;

/* task3: read all of p2, then all of m2, and write the result to the p6 pipe */
rsubmit task3 wait=no;
   libname p2 sasesock ":15002" timeout=300;
   libname m2 sasesock ":13002" timeout=300;
   libname p6 sasesock ":15006" timeout=300;
   data p6.dk;
      set p2.dk m2.dk;
      _kor ='2';
   run;
endrsubmit;
rsubmit task4 wait=no;
libname detal 'c:\temp';
libname p5 sasesock ":15005" timeout=300;
libname p6 sasesock ":15006" timeout=300;
data detal.wynik1;
set P5.DK P6.DK;
run;
endrsubmit;
waitfor _all_ task0 task1 task2 task3 task4;
rget _all_ task0 task1 task2 task3 task4;
signoff task0;
signoff task1;
signoff task2;
signoff task3;
signoff task4;
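The circular wait in this sample comes from task4 reading p5.dk and p6.dk one after the other. One possible redesign, shown here only as a sketch, is to drain the two pipes in separate sessions at the same time and to concatenate the results on disk afterward. The extra session task5 and the staging data set detal.wynik2 are hypothetical names introduced for this illustration, and the approach has not been verified against every configuration. The statements below would replace the task4 RSUBMIT block and the WAITFOR statement in the sample above.

```sas
/* Hypothetical extra session that reads p6 while task4 reads p5 */
signon task5 sascmd="!sascmd";

rsubmit task4 wait=no;
   libname detal 'c:\temp';
   libname p5 sasesock ":15005" timeout=300;
   /* task4 now reads only the p5 pipe */
   data detal.wynik1;
      set p5.dk;
   run;
endrsubmit;

rsubmit task5 wait=no;
   libname detal 'c:\temp';
   libname p6 sasesock ":15006" timeout=300;
   /* detal.wynik2 is an assumed staging data set for the p6 records */
   data detal.wynik2;
      set p6.dk;
   run;
endrsubmit;

waitfor _all_ task0 task1 task2 task3 task4 task5;

/* Concatenate on the client after both pipes have been fully read; */
/* the p5 records come first, as in the original SET statement      */
proc append base=detal.wynik1 data=detal.wynik2;
run;

signoff task5;
```

With both final pipes consumed concurrently, task3 is no longer forced to wait for task2 to finish, which is the dependency that closes the loop in the original design. The trade-off is one additional SAS session and an extra pass over the staged data on disk.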
Type: | Problem Note |
Priority: | high |
Date Modified: | 2013-03-11 13:33:00 |
Date Created: | 2013-03-05 10:01:56 |