Problem Note 47397: A segmentation violation and other errors might occur when your environment is under a heavy load and when resources are limited
When the SAS Object Spawner is configured with the grid load-balancing algorithm, a segmentation violation might occur, causing the object spawner to stop handling requests. This problem can happen when your environment is heavily loaded and has limited resources available. When it happens, the following errors are written to the SAS Object Spawner log:
ERROR PROVIDER(Platform): Communication time out
ERROR PROVIDER(Platform): No matching job found
. . .gridplat segmentation violation dump. . .
ERROR IOM call failed. Internal server exception: access violation.
ERROR Failed to process peer request.
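To confirm that a spawner log shows this problem, you can count how often each of the messages above appears. The following Python sketch is illustrative only: the function name count_signatures is hypothetical, the matching is a plain substring search, and the exact layout of log lines at your site (timestamps, thread IDs) is assumed not to break up the message text.

```python
from collections import Counter

# Error signatures quoted from this note; matching is a plain substring search.
SIGNATURES = (
    "ERROR PROVIDER(Platform): Communication time out",
    "ERROR PROVIDER(Platform): No matching job found",
    "segmentation violation",
    "ERROR IOM call failed. Internal server exception: access violation.",
    "ERROR Failed to process peer request.",
)


def count_signatures(lines):
    """Count how many lines contain each known error signature."""
    counts = Counter()
    for line in lines:
        for signature in SIGNATURES:
            if signature in line:
                counts[signature] += 1
    return counts
```

Any iterable of strings works as input, so an open log file can be passed directly. Time-out messages without a segmentation violation suggest that only the recommendations below apply.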
Currently, there is no workaround to avoid the segmentation violation. However, you should implement the following recommendations to prevent the communication time-out error:
- Ensure that the LSF master machine is not overloaded.
- Ensure that the network bandwidth between the machines is not overloaded.
- Set the LSB_QUERY_PORT parameter so that the master machine spawns threads instead of processes to handle requests. For details about the LSB_QUERY_PORT parameter, see the Platform LSF Configuration Reference Documentation for the Platform LSF version that you have configured.
- Ensure that the communication time-out parameters (LSB_API_CONNTIMEOUT, LSB_API_RECVTIMEOUT, LSF_API_CONNTIMEOUT, LSF_API_RECVTIMEOUT) are set high enough to tolerate delays under load, but not so high that pending requests overload the system. The default values are 10, 10, 5, and 20 seconds, respectively. For more details about these time-out parameters, see the Platform LSF Configuration Reference Documentation for the Platform LSF version that you have configured.
- Ensure that the number of simultaneous connection requests that are sent to the SAS Object Spawner can be handled by the system and the network resources in a grid environment.
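The parameter checks in the list above can be partly automated. The following Python sketch is one possible approach, not a SAS or Platform tool: it assumes the parameters are stored as KEY=VALUE lines (as in an lsf.conf file), the function names parse_conf and review are hypothetical, and flagging a value at or below its default is only a heuristic. Suitable values depend on your site.

```python
# Default time-out values (seconds) as stated in this note.
DEFAULTS = {
    "LSB_API_CONNTIMEOUT": 10,
    "LSB_API_RECVTIMEOUT": 10,
    "LSF_API_CONNTIMEOUT": 5,
    "LSF_API_RECVTIMEOUT": 20,
}


def parse_conf(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def review(text):
    """Return a list of notes about the parameters named in this note."""
    settings = parse_conf(text)
    notes = []
    if "LSB_QUERY_PORT" not in settings:
        notes.append("LSB_QUERY_PORT is not set")
    for name, default in DEFAULTS.items():
        if name not in settings:
            notes.append(f"{name} is not set (default {default})")
        elif not settings[name].isdigit():
            notes.append(f"{name} has a non-numeric value: {settings[name]!r}")
        elif int(settings[name]) <= default:
            # Heuristic only: a value at or below the default may be too low under load.
            notes.append(f"{name}={settings[name]} is at or below the default {default}")
    return notes
```

For example, passing the contents of your configuration file to review() returns an empty list when LSB_QUERY_PORT is set and all four time-out parameters are set above their defaults.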
Click the Hot Fix tab in this note to access the hot fix for this issue.
This hot fix prevents only the segmentation violation. In addition to applying the hot fix, you should implement the recommendations listed previously in order to prevent the communication time-out error.
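Operating System and Release Information
Product Family: SAS System | Product: SAS Grid Manager |
System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
64-bit Enabled HP-UX | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |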
64-bit Enabled AIX | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
64-bit Enabled Solaris | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
HP-UX IPF | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Linux | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Linux for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Solaris for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft® Windows® for 64-Bit Itanium-based Systems | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 Datacenter 64-bit Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 Enterprise 64-bit Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows XP 64-bit Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft® Windows® for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 Datacenter Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 Enterprise Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 Standard Edition | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2003 for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2008 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows Server 2008 for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Microsoft Windows XP Professional | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Enterprise 32 bit | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Enterprise x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Home Premium 32 bit | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Home Premium x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Professional 32 bit | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Professional x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Ultimate 32 bit | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows 7 Ultimate x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows Vista | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
Windows Vista for x64 | 9.21 | 9.3 | 9.2 TS2M3 | 9.3 TS1M2 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
The following errors occur when the SAS Object Spawner is configured with the grid load-balancing algorithm and is under a heavy load: "ERROR PROVIDER(Platform): Communication time out" and "ERROR PROVIDER(Platform): No matching job found".
Type: | Problem Note |
Priority: | medium |
Date Modified: | 2012-09-05 09:30:01 |
Date Created: | 2012-08-30 15:02:32 |