Usage Note 30952: Tips for addressing unresponsive SAS® 9.1.3 Stored Process Servers, Part 1

Part 1: How to restore unresponsive SAS Stored Process Servers with SAS® 9.1.3

SAS Technical Support has received reports of previously working SAS Stored Process Servers becoming unresponsive over time for unknown reasons. By unresponsive, we mean that the SAS Stored Process Servers are up and running, but no requests from client applications are getting through to the server. These servers might also be referred to as "hung" or "orphaned" SAS processes.

You might have encountered this problem in one or more ways. You might be the end user working with one of the BI client applications, such as SAS® Enterprise Guide®, SAS® Web Report Studio, the SAS® Add-in for Microsoft Office, the SAS® Information Delivery Portal, SAS® Stored Processes Web application, or another Web-based application. You click on a button expecting a report to be returned, but instead you receive a generic error or Java dump. You might be the systems administrator, who gets a call from the end user and determines that there is no stored process server responding, or has at least narrowed the problem down to involve a SAS server rather than a client.

What to do?

  1. For every customer, this starts as a "put out the fire" situation: Technical Support offers suggestions to confirm that the servers are down or unresponsive, and then to clean up and recover from the problem immediately. This document provides tips to evaluate and restore your SAS Stored Process Servers.
  2. Then, a long-term strategy is needed to gather information and determine why the problem occurs. See SAS Note 30716 "Tips for addressing unresponsive SAS® 9.1.3 Stored Process Servers, Part 2" for a suggested approach.

Conducting Short-Term Troubleshooting to "put out the fire"

Use the following five-step approach to evaluate the status of and restore your SAS Stored Process Servers:

  1. Test the connection to the stored process server from the SAS® Management Console.
  2. Check the status of the stored process server ports by running the NETSTAT system command.
  3. Stop the object spawner and look for remaining SAS processes.
  4. Check applicable server log files.
  5. Stop existing stored process servers and restart the object spawner.

Step 1: Test the Connection to the SAS Stored Process Server from the SAS Management Console

A basic test of stored process server functionality is available in the SAS Management Console. To conduct this test, follow the steps below:

  1. Open SAS Management Console. The hierarchy appears in the left pane.
  2. Click + to expand the Server Manager hierarchy.
  3. Click + to expand the SASMain server group hierarchy.
  4. Click + to expand the SASMain – Logical Stored Process Server object.
  5. Select the lowest level SASMain – Stored Process Server.
  6. In the right-hand window, select the connection: SASMain Stored Process Server Bridge.
  7. On the SAS Management Console menu, click Actions.
  8. Click Test Connection.

A box with the message Test Connection Successful! appears if the test was successful.

Customers usually receive the following error in SAS Management Console when their stored process servers are unresponsive:

sam.S251.ex.msg: A problem occurred while connecting to a load balancing spawner. Check the spawner log for more details.

Step 2: Check the Status of the Stored Process Server Ports

The command line tool NETSTAT (network statistics) displays incoming and outgoing network connections, routing tables, and various network interface statistics. This tool is available on UNIX and Windows operating systems. Use this tool to provide information regarding the status of the ports on which a particular stored process server runs, which by default are ports 8611, 8621, and 8631.

Follow these steps:

  1. Execute the following command in a Command prompt window:
    • Windows

      prompt> netstat -ano | find "8611"

      Note: substitute "8621" or "8631" in order to check the other ports.

    • UNIX

      prompt> netstat -an | grep "8611"

      Note: substitute "8621" or "8631" in order to check the other ports.

  2. Evaluate the port status.
    • "LISTENING" (shown as "LISTEN" on UNIX) and "ESTABLISHED" are normal states.
    • "TIME_WAIT" and "CLOSE_WAIT" are normal states if the connection is shutting down.

      Note: A state of "CLOSE_WAIT" that persists for longer than 2-5 minutes might indicate that the server is hung.

      See RFC 793 (pages 20 and 21) for details on the progression of states for a TCP/IP connection.

      The appearance of NETSTAT output when port 8611 is unresponsive is shown in the example below.

      tcp4    1312      0  myserver.na.sas.8611  otherserver.na.sas.56480 CLOSE_WAIT
      tcp4    1254      0  myserver.na.sas.8611  otherserver.na.sas.56487 CLOSE_WAIT
      tcp4    1300      0  myserver.na.sas.8611  otherserver.na.sas.56498 CLOSE_WAIT
      tcp4       0      0  *.8611                *.*                      LISTEN
      tcp4    1280      0  myserver.na.sas.8611  otherserver.na.sas.56805 CLOSE_WAIT
      tcp4    1011      0  myserver.na.sas.8611  otherserver.na.sas.56816 CLOSE_WAIT
      tcp4    1267      0  myserver.na.sas.8611  otherserver.na.sas.56822 CLOSE_WAIT
      tcp4    1234      0  myserver.na.sas.8611  otherserver.na.sas.56825 CLOSE_WAIT
      tcp4    1260      0  myserver.na.sas.8611  otherserver.na.sas.56828 CLOSE_WAIT
      tcp4       0      0  *.8621                *.*                      LISTEN
      tcp4       0      0  *.8631                *.*                      LISTEN
      tcp4    1299      0  myserver.na.sas.8611  otherserver.na.sas.56850 CLOSE_WAIT
      tcp4    1280      0  myserver.na.sas.8611  otherserver.na.sas.56854 CLOSE_WAIT
      tcp4    1311      0  myserver.na.sas.8611  otherserver.na.sas.56865 CLOSE_WAIT
      tcp4    1269      0  myserver.na.sas.8611  otherserver.na.sas.32775 ESTABLISHED
      tcp4    1273      0  myserver.na.sas.8611  otherserver.na.sas.32781 ESTABLISHED
      tcp4    1281      0  myserver.na.sas.8611  otherserver.na.sas.57282 CLOSE_WAIT
      tcp4    1272      0  myserver.na.sas.8611  otherserver.na.sas.32786 ESTABLISHED
      tcp4    1322      0  myserver.na.sas.8611  otherserver.na.sas.57288 CLOSE_WAIT
      tcp4    1284      0  myserver.na.sas.8611  otherserver.na.sas.32795 ESTABLISHED
      tcp4    1302      0  myserver.na.sas.8611  otherserver.na.sas.57298 CLOSE_WAIT
      tcp4    1296      0  myserver.na.sas.8611  otherserver.na.sas.32804 ESTABLISHED               
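
Output like the listing above can be tallied by state instead of being read line by line. The following Python sketch is illustrative only and is not part of any SAS tool; the function name summarize_states and the list of state names are assumptions made for this example. It accepts text captured from either the Windows or the UNIX form of the NETSTAT command.

```python
from collections import Counter

# TCP state names as printed by netstat (Windows prints LISTENING, UNIX prints LISTEN).
STATES = {
    "LISTEN", "LISTENING", "ESTABLISHED", "CLOSE_WAIT", "TIME_WAIT",
    "FIN_WAIT_1", "FIN_WAIT_2", "SYN_SENT", "SYN_RECEIVED", "SYN_RCVD",
    "LAST_ACK", "CLOSING",
}


def summarize_states(netstat_text, port):
    """Count TCP states for connections whose local address ends in `port`."""
    # Local addresses look like host.port on some UNIX systems and host:port elsewhere.
    suffixes = ("." + str(port), ":" + str(port))
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].lower().startswith("tcp"):
            continue
        state = next((f for f in fields if f in STATES), None)
        # Drop queue sizes and PIDs (pure digits) and the state itself; the first
        # remaining field is the local address, the second is the remote address.
        addresses = [f for f in fields[1:] if not f.isdigit() and f not in STATES]
        if state and addresses and addresses[0].endswith(suffixes):
            counts[state] += 1
    return counts
```

A single snapshot cannot show how long a connection has been in "CLOSE_WAIT". One option is to capture the NETSTAT output twice, a few minutes apart, and compare the "CLOSE_WAIT" counts for port 8611, 8621, or 8631.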
      

Step 3: Evaluate SAS Processes After Terminating the Object Spawner

By default, the SAS Stored Process Server is configured to execute using the sassrv user account.

If the NETSTAT command reveals unresponsive servers, you should terminate the object spawner and search for remaining SAS processes that are owned by the sassrv user account. Any SAS processes owned by the sassrv account that persist after the object spawner shuts down are likely to be hung SAS Stored Process Servers (although other possible explanations exist).

Follow these steps:

  1. Stop the object spawner.
    • Windows

      Stop the object spawner service through the Windows Services Manager.

    • UNIX

      Locate the ObjectSpawner.sh file in your SAS Configuration directory. For example:

      SASROOT/BIArch/Lev1/SASMain/ObjectSpawner/ObjectSpawner.sh

      From a system prompt, submit the following:

      prompt> ObjectSpawner.sh stop

  2. Search for SAS processes that persist after you terminate the object spawner and that are owned by the sassrv user account (or the equivalent account at your site):
    • Windows

      Use the Windows Task Manager.

      From the Processes tab, sort the process list by the User Name column and look for processes with an associated user name of sassrv.

    • UNIX

      Use the ps command.

      prompt> ps -ef | grep "sassrv"

    • AIX

      prompt> ps -ef | grep "sassrv" | grep "8611"
      prompt> ps -ef | grep "sassrv" | grep "8621"
      prompt> ps -ef | grep "sassrv" | grep "8631"

    • HP-UX or Solaris

      prompt> ps -ef | grep "sassrv" | grep "sasexe/sas"
      prompt> ps -ef | grep "sassrv"
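
A plain grep for "sassrv" also matches the grep process itself and any command line that happens to contain that string. The Python sketch below is one possible way to filter saved ps -ef output on the owner column only. The function name find_processes_by_owner is hypothetical, and the sketch assumes that the first column of the output is the user name and the second is the process ID, which can differ if your system shows numeric user IDs.

```python
def find_processes_by_owner(ps_text, owner="sassrv"):
    """Return (pid, line) pairs for `ps -ef` rows owned by `owner`.

    Assumes column 1 is the user name and column 2 is the process ID.
    The header row is skipped because its second column is not numeric.
    """
    matches = []
    for line in ps_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == owner and fields[1].isdigit():
            matches.append((int(fields[1]), line.strip()))
    return matches
```

The returned process IDs are the ones to note for Step 5, after you have confirmed that each process really is a stored process server.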
      

Step 4: Check Applicable Server Log Files

When stored process servers become unresponsive, error and warning messages might appear in log files for the object spawner, stored process servers, metadata servers, and (for Windows systems only) the Windows Event Viewer.

You should review all the log file types noted below.

Caution: To preserve these log files and document specific incidents, it is critical that you copy the object spawner logs to backup files before you restart the object spawner. The existing log files are overwritten when the object spawner starts.
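
The copy described in the caution above can be scripted so that it is not skipped during an incident. The following Python sketch is a minimal example under stated assumptions: the function name backup_logs, the timestamped folder name, and the objspawn*.log pattern are choices made for this illustration, and the log directory must be the one used at your site.

```python
import shutil
import time
from pathlib import Path


def backup_logs(log_dir, backup_root, pattern="objspawn*.log"):
    """Copy matching log files into a new timestamped folder under backup_root.

    Returns the list of copied files. The source files are left in place.
    """
    dest = Path(backup_root) / time.strftime("objspawn_logs_%Y%m%d_%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(log_dir).glob(pattern)):
        if src.is_file():
            target = dest / src.name
            shutil.copy2(src, target)  # copy2 also preserves the modification time
            copied.append(target)
    return copied
```

Run the copy after the object spawner is stopped and before it is restarted, and keep the backup folder outside the logs directory.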

Note: Use the SAS BI Color Coding and Reporting Tools to identify errors or warnings in the object spawner, stored process server, or metadata server log files. More information about these tools is available in Usage Note 19889, "SAS® Business Intelligence Color Coding and Reporting Tools are available on the Download site."

Checking Object Spawner Logs

Go to the appropriate folder and view the object spawner log file.

  • Windows

    Windows path: C:\SAS\<projectdir>\Lev1\SASMain\ObjectSpawner\logs

    Windows filename: objspawn.log

  • UNIX

    UNIX path: <projectdir>/Lev1/SASMain/ObjectSpawner/logs

    UNIX filename: objspawn_console.log

    UNIX filename: objspawn.log

    Note: After a stored process server becomes unresponsive, you will find one or more of the following messages written to the object spawner log file named objspawn.log:

    ERROR: The tcpSockWriteVector call failed. The system error is 'Broken pipe'.
    ERROR: Bridge protocol engine socket access method failed to send vector to socket, error 32 (Broken pipe).
    ERROR: The Balance algorithm timed out before a server could be found.
    WARNING: The load balancing instance ldblCompRefDirectConnection call failed.
    WARNING: Unable to redirect the client request.
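
If the SAS BI Color Coding and Reporting Tools are not at hand, a quick scan for the ERROR: and WARNING: prefixes shown above can be sketched in a few lines of Python. This is an illustration, not a SAS utility; the function name find_problems is hypothetical, and the only assumption about the log format is that messages carry one of those two prefixes.

```python
import re

# Matches the severity prefix and captures the message text that follows it.
SEVERITY = re.compile(r"\b(ERROR|WARNING):\s*(.*)")


def find_problems(log_lines):
    """Return (line_number, severity, message) for each ERROR or WARNING line."""
    problems = []
    for number, line in enumerate(log_lines, start=1):
        match = SEVERITY.search(line)
        if match:
            problems.append((number, match.group(1), match.group(2).strip()))
    return problems
```

For example, calling find_problems on the lines of a backup copy of objspawn.log returns the line numbers to look at first. Messages that wrap onto a second line are reported with only their first line.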
    
Checking Stored Process Server Logs

Go to the appropriate folder and view the stored process server log file.

  • Windows

    Windows filename: StoredProcessServer_%.log

    Windows path: C:\SAS\<projectdir>\Lev1\SASMain\StoredProcessServer\logs

  • UNIX

    UNIX filename: StoredProcessServer_%.log

    UNIX path: <projectdir>/Lev1/SASMain/StoredProcessServer/logs

It is possible that the log from an unresponsive stored process server will contain no errors. In this case, you should note the last step/program that executed successfully and the last entry written in the log.

Often the last entry in a hung stored process server log file will note that a request has started executing, similar to the example below:

20070223:11.01.32.78: 00000066: 9:sasdemo: STP: 1: Executing c:\mycode stp_report.sas
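
To check several stored process server logs quickly, the last entry of each can be pulled out and tested for the word "Executing". The Python sketch below assumes the timestamp layout seen in the sample entry above (date, then time with periods, then hundredths of a second); that layout is inferred from the sample rather than taken from SAS documentation, and the function name last_log_entry is hypothetical.

```python
from datetime import datetime


def last_log_entry(log_text):
    """Return (timestamp, line, is_executing) for the last non-blank line, or None."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    if not lines:
        return None
    line = lines[-1]
    stamp = line.split(": ", 1)[0]
    try:
        when = datetime.strptime(stamp, "%Y%m%d:%H.%M.%S.%f")
    except ValueError:
        when = None  # for example, a continuation line that has no timestamp
    return when, line, "Executing" in line
```

A last entry that reports a request as executing, with a timestamp well before the time the problem was noticed, is consistent with the hung-server pattern described above, but it is not proof on its own.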

If servers are unresponsive and you restart the object spawner, entries similar to those below appear in the log file when the stored process server attempts to restart:

20061220:10.18.34.89: 00000061: :: SAS Stored Process Server Version 9.1 ( Build 50 )

20061220:10.18.34.89: 00000061: :: STP: applevel=2
20061220:10.18.35.42: 00000061: :: STP: Server Property: Default Session Timeout = 900
20061220:10.18.35.42: 00000061: :: STP: Server Property: Maximum Session Timeout = 3600
20061220:10.18.35.42: 00000061: :: STP: Default Output Encoding Retrieved from NLS
        Locale: wlatin1
20061220:10.18.35.42: 00000061: :: STP: Server Property: Default Output Encoding = wlatin1
20061220:10.18.35.42: 00000061: :: STP: Server Property: Default Session Cost = 1
20061220:10.18.35.42: 00000061: :: STP: Server Property: Default Context Cost = 100
20061220:10.18.35.42: 00000061:ERROR: Bridge protocol engine socket access method failed to
        bind listen socket, error 10048 (The specified address is already in use.).
20061220:10.18.35.42: 00000005: :: STP: Stored Process Server Shutting Down.
20061220:10.18.35.45: 00000005: :: STP: Stored Process Server Shutdown Complete.

NOTE: The SAS System used:
real time 5.18 seconds
cpu time 1.46 seconds

Note: These errors occur because the hung server is still bound to the designated port that the new server is trying to use. The error messages do not identify the port number that is already in use; they refer only to the "specified address". If the stored process server log contains the statements above, you need to stop the existing stored process servers and restart the object spawner.

For instructions, see the section "Step 5: Stop Existing Stored Process Servers and Restart the Object Spawner" below.
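
Because the bind error does not name the port, it can help to test each stored process server port directly. The Python sketch below tries to bind each port the way a starting server would and reports the ones that are still held. It is a rough check under stated assumptions: the function names are hypothetical, the default port list is the 8611, 8621, and 8631 set mentioned in Step 2, and a bind can also fail for other reasons (for example, insufficient permissions), so treat a result as a hint rather than a diagnosis. Run it on the server machine while the object spawner is stopped.

```python
import socket


def port_is_free(port, host=""):
    """Return True if a TCP listen socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True


def busy_ports(ports=(8611, 8621, 8631), host=""):
    """Return the ports from `ports` that cannot currently be bound."""
    return [port for port in ports if not port_is_free(port, host)]
```

Any port reported by busy_ports while the object spawner is stopped is a candidate for the cleanup in Step 5.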

Checking the Metadata Server Log

Go to the appropriate folder and view the metadata server log file.

  • Windows

    Windows filename: MetadataServer#d#b#y.log

    Windows path: C:\SAS\<projectdir>\Lev1\SASMain\MetadataServer\logs

  • UNIX

    UNIX filename: MetadataServer#d#b#y.log

    UNIX path: <projectdir>/Lev1/SASMain/MetadataServer/logs

    Note: Normally, there are no errors logged by the metadata server when stored process servers become unresponsive. However, you should check this log as a precautionary measure.

Checking the Windows Event Viewer for Errors or Clues (Windows OS only)

The Windows Event Viewer is a Windows System Tool found in the Microsoft Management Console. Follow these steps to check the Windows Event Viewer for errors:

  1. From the Windows Control Panel open Administrative Tools ► Computer Management.
  2. Expand System Tools, and then expand the Event Viewer tool.
  3. From Event Viewer, click on each log file and in the right-hand pane look for any SAS event or other activity that occurred during the same date and time as the unresponsive condition of the stored process servers.

Step 5: Stop Existing Stored Process Servers and Restart the Object Spawner

  1. Retrieve the listing of process IDs for unresponsive stored process servers that you noted after terminating the object spawner.
  2. Stop the unresponsive processes.
    • Windows

      Use Task Manager or the kill command to terminate the unresponsive stored process servers.

    • UNIX

      Use the kill command. See the example below:

      prompt> kill <process ID #>
  3. Restart the object spawner.
    • Windows

      Start the Object Spawner service through the Windows Services manager.

    • UNIX

      Locate the ObjectSpawner.sh file in your SAS Configuration directory. For example:

      !SASROOT/BIArch/Lev1/SASMain/ObjectSpawner/ObjectSpawner.sh

      From a system prompt, submit the following:

      prompt> ObjectSpawner.sh start
  4. Using SAS Management Console, test the connection to the stored process server to ensure that the servers were successfully restored.
  5. Clean up leftover WORK library files. These files accumulate as a result of abnormal termination of the stored process server sessions.
    • Windows

      Use the SAS Disk Cleanup Handler. For further information about this program see SAS Note 8786: SAS Disk Cleanup Handler replaces a SAS 9.1 program called cleanworkpc.sas to clean work directories on Windows systems.

    • UNIX

      Use the cleanwork tool. For further information, see the cleanwork Command documentation.

If the above steps do not restore the SAS Stored Process Servers to normal functionality, please compile results from the tests and the log files referenced in these steps and contact SAS Technical Support for further assistance.

If you are experiencing unresponsive stored process servers using SAS 9.2, please refer to the following notes:

Usage Note 43160: Tips for addressing unresponsive SAS® 9.2 Stored Process Servers, Part 1

Usage Note 43163: Tips for addressing unresponsive SAS® 9.2 Stored Process Servers, Part 2



Operating System and Release Information

Product Family: SAS System
Product: SAS Integration Technologies

System                                                    SAS Release Reported  SAS Release Fixed*
z/OS                                                      9.1 TS1M3
Microsoft® Windows® for 64-Bit Itanium-based Systems      9.1 TS1M3
Microsoft Windows Server 2003 Datacenter 64-bit Edition   9.1 TS1M3
Microsoft Windows Server 2003 Enterprise 64-bit Edition   9.1 TS1M3
Microsoft Windows XP 64-bit Edition                       9.1 TS1M3
Microsoft® Windows® for x64                               9.1 TS1M3
Microsoft Windows 2000 Advanced Server                    9.1 TS1M3
Microsoft Windows 2000 Datacenter Server                  9.1 TS1M3
Microsoft Windows 2000 Server                             9.1 TS1M3
Microsoft Windows 2000 Professional                       9.1 TS1M3
Microsoft Windows NT Workstation                          9.1 TS1M3
Microsoft Windows Server 2003 Datacenter Edition          9.1 TS1M3
Microsoft Windows Server 2003 Enterprise Edition          9.1 TS1M3
Microsoft Windows Server 2003 Standard Edition            9.1 TS1M3
Microsoft Windows XP Professional                         9.1 TS1M3
Windows Vista                                             9.1 TS1M3
64-bit Enabled AIX                                        9.1 TS1M3
64-bit Enabled HP-UX                                      9.1 TS1M3
64-bit Enabled Solaris                                    9.1 TS1M3
HP-UX IPF                                                 9.1 TS1M3
Linux                                                     9.1 TS1M3
OpenVMS Alpha                                             9.1 TS1M3
Tru64 UNIX                                                9.1 TS1M3

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.