Problem Note 58166: A Linux kernel FUTEX_WAIT() problem might cause the SAS® 9.4 Web Application Server to stop responding

The SAS 9.4 Web Application Server intermittently stops responding when you use SAS® 9.4 Enterprise BI Server in a Linux operating environment. When this problem occurs, you might experience various symptoms, including the following:

  • The front-end clients cannot be authenticated when you access web applications that are deployed in SAS 9.4 Enterprise BI Server. Specifically, the logon form—where the client provides user-name and password credentials—is not displayed.
  • Application logging from the unresponsive SAS Web Application Server suddenly stops.
  • The lack of response in the SAS Web Application Server can be resolved either by restarting that web application server (for example, Server1_1) or by restarting the entire SAS® middle tier.
  • Occasionally, the SAS Web Application Server process that hosts the deployed applications begins to respond again on its own.

This problem might affect SAS 9.4 Enterprise BI Server on a Red Hat Enterprise Linux Server 6.7 that has the following Linux kernel version:

Linux server_hostname 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux

However, this problem is not limited to Red Hat Enterprise Linux Server 6.7. It might also affect other versions of Red Hat Enterprise Linux as well as Oracle Linux and SUSE Linux.

This problem is caused by a Linux kernel FUTEX_WAIT() problem. For additional information about the problem, see the Linux kernel commit titled "futex: Ensure get_futex_key_refs() always implies a barrier."

The following sections explain the impact of this problem, how to determine whether you are experiencing this particular issue, and how to resolve the problem.

Impact of This Problem

This problem can be quite serious because it can lead to outages of various applications (such as SAS Web Application Server) and processes (such as a Java Virtual Machine [JVM] process). These applications and processes might become unresponsive and appear to be deadlocked in seemingly impossible situations. A FUTEX_WAIT() call, and any processes that make that call, can remain blocked for a very long time or even indefinitely. The impact is not limited to Java processes that SAS 9.4 Enterprise BI Server uses. This problem can affect any running process that uses a FUTEX_WAIT() call under Linux.

You are most likely to notice the impact of this Linux problem when you use Logon Manager with releases 9.2, 9.3, and 9.4 of SAS Enterprise BI Server. By default, web applications use the form-based authentication that Logon Manager provides. When Logon Manager processes credentials from the front-end client, those credentials are also sent to the SAS® Metadata Server and the back-end Lightweight Directory Access Protocol (LDAP) server (if the user registry is configured) for authentication. Logon Manager is deployed on the Server1_1 SAS Web Application Server. If the Linux kernel FUTEX_WAIT() problem affects this server, front-end user authentication fails for any SAS product or solution (for example, SAS® Visual Analytics) that uses releases 9.2, 9.3, or 9.4 of SAS Enterprise BI Server.

Determining That Your SAS 9.4 Web Application Server Has This Problem

You can confirm that you have the problem that is described above if any of the following are true:

  • The SAS Web Application Server Java process is working (that is, it has not failed).
  • The static Welcome page of SAS Web Application Server is unresponsive.
  • Both the application log and the SAS Web Application Server log are not updated.
  • Requests for SAS Web Application Server thread dumps do not produce thread dumps.
  • The unresponsiveness suddenly, randomly, and unpredictably disappears. For example, the static Welcome page of SAS Web Application Server is unresponsive, but then it suddenly becomes responsive again.
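Some of the checks in this list can be scripted. The following Python code is a minimal sketch, not a SAS-supplied tool. The process ID, the Welcome page URL, the log path, and the thresholds are all hypothetical values that you must supply for your own deployment. The sketch flags a server only when the process is running, the page does not answer, and the log has gone quiet at the same time, which is a stricter test than any single item in the list.

```python
import os
import time
import urllib.error
import urllib.request


def process_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (signal 0 delivers nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def page_responds(url: str, timeout_s: float = 10.0) -> bool:
    """Return True if the URL returns any HTTP response within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except OSError:
        return False  # timeout, refused connection, or similar failure


def log_is_stale(path: str, max_age_s: float = 300.0) -> bool:
    """Return True if the log file has not been modified within max_age_s seconds."""
    return time.time() - os.path.getmtime(path) > max_age_s


def matches_hang_pattern(running: bool, responds: bool, stale: bool) -> bool:
    """Process alive, Welcome page unresponsive, and logging stopped."""
    return running and not responds and stale
```

A caller would pass the results of the three probes to matches_hang_pattern(), for example on a schedule, and treat a True result only as a hint to collect an strace trace. It does not prove that the kernel problem is the cause.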

You can also use the strace utility to trace the interactions between the unresponsive JVM process and the Linux kernel. If you are experiencing the Linux kernel problem, the trace shows that many threads are in a FUTEX_WAIT state, as shown in this example:

. . .many FUTEX_WAIT_PRIVATE calls. . .
12863 15:02:03 futex(0x2b5ca0024144, FUTEX_WAIT_PRIVATE, 31, NULL <unfinished…>
12862 15:02:03 futex(0x2b5bc40ec244, FUTEX_WAIT_PRIVATE, 29, NULL <unfinished…>
. . .many similar FUTEX_WAIT_PRIVATE calls. . .
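In a long trace, it can help to count the blocked threads rather than read them by eye. The following Python code is a minimal sketch that assumes the trace was saved to a text file in the layout shown above (thread ID, time of day, then the system call). The note does not show the strace command that produced this output; options such as -f, -t, -p, and -o would give a similar layout. The function name and the handling of "futex resumed" lines are illustrative assumptions.

```python
import re

# Matches lines such as:
#   12863 15:02:03 futex(0x2b5ca0024144, FUTEX_WAIT_PRIVATE, 31, NULL <unfinished…>
# and accepts both "…" and "..." inside the unfinished marker.
FUTEX_WAIT_LINE = re.compile(
    r"^\s*(?P<tid>\d+)\s+\d{2}:\d{2}:\d{2}\s+"
    r"futex\((?P<addr>0x[0-9a-fA-F]+),\s*FUTEX_WAIT(?:_PRIVATE)?,"
    r".*<unfinished\s*(?:\.\.\.|…)\s*>"
)

# Assumed layout of the line that strace writes when a blocked call returns:
#   12863 15:02:09 <... futex resumed> ) = 0
FUTEX_RESUMED_LINE = re.compile(
    r"^\s*(?P<tid>\d+)\s+\d{2}:\d{2}:\d{2}\s+<\.\.\.\s*futex resumed"
)


def blocked_futex_waiters(lines):
    """Return {thread_id: futex_address} for FUTEX_WAIT calls that never resumed."""
    waiters = {}
    for line in lines:
        started = FUTEX_WAIT_LINE.match(line)
        if started:
            waiters[int(started.group("tid"))] = started.group("addr")
            continue
        resumed = FUTEX_RESUMED_LINE.match(line)
        if resumed:
            # The call returned, so this thread is no longer blocked.
            waiters.pop(int(resumed.group("tid")), None)
    return waiters
```

Passing the lines of a trace file to blocked_futex_waiters() returns the threads that were still waiting when the trace ended. A large number of waiters is consistent with the problem in this note, but idle JVM threads also wait on futexes normally, so the count is only supporting evidence.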

Solution

To avoid this problem, upgrade to a Linux kernel version that includes the fix for your Linux distribution and release.

For example, this release of the Linux kernel for RHEL 6.7 contains the fix:

Linux server_hostname 2.6.32-573.22.1.el6.x86_64 #1 SMP Thu Mar 17 03:23:39 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
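To compare a running kernel against the example release above, you can compare the numeric parts of the release string. The following Python code is a minimal sketch under several assumptions: it handles only Red Hat style release strings (such as those ending in el6.x86_64), it uses a simplified numeric comparison rather than full RPM version ordering, and the function names are illustrative. The release that it uses as a reference is only the example from this note. Other distributions and releases have their own fixed kernels, so confirm the correct version with your Linux vendor.

```python
import platform
import re

# Example fixed kernel quoted in this note for RHEL 6.7.
EXAMPLE_FIXED_EL6 = "2.6.32-573.22.1.el6.x86_64"


def release_key(release: str):
    """Turn '2.6.32-573.22.1.el6.x86_64' into ((2, 6, 32), (573, 22, 1), 'el6')."""
    m = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.(el\d+)", release)
    if m is None:
        raise ValueError(f"not a Red Hat style kernel release: {release!r}")
    version = tuple(int(part) for part in m.group(1).split("."))
    build = tuple(int(part) for part in m.group(2).split("."))
    return version, build, m.group(3)


def is_at_least(running: str, fixed: str) -> bool:
    """Return True if 'running' is the same as or newer than 'fixed'."""
    run_version, run_build, run_dist = release_key(running)
    fix_version, fix_build, fix_dist = release_key(fixed)
    if run_dist != fix_dist:
        raise ValueError("the releases belong to different distribution series")
    return (run_version, run_build) >= (fix_version, fix_build)


if __name__ == "__main__":
    running = platform.release()  # same value that 'uname -r' reports
    try:
        print(running, "at or above example fixed release:",
              is_at_least(running, EXAMPLE_FIXED_EL6))
    except ValueError as err:
        print("cannot compare this kernel with the example:", err)
```

A result of False for the kernel shown earlier in this note (2.6.32-504.8.1.el6.x86_64) means only that it is older than the example release. It is a prompt to check the vendor's errata, not proof that a given system is affected.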


Operating System and Release Information

Product Family   Product          System          Product Release         SAS Release
                                                  Reported    Fixed*      Reported     Fixed*
SAS System       BI Server Tier   Linux for x64   9.4                     9.4 TS1M0
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.