Problem Note 54066: SAS 9.4® ActiveMQ JMS Broker services randomly fail and generate errors
The SAS 9.4 ActiveMQ JMS Broker services randomly fail. When this happens, the broker-services log file (activemq.log) contains messages showing that the . . ./kahadb/lock file could not be locked, followed by transport-connection EOF exceptions, as shown below:
2014-07-15 14:13:59,005 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:09,006 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:19,007 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:29,007 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:39,008 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:49,009 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 14:14:59,009 | INFO | Database /softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/softwares/sas/config/Lev1/Web/activemq/data/kahadb/lock' could not be locked. | org.apache.activemq.store.SharedFileLocker | main
2014-07-15 17:44:59,650 | WARN | Transport Connection to: tcp://203.211.129.192:54085 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///203.211.129.192:54085@61616
2014-07-15 17:44:59,650 | WARN | Transport Connection to: tcp://203.211.129.192:54106 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///203.211.129.192:54106@61616
2014-07-15 17:44:59,650 | WARN | Transport Connection to: tcp://203.211.129.192:55533 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///203.211.129.192:55533@61616
2014-07-15 17:45:01,527 | WARN | Transport Connection to: tcp://203.211.129.192:53689 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///203.211.129.192:53689@61616
2014-07-15 17:45:01,528 | WARN | Transport Connection to: tcp://203.211.129.192:53720 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///203.211.129.192:53720@61616
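To confirm this pattern in a large activemq.log file, the lock-wait and EOFException entries can be counted with a short script. The following is a minimal Python sketch, assuming the pipe-delimited layout shown in the excerpt above (timestamp | level | message | logger | thread). The function names parse_line and summarize are hypothetical and are not part of SAS or ActiveMQ.

```python
from collections import Counter

# Marker strings taken from the log excerpt above.
LOCK_MARKER = "could not be locked"
EOF_MARKER = "java.io.EOFException"


def parse_line(line):
    """Split one activemq.log entry into (timestamp, level, message).

    Returns None for lines that do not follow the pipe-delimited layout.
    """
    parts = [part.strip() for part in line.split(" | ")]
    if len(parts) < 3:
        return None
    return parts[0], parts[1], parts[2]


def summarize(lines):
    """Count lock-wait messages and transport EOF warnings."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        _, level, message = entry
        if LOCK_MARKER in message:
            counts["lock_wait"] += 1
        elif level == "WARN" and EOF_MARKER in message:
            counts["eof"] += 1
    return counts
```

One way to use the sketch is to open the log file, pass the file object to summarize, and compare the two counts. A long run of lock-wait entries matches the symptom described in this note.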
In addition, the following JMS connection exceptions (ending in java.net.ConnectException: Connection refused) are generated in the log file (server.log) for the SASServer1_1 instance of SAS 9.4® Web Application Server:
2014-07-10 11:06:35,441 [Timer-48] WARN [unknown] com.sas.svcs.scheduling.server.impl.ScheduleManager - SAS Distributed In-Process Scheduling exception occurred.
com.atomikos.jms.AtomikosJMSException: Failed to grow the connection pool
at com.atomikos.jms.AtomikosJMSException.throwAtomikosJMSException(AtomikosJMSException.java:55)
at com.atomikos.jms.AtomikosConnectionFactoryBean.throwAtomikosJMSException(AtomikosConnectionFactoryBean.java:174)
at com.atomikos.jms.AtomikosConnectionFactoryBean.createConnection(AtomikosConnectionFactoryBean.java:593)
at com.sas.scheduler.api.servers.ip.engine.mq.JMSClusterSupport.initJMSTopic(JMSClusterSupport.java:240)
at com.sas.scheduler.api.servers.ip.engine.mq.JMSClusterSupport.safeCreateMapMessage(JMSClusterSupport.java:1613)
at com.sas.scheduler.api.servers.ip.engine.mq.JMSClusterSupport.startMasterDetermination(JMSClusterSupport.java:398)
at com.sas.scheduler.api.servers.ip.engine.mq.JMSClusterSupport$2.run(JMSClusterSupport.java:544)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
Caused by: com.atomikos.datasource.pool.CreateConnectionException: error creating JMS connection
at com.atomikos.jms.AtomikosJmsXAConnectionFactory.createPooledConnection(AtomikosJmsXAConnectionFactory.java:61)
at com.atomikos.datasource.pool.ConnectionPool.growPool(ConnectionPool.java:193)
at com.atomikos.datasource.pool.ConnectionPool.borrowConnection(ConnectionPool.java:146)
at com.atomikos.jms.AtomikosConnectionFactoryBean.createConnection(AtomikosConnectionFactoryBean.java:591)
. . .6 additional lines. . .
Caused by: javax.jms.JMSException: Could not connect to broker URL: tcp://sollersva:61616. Reason: java.net.ConnectException: Connection refused
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:35)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:293)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:238)
at org.apache.activemq.ActiveMQXAConnectionFactory.createXAConnection(ActiveMQXAConnectionFactory.java:59)
at com.atomikos.jms.AtomikosJmsXAConnectionFactory.createPooledConnection(AtomikosJmsXAConnectionFactory.java:58)
. . .9 additional lines. . .
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
at java.net.Socket.connect(Socket.java:579)
at org.apache.activemq.transport.tcp.TcpTransport.connect(TcpTransport.java:504)
at org.apache.activemq.transport.tcp.TcpTransport.doStart(TcpTransport.java:467)
at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
at org.apache.activemq.transport.AbstractInactivityMonitor.start(AbstractInactivityMonitor.java:132)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:72)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:273)
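The nested "Caused by:" entries in the trace above lead down to the underlying failure, which here is a refused TCP connection to the broker. The following is a minimal Python sketch that extracts that chain from a Java stack trace. The function name cause_chain and the regular expression are illustrative assumptions, not a SAS-provided tool.

```python
import re

# Matches "package.SomeException: message", optionally prefixed by "Caused by: ".
CAUSE_RE = re.compile(r"^(?:Caused by: )?([\w.$]+(?:Exception|Error)): (.*)$")


def cause_chain(trace_lines):
    """Return (exception class, message) pairs from outermost to root cause."""
    chain = []
    for raw in trace_lines:
        line = raw.strip()
        # Stack frames ("at ...") and blank lines carry no cause information.
        if not line or line.startswith("at "):
            continue
        match = CAUSE_RE.match(line)
        if match:
            chain.append((match.group(1), match.group(2)))
    return chain
```

The last element of the returned list is the root cause. For the trace in this note, that element is java.net.ConnectException with the message "Connection refused", which indicates that nothing was listening on the broker port when the web application server tried to connect.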
To solve this problem, follow these steps, in order, on the middle-tier machine:
1. Stop all of the SASServer instances (for example, SASServer1_1 and SASServerN_1).
2. Stop the SAS 9.4 ActiveMQ JMS Broker services.
3. Implement the suggested changes in SAS KB0036267, "Increasing numbers of Apache ActiveMQ journal log files in certain SAS® 9.4 Enterprise BI and SAS® Visual Analytics environments fill up disk space."
4. Restart the SAS 9.4 ActiveMQ JMS Broker services. Check the activemq.log file to make sure that the services start properly.
5. Restart all of the SASServer instances (for example, SASServer1_1 and SASServerN_1).
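After the broker services are restarted, a quick TCP check can show whether the broker port accepts connections again before the SASServer instances are started. The following is a minimal Python sketch. Port 61616 is taken from the broker URL in the log excerpts in this note, while the function name broker_port_open and the host value passed to it are hypothetical. A successful connection shows only that something is listening on the port, so the activemq.log file should still be checked.

```python
import socket


def broker_port_open(host, port=61616, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (for example, ConnectionRefusedError)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling broker_port_open with the middle-tier host name returns False while the condition described in this note persists, because connections to the broker are refused.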
Operating System and Release Information
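Product Family | Product | System | Product Release Reported | Product Release Fixed* | SAS Release Reported | SAS Release Fixed* |
SAS System | SAS Enterprise BI Server | Microsoft® Windows® for x64 | 9.4 | 9.4 | 9.4 TS1M1 | 9.4 TS1M2 |
SAS System | SAS Enterprise BI Server | Linux for x64 | 9.4 | 9.4 | 9.4 TS1M1 | 9.4 TS1M2 |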
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: | Problem Note |
Priority: | medium |
Date Modified: | 2014-09-12 13:48:49 |
Date Created: | 2014-09-08 13:26:42 |