Problem Note 70193: Failover for the sas-svi-alert service might fail in SAS® Viya® 3.5 environments with RabbitMQ high availability (HA)
Your sas-svi-alert service might fail to fail over to an available node under the following conditions:
- Your SAS Viya environment is configured with a high availability (HA) setup for RabbitMQ across multiple nodes.
- You shut down one of the nodes or one of the nodes becomes unavailable.
In this scenario, the service might remain in a not-ready state because it keeps trying to connect to a node that is no longer available, and the log might show those connection attempts. For example, in the log entry below, the “rabbitmq.node1.sas.com” host name represents a node that is unavailable:
2023-06-05 12:14:32.248 WARN 32183 --- [ssageListener-2] o.s.amqp.rabbit.core.RabbitAdmin : service Failed to declare queue: Queue [name=svi.tdc.ae.error.q, durable=true, autoDelete=false, exclusive=false, arguments={}, actualName=svi.tdc.ae.error.q], continuing... com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - home node 'rabbit@<rabbitmq.node1.sas.com>' of durable queue 'svi.tdc.ae.error.q' in vhost '/' is down or inaccessible, class-id=50, method-id=10)
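To check quickly which node a log like the one above reports as unavailable, a small filter along the following lines might help. This is only a sketch: the function name extract_down_home_nodes is hypothetical, and the pattern assumes the "home node ... is down or inaccessible" wording shown in the entry above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAS Viya): reads log text on stdin and
# prints each distinct RabbitMQ home node that is reported as down.
# Assumes the message wording shown in the log entry above.
extract_down_home_nodes() {
  sed -n "s/.*home node '\([^']*\)' of durable queue.*is down or inaccessible.*/\1/p" | sort -u
}

# Example run against a fragment of the log entry from this note.
printf '%s\n' "reply-text=NOT_FOUND - home node 'rabbit@<rabbitmq.node1.sas.com>' of durable queue 'svi.tdc.ae.error.q' in vhost '/' is down or inaccessible" \
  | extract_down_home_nodes
# prints: rabbit@<rabbitmq.node1.sas.com>
```

To scan a real log, redirect the sas-svi-alert log file into the function; the location of that file depends on your deployment and is not stated in this note.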
In this scenario, the failover fails because some older RabbitMQ queues that do not support platform failover are enabled.
To resolve the issue, disable those queues and then restart the sas-svi-alert service and the RabbitMQ nodes, as needed, to confirm that service failover works. Use the following commands to disable the queues:
export CONSUL_HTTP_TOKEN=$(sudo cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tokens/consul/default/client.token)
source /opt/sas/viya/config/consul.conf
/opt/sas/viya/home/bin/sas-bootstrap-config kv write --force config/svi-alert/amqp.ae-listener-enabled "false"
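Before restarting anything, it can be useful to confirm that the key now holds the expected value. The sketch below assumes that the kv read subcommand of sas-bootstrap-config prints the stored value for a key, and that CONSUL_HTTP_TOKEN and consul.conf have already been set up in the same shell as shown above. The function name ae_listener_disabled and the SAS_BOOTSTRAP_CONFIG variable are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Path to the CLI as used in this note; the variable can be overridden if the
# binary lives elsewhere (the variable name is an assumption for this sketch).
SAS_BOOTSTRAP_CONFIG="${SAS_BOOTSTRAP_CONFIG:-/opt/sas/viya/home/bin/sas-bootstrap-config}"
AE_LISTENER_KEY="config/svi-alert/amqp.ae-listener-enabled"

# Hypothetical helper: returns 0 if the key reads back as "false",
# 1 if it holds any other value, and 2 if the read itself fails.
# Assumes "kv read <key>" prints the stored value on stdout.
ae_listener_disabled() {
  value=$("$SAS_BOOTSTRAP_CONFIG" kv read "$AE_LISTENER_KEY" 2>/dev/null) || return 2
  [ "$value" = "false" ]
}

# Usage (run manually after the commands above):
#   ae_listener_disabled && echo "ae-listener is disabled"
```

If the check succeeds, restart the sas-svi-alert service and the RabbitMQ nodes as needed, and then retest the failover.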
Operating System and Release Information
SAS System | SAS Visual Investigator (on SAS Viya 3.x) | Linux for x64 | 10.8 | | Viya | |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: | Problem Note |
Priority: | medium |
Date Modified: | 2023-06-23 16:11:44 |
Date Created: | 2023-06-19 16:45:41 |