From 19ee29342f8bb573722991b8cbe4503309ad0bf9 Mon Sep 17 00:00:00 2001
From: John Eckersberg <jeckersb@redhat.com>
Date: Fri, 2 Nov 2018 13:12:53 -0400
Subject: [PATCH] rabbitmq-cluster: fix regression in rmq_stop
|
This regression was introduced in PR#1249 (cc23c55). The stop action
was modified to use rmq_app_running in order to check the service
status, which allows for the following sequence of events:
|
- service is started, unclustered
- stop_app is called
- cluster_join is attempted and fails
- stop is called
|
Because stop_app was called, rmq_app_running returns $OCF_NOT_RUNNING
and the stop action is a no-op. This means the Erlang VM continues
running.
|
When the start action is attempted again, a new Erlang VM is launched,
but this VM fails to boot because the old one is still running and is
registered with the same name (rabbit@nodename).
|
This adds a new function, rmq_node_alive, which does a simple eval to
test whether the Erlang VM is up, independent of the rabbit app. The
stop action now uses rmq_node_alive to check the service status, so
even if stop_app was previously called, the Erlang VM will be stopped
properly.
|
Resolves: RHBZ#1639826
---
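Reviewer note (below the "---", so not part of the commit message): the
sketch here is only an illustration of the sequence described above, not
code from the agent. It assumes rmq_app_running greps the output of
application:which_applications() for the rabbit app; fake_ctl, VM_UP and
APP_UP are hypothetical stand-ins for rabbitmqctl and the node state.

```shell
#!/bin/sh
# Illustrative sketch only: models "VM up, rabbit app stopped".
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical state flags: the Erlang VM is up, the rabbit app is not.
VM_UP=1
APP_UP=0

# Hypothetical stand-in for rabbitmqctl; fails when the VM is down.
fake_ctl() {
	[ "$VM_UP" -eq 1 ] || return 1
	case "$2" in
	'ok.')
		echo ok
		;;
	*)
		if [ "$APP_UP" -eq 1 ]; then
			echo '[{rabbit,"RabbitMQ","x.y.z"}]'
		else
			echo '[]'
		fi
		;;
	esac
}
RMQ_CTL=fake_ctl

# Assumed shape of the app-level check: is the rabbit app listed?
app_running() {
	if $RMQ_CTL eval 'application:which_applications().' | grep -q '{rabbit,'; then
		return $OCF_SUCCESS
	fi
	return $OCF_NOT_RUNNING
}

# Node-level check in the spirit of rmq_node_alive: does the VM answer?
node_alive() {
	if $RMQ_CTL eval 'ok.' >/dev/null 2>&1; then
		return $OCF_SUCCESS
	fi
	return $OCF_NOT_RUNNING
}
```

With VM_UP=1 and APP_UP=0, app_running reports $OCF_NOT_RUNNING while
node_alive reports $OCF_SUCCESS, which is why a stop action keyed on the
app-level check would skip stopping a VM that is still running.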
 heartbeat/rabbitmq-cluster | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
|
diff --git a/heartbeat/rabbitmq-cluster b/heartbeat/rabbitmq-cluster
index 78b2bbadf..a2de9dc20 100755
--- a/heartbeat/rabbitmq-cluster
+++ b/heartbeat/rabbitmq-cluster
@@ -188,6 +188,16 @@ rmq_app_running() {
 	fi
 }
 
+rmq_node_alive() {
+	if $RMQ_CTL eval 'ok.'; then
+		ocf_log debug "RabbitMQ node is alive"
+		return $OCF_SUCCESS
+	else
+		ocf_log debug "RabbitMQ node is down"
+		return $OCF_NOT_RUNNING
+	fi
+}
+
 rmq_monitor() {
 	local rc
|
@@ -514,7 +524,7 @@ rmq_stop() {
 		end.
 	"
 
-	rmq_app_running
+	rmq_node_alive
 	if [ $? -eq $OCF_NOT_RUNNING ]; then
 		return $OCF_SUCCESS
 	fi