From 47d75f8de9dc912da035805f141c674885ce432f Mon Sep 17 00:00:00 2001
From: John Eckersberg <jeckersb@redhat.com>
Date: Thu, 16 Jan 2020 10:20:59 -0500
Subject: [PATCH] rabbitmq-cluster: ensure we delete nodename if stop action
 fails

If the stop action fails, we want to remove the nodename from the crm
attribute.  Currently the stop action can fail even though the
rabbitmq server actually stops, which leaves the attribute in place.
If the entire rabbitmq cluster is then stopped, it cannot be started
again: the first node to start sees the stale attribute and believes
at least one other node is still running, so it tries to join an
existing cluster instead of rebootstrapping the cluster from a single
node.
---
 heartbeat/rabbitmq-cluster | 2 ++
 1 file changed, 2 insertions(+)
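
The failure mode described in the commit message can be sketched as a
small standalone simulation.  This is not code from the resource agent:
the temp-file attribute store and the helpers write_nodename,
delete_nodename, start_mode and failing_stop are hypothetical stand-ins,
and the join/bootstrap decision is reduced to "does any other node still
have an attribute".

```shell
#!/bin/sh
# Hypothetical stand-in for the crm attribute store: one node name per line.
ATTR_STORE=$(mktemp)

write_nodename() { echo "$1" >> "$ATTR_STORE"; }

delete_nodename() {
	# Drop the exact line for this node; tolerate the store ending up empty.
	grep -vxF "$1" "$ATTR_STORE" > "$ATTR_STORE.tmp" || true
	mv "$ATTR_STORE.tmp" "$ATTR_STORE"
}

# Simplified start decision: join if any *other* node left an attribute.
start_mode() {
	if grep -vxF "$1" "$ATTR_STORE" | grep -q .; then
		echo join
	else
		echo bootstrap
	fi
}

# Simulated stop that reports failure although the server did stop.
# $2 = "cleanup" models the behaviour this patch adds on the failure path.
failing_stop() {
	if [ "$2" = cleanup ]; then
		delete_nodename "$1"
	fi
	return 1
}

write_nodename rabbit@node2

# Old behaviour: the failed stop leaves the attribute behind.
failing_stop rabbit@node2 nocleanup || true
before=$(start_mode rabbit@node1)

# Patched behaviour: the failed stop still removes the attribute.
failing_stop rabbit@node2 cleanup || true
after=$(start_mode rabbit@node1)

echo "without cleanup: $before"
echo "with cleanup: $after"
rm -f "$ATTR_STORE"
```

Under these assumptions the first start attempt reports "join" because
of the stale attribute, and the second reports "bootstrap" once the
failure path has cleaned it up.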

diff --git a/heartbeat/rabbitmq-cluster b/heartbeat/rabbitmq-cluster
index 7837e9e3c..a9ebd37ad 100755
--- a/heartbeat/rabbitmq-cluster
+++ b/heartbeat/rabbitmq-cluster
@@ -552,6 +552,7 @@ rmq_stop() {
 
 	if [ $rc -ne 0 ]; then
 		ocf_log err "rabbitmq-server stop command failed: $RMQ_CTL stop, $rc"
+		rmq_delete_nodename
 		return $rc
 	fi
 
@@ -565,6 +566,7 @@ rmq_stop() {
 			break
 		elif [ "$rc" -ne $OCF_SUCCESS ]; then
 			ocf_log info "rabbitmq-server stop failed: $rc"
+			rmq_delete_nodename
 			exit $OCF_ERR_GENERIC
 		fi
 		sleep 1