From 9685e8e6bf2896377a9cf0e07a85de5dd5fcf2df Mon Sep 17 00:00:00 2001
From: Michele Baldessari <michele@acksyn.org>
Date: Wed, 12 Jun 2019 12:00:31 +0200
Subject: [PATCH] Simplify podman_monitor()

Before this change podman_monitor() does two things:
\-> podman_simple_status()
    \-> podman inspect {{.State.Running}}
\-> if podman_simple_status == 0 then monitor_cmd_exec()
    \-> if [ -z "$OCF_RESKEY_monitor_cmd" ]; then # so if OCF_RESKEY_monitor_cmd is empty we just return SUCCESS
            return $rc
        fi
        # if OCF_RESKEY_monitor_cmd is set to something we execute it
        podman exec ${CONTAINER} $OCF_RESKEY_monitor_cmd

Let's actually only rely on podman exec as invoked inside monitor_cmd_exec
when $OCF_RESKEY_monitor_cmd is non-empty (which is the default, as it is set to "/bin/true").
When no monitor_cmd is defined, it makes sense to rely on the podman inspect
call contained in podman_simple_status().
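
The resulting control flow can be sketched as a standalone script. This is an
illustration only: the two helpers below are hypothetical stubs that merely
print which path was taken, standing in for the real podman_simple_status()
and monitor_cmd_exec() in heartbeat/podman, and no podman command is run.

```shell
#!/bin/sh
# Sketch only: both helpers are stubs (assumptions for illustration).
# The real implementations live in heartbeat/podman.

podman_simple_status()
{
	# Stand-in for the 'podman inspect' based check.
	echo "inspect"
	return 0
}

monitor_cmd_exec()
{
	# Stand-in for 'podman exec ${CONTAINER} $OCF_RESKEY_monitor_cmd'.
	echo "exec: $OCF_RESKEY_monitor_cmd"
	return 0
}

podman_monitor()
{
	# No monitor command configured: fall back to the inspect-based check.
	if [ -z "$OCF_RESKEY_monitor_cmd" ]; then
		podman_simple_status
		return $?
	fi
	# Otherwise rely on the exec-based check alone.
	monitor_cmd_exec
}

OCF_RESKEY_monitor_cmd="/bin/true"
podman_monitor	# takes the exec path

OCF_RESKEY_monitor_cmd=""
podman_monitor	# takes the inspect path
```

With the default monitor_cmd only the exec path runs, so a monitor operation
no longer pays for an extra inspect call before the exec.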

Tested as follows:
1) Injected the change on an existing bundle-based cluster
2) Observed that monitoring operations kept working okay
3) Restarted rabbitmq-bundle and galera-bundle successfully
4) Killed a container and the monitor failure was correctly detected:
Jun 12 09:52:12 controller-0 pacemaker-controld[25747]: notice: controller-0-haproxy-bundle-podman-1_monitor_60000:230 [ ocf-exit-reason:monitor cmd failed (rc=125), output: cannot exec into container that is not running\n ]
5) Container correctly got restarted after the monitor failure:
haproxy-bundle-podman-1 (ocf::heartbeat:podman): Started controller-0
6) Stopped and removed a container and pcmk detected it correctly:
Jun 12 09:55:15 controller-0 podman(haproxy-bundle-podman-1)[841411]: ERROR: monitor cmd failed (rc=125), output: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container
Jun 12 09:55:15 controller-0 pacemaker-execd[25744]: notice: haproxy-bundle-podman-1_monitor_60000:841411:stderr [ ocf-exit-reason:monitor cmd failed (rc=125), output: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container ]
7) pcmk was able to start the container that was stopped and removed:
Jun 12 09:55:16 controller-0 pacemaker-controld[25747]: notice: Result of start operation for haproxy-bundle-podman-1 on controller-0: 0 (ok)
8) Added 'set -x' to the RA and observed that no 'podman inspect' was invoked during monitoring operations

Signed-off-by: Michele Baldessari <michele@acksyn.org>
---
 heartbeat/podman | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/heartbeat/podman b/heartbeat/podman
index b2b3081f9..a9bd57dea 100755
--- a/heartbeat/podman
+++ b/heartbeat/podman
@@ -255,15 +255,10 @@ podman_simple_status()
 
 podman_monitor()
 {
-	local rc=0
-
-	podman_simple_status
-	rc=$?
-
-	if [ $rc -ne 0 ]; then
-		return $rc
+	if [ -z "$OCF_RESKEY_monitor_cmd" ]; then
+		podman_simple_status
+		return $?
 	fi
-
 	monitor_cmd_exec
 }
 