Blame SOURCES/0019-mdmon-wait-for-previous-mdmon-to-exit-during-takeove.patch

From d2e11da4b7fd0453e942f43e4196dc63b3dbd708 Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Fri, 22 Feb 2019 13:30:27 +0100
Subject: [RHEL7.7 PATCH 19/24] mdmon: wait for previous mdmon to exit during
 takeover

Since the patch c76242c5("mdmon: get safe mode delay file descriptor
early"), safe_mode_delay is set properly by initrd mdmon.  But in some
cases with filesystem traffic since the very start of the system, it
might take a while to transition to clean state.  Because the new
mdmon does not wait for the old one to exit, it might switch
safe_mode_delay back to seconds before the old one exits.  As a
result, two mdmons run concurrently on the same array.

Wait for the old mdmon to exit, pinging it with the SIGUSR1 signal in
case it is sleeping.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
 mdmon.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/mdmon.c b/mdmon.c
index 0955fcc..ff985d2 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -171,6 +171,7 @@ static void try_kill_monitor(pid_t pid, char *devname, int sock)
 	int fd;
 	int n;
 	long fl;
+	int rv;
 
 	/* first rule of survival... don't off yourself */
 	if (pid == getpid())
@@ -201,9 +202,16 @@ static void try_kill_monitor(pid_t pid, char *devname, int sock)
 	fl &= ~O_NONBLOCK;
 	fcntl(sock, F_SETFL, fl);
 	n = read(sock, buf, 100);
-	/* Ignore result, it is just the wait that
-	 * matters
-	 */
+
+	/* If there is I/O going on it might take some time to get to
+	 * clean state. Wait for monitor to exit fully to avoid races.
+	 * Ping it with SIGUSR1 in case it is sleeping */
+	for (n = 0; n < 25; n++) {
+		rv = kill(pid, SIGUSR1);
+		if (rv < 0)
+			break;
+		usleep(200000);
+	}
 }
 
 void remove_pidfile(char *devname)
-- 
2.7.5