SOURCES/0012-imsm-finish-recovery-when-drive-with-rebuild-fails.patch

From a4e96fd8f3f0b5416783237c1cb6ee87e7eff23d Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Fri, 8 Feb 2019 11:07:10 +0100
Subject: [RHEL7.7 PATCH 12/24] imsm: finish recovery when drive with rebuild
 fails

Commit d7a1fda2769b ("imsm: update metadata correctly while raid10 double
degradation") resolves the main IMSM double degradation problems, but it
omits one case: the metadata hangs in the rebuilding state if the drive
under rebuild is removed during recovery from double degradation.

The root cause is that the new map_state is compared with the current
one and, if both are degraded, the code assumes that nothing new happened.

Don't rely on map states; just check whether the device has failed. If the
drive under rebuild fails, finish the migration; in other cases update the
map state only (a second failure means that the destination map state
can't be normal).

To avoid problems with reassembly, move end_migration (called after a
successful recovery from double degradation) after the check that the
recovery has really finished. For details see 7ce057018 ("imsm: fix:
rebuild does not continue after reboot").
Remove the redundant code responsible for finishing the rebuild process;
the function end_migration does exactly the same. Set last_checkpoint to
0 to prepare it for the next rebuild.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
 super-intel.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)
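
The decision logic described in the commit message can be sketched as a small standalone C model. This is a simplified illustration under stated assumptions, not mdadm code: every `sketch_`/`SKETCH_` name is hypothetical, the constants are stand-ins for `DS_FAULTY` and `MaxSector`, and `rebuild_target` plays the role that `map->failed_disk_num` has in the patch. The first function mirrors the new fault branch; the second mirrors the "check if recovery is really finished" loop that now runs before `end_migration`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the mdadm definitions. */
#define SKETCH_DS_FAULTY  1
#define SKETCH_MAX_SECTOR (~0ULL)

enum sketch_action {
	SKETCH_NO_CHANGE,        /* event is not a failure: fault branch skipped */
	SKETCH_END_MIGRATION,    /* the rebuild target itself failed */
	SKETCH_UPDATE_MAP_STATE, /* another member failed during the rebuild */
};

/* Decide how to react to a state change of disk n while rebuilding. */
enum sketch_action sketch_on_disk_event(int n, int state, int rebuild_target)
{
	if (!(state & SKETCH_DS_FAULTY))
		return SKETCH_NO_CHANGE;
	if (n == rebuild_target)
		return SKETCH_END_MIGRATION;
	return SKETCH_UPDATE_MAP_STATE;
}

/* True only when every member reports that its recovery reached the end. */
bool sketch_recovery_finished(const unsigned long long *recovery_start,
			      size_t count)
{
	for (size_t i = 0; i < count; i++)
		if (recovery_start[i] != SKETCH_MAX_SECTOR)
			return false;
	return true;
}
```

In this model, a fault on the rebuild target ends the migration, a fault on any other member only updates the map state, and the decision no longer depends on comparing the old and new map states.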

diff --git a/super-intel.c b/super-intel.c
index d2035cc..38a1b6c 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8560,26 +8560,22 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
 		}
 		if (is_rebuilding(dev)) {
 			dprintf_cont("while rebuilding ");
-			if (map->map_state != map_state)  {
-				dprintf_cont("map state change ");
+			if (state & DS_FAULTY)  {
+				dprintf_cont("removing failed drive ");
 				if (n == map->failed_disk_num) {
 					dprintf_cont("end migration");
 					end_migration(dev, super, map_state);
+					a->last_checkpoint = 0;
 				} else {
-					dprintf_cont("raid10 double degradation, map state change");
+					dprintf_cont("fail detected during rebuild, changing map state");
 					map->map_state = map_state;
 				}
 				super->updates_pending++;
-			} else if (!rebuild_done)
-				break;
-			else if (n == map->failed_disk_num) {
-				/* r10 double degraded to degraded transition */
-				dprintf_cont("raid10 double degradation end migration");
-				end_migration(dev, super, map_state);
-				a->last_checkpoint = 0;
-				super->updates_pending++;
 			}
 
+			if (!rebuild_done)
+				break;
+
 			/* check if recovery is really finished */
 			for (mdi = a->info.devs; mdi ; mdi = mdi->next)
 				if (mdi->recovery_start != MaxSector) {
@@ -8588,7 +8584,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
 				}
 			if (recovery_not_finished) {
 				dprintf_cont("\n");
-				dprintf_cont("Rebuild has not finished yet, map state changes only if raid10 double degradation happens");
+				dprintf_cont("Rebuild has not finished yet");
 				if (a->last_checkpoint < mdi->recovery_start) {
 					a->last_checkpoint =
 						mdi->recovery_start;
@@ -8598,9 +8594,9 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
 			}
 
 			dprintf_cont(" Rebuild done, still degraded");
-			dev->vol.migr_state = 0;
-			set_migr_type(dev, 0);
-			dev->vol.curr_migr_unit = 0;
+			end_migration(dev, super, map_state);
+			a->last_checkpoint = 0;
+			super->updates_pending++;
 
 			for (i = 0; i < map->num_members; i++) {
 				int idx = get_imsm_ord_tbl_ent(dev, i, MAP_0);
-- 
2.7.5