Blob Blame History Raw
xfs_repair: fix progress reporting

RHEL7 note: this is a simplified version of the upstream patch.

The Fixes: commit tried to avoid a segfault in case the progress timer
went off before the first message type had been set up, but this
had the net effect of short-circuiting the pthread start routine,
and so the timer didn't get set up at all and we lost all fine-grained
progress reporting.

The initial problem occurred when log zeroing took more time than the
timer interval.

[RHEL7: we do not explicitly track log zeroing, and instead lump it into
Phase 2, so the next part of the description is not relevant to this
patch.]

So, make a new log zeroing progress item and initialize it when we first
set up the timer thread, to be sure that if the timer goes off while we
are still zeroing the log, it will be initialized and correct.

(We can't offer fine-grained status on log zeroing, so it'll go from
zero to $LOGBLOCKS with nothing in between, but it's unlikely that log
zeroing will take so long that this really matters.)

Reported-by: Leonardo Vaz <lvaz@redhat.com>
Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

It might be nice to add progress reporting for the log zeroing, but that
requires renumbering all these macros, and we don't/can't actually get
any fine-grained progress at all, so probably not worth it.

Index: xfsprogs-4.5.0/repair/progress.c
===================================================================
--- xfsprogs-4.5.0.orig/repair/progress.c
+++ xfsprogs-4.5.0/repair/progress.c
@@ -124,7 +124,13 @@ init_progress_rpt (void)
 	 */
 
 	pthread_mutex_init(&global_msgs.mutex, NULL);
-	global_msgs.format = NULL;
+	/*
+	 * Ensure the format string is not NULL in case the first timer
+	 * goes off before any stage calls set_progress_msg() to set it.
+	 * Choosing PROG_FMT_SCAN_AG lumps log zeroing in with the AG scan
+	 * but that is not expected to take inordinately long.
+	 */
+	global_msgs.format = &progress_rpt_reports[PROG_FMT_SCAN_AG];
 	global_msgs.count = glob_agcount;
 	global_msgs.interval = report_interval;
 	global_msgs.done   = prog_rpt_done;
@@ -170,10 +176,6 @@ progress_rpt_thread (void *p)
 	msg_block_t *msgp = (msg_block_t *)p;
 	__uint64_t percent;
 
-	/* It's possible to get here very early w/ no progress msg set */
-	if (!msgp->format)
-		return NULL;
-
 	if ((msgbuf = (char *)malloc(DURATION_BUF_SIZE)) == NULL)
 		do_error (_("progress_rpt: cannot malloc progress msg buffer\n"));