peterdelevoryas / rpms / qemu

Forked from rpms/qemu 2 years ago
Clone

Blame 0007-mirror-Fix-coroutine-reentrance.patch

b448bf
From: Kevin Wolf <kwolf@redhat.com>
b448bf
Date: Thu, 13 Aug 2015 10:41:50 +0200
b448bf
Subject: [PATCH] mirror: Fix coroutine reentrance
b448bf
b448bf
This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
b448bf
write on target if sectors not allocated"), which was reported to cause
b448bf
aborts with the message "Co-routine re-entered recursively".
b448bf
b448bf
The cause for this bug is the following code in mirror_iteration_done():
b448bf
b448bf
    if (s->common.busy) {
b448bf
        qemu_coroutine_enter(s->common.co, NULL);
b448bf
    }
b448bf
b448bf
This has always been ugly because - unlike most places that reenter - it
b448bf
doesn't have a specific yield that it pairs with, but is more
b448bf
uncontrolled.  What we really mean here is "reenter the coroutine if
b448bf
it's in one of the four explicit yields in mirror.c".
b448bf
b448bf
This used to be equivalent with s->common.busy because neither
b448bf
mirror_run() nor mirror_iteration() call any function that could yield.
b448bf
However since commit dcfb3beb this doesn't hold true any more:
b448bf
bdrv_get_block_status_above() can yield.
b448bf
b448bf
So what happens is that bdrv_get_block_status_above() wants to take a
b448bf
lock that is already held, so it adds itself to the queue of waiting
b448bf
coroutines and yields. Instead of being woken up by the unlock function,
b448bf
however, it gets woken up by mirror_iteration_done(), which is obviously
b448bf
wrong.
b448bf
b448bf
In most cases the code actually happens to cope fairly well with such
b448bf
cases, but in this specific case, the unlock must already have scheduled
b448bf
the coroutine for wakeup when mirror_iteration_done() reentered it. And
b448bf
then the coroutine happened to process the scheduled restarts and tried
b448bf
to reenter itself recursively.
b448bf
b448bf
This patch fixes the problem by pairing the reenter in
b448bf
mirror_iteration_done() with specific yields instead of abusing
b448bf
s->common.busy.
b448bf
b448bf
Cc: qemu-stable@nongnu.org
b448bf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
b448bf
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
b448bf
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
b448bf
Reviewed-by: Jeff Cody <jcody@redhat.com>
b448bf
Message-id: 1439455310-11263-1-git-send-email-kwolf@redhat.com
b448bf
Signed-off-by: Jeff Cody <jcody@redhat.com>
b448bf
(cherry picked from commit e424aff5f307227b1c2512bbb8ece891bb895cef)
b448bf
---
b448bf
 block/mirror.c | 15 ++++++++++-----
b448bf
 1 file changed, 10 insertions(+), 5 deletions(-)
b448bf
b448bf
diff --git a/block/mirror.c b/block/mirror.c
b448bf
index fc4d8f5..b2fb4b9 100644
b448bf
--- a/block/mirror.c
b448bf
+++ b/block/mirror.c
b448bf
@@ -60,6 +60,7 @@ typedef struct MirrorBlockJob {
b448bf
     int sectors_in_flight;
b448bf
     int ret;
b448bf
     bool unmap;
b448bf
+    bool waiting_for_io;
b448bf
 } MirrorBlockJob;
b448bf
 
b448bf
 typedef struct MirrorOp {
b448bf
@@ -114,11 +115,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
b448bf
     qemu_iovec_destroy(&op->qiov);
b448bf
     g_slice_free(MirrorOp, op);
b448bf
 
b448bf
-    /* Enter coroutine when it is not sleeping.  The coroutine sleeps to
b448bf
-     * rate-limit itself.  The coroutine will eventually resume since there is
b448bf
-     * a sleep timeout so don't wake it early.
b448bf
-     */
b448bf
-    if (s->common.busy) {
b448bf
+    if (s->waiting_for_io) {
b448bf
         qemu_coroutine_enter(s->common.co, NULL);
b448bf
     }
b448bf
 }
b448bf
@@ -203,7 +200,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
b448bf
     /* Wait for I/O to this cluster (from a previous iteration) to be done.  */
b448bf
     while (test_bit(next_chunk, s->in_flight_bitmap)) {
b448bf
         trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
b448bf
+        s->waiting_for_io = true;
b448bf
         qemu_coroutine_yield();
b448bf
+        s->waiting_for_io = false;
b448bf
     }
b448bf
 
b448bf
     do {
b448bf
@@ -239,7 +238,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
b448bf
          */
b448bf
         while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
b448bf
             trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
b448bf
+            s->waiting_for_io = true;
b448bf
             qemu_coroutine_yield();
b448bf
+            s->waiting_for_io = false;
b448bf
         }
b448bf
         if (s->buf_free_count < nb_chunks + added_chunks) {
b448bf
             trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
b448bf
@@ -333,7 +334,9 @@ static void mirror_free_init(MirrorBlockJob *s)
b448bf
 static void mirror_drain(MirrorBlockJob *s)
b448bf
 {
b448bf
     while (s->in_flight > 0) {
b448bf
+        s->waiting_for_io = true;
b448bf
         qemu_coroutine_yield();
b448bf
+        s->waiting_for_io = false;
b448bf
     }
b448bf
 }
b448bf
 
b448bf
@@ -506,7 +509,9 @@ static void coroutine_fn mirror_run(void *opaque)
b448bf
             if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
b448bf
                 (cnt == 0 && s->in_flight > 0)) {
b448bf
                 trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
b448bf
+                s->waiting_for_io = true;
b448bf
                 qemu_coroutine_yield();
b448bf
+                s->waiting_for_io = false;
b448bf
                 continue;
b448bf
             } else if (cnt != 0) {
b448bf
                 delay_ns = mirror_iteration(s);