4841a6
From 6348063b91b2370cc27153fd58fd11a6681631f6 Mon Sep 17 00:00:00 2001
4841a6
From: Hanna Reitz <hreitz@redhat.com>
4841a6
Date: Wed, 16 Feb 2022 11:53:53 +0100
4841a6
Subject: [PATCH 22/24] block: Make bdrv_refresh_limits() non-recursive
4841a6
MIME-Version: 1.0
4841a6
Content-Type: text/plain; charset=UTF-8
4841a6
Content-Transfer-Encoding: 8bit
4841a6
4841a6
RH-Author: Hanna Reitz <hreitz@redhat.com>
4841a6
RH-MergeRequest: 189: block: Make bdrv_refresh_limits() non-recursive
4841a6
RH-Commit: [1/3] 1a1fe37f8d8f0344dd8639d6cc9d884d1aff9096
4841a6
RH-Bugzilla: 2072932
4841a6
RH-Acked-by: Eric Blake <eblake@redhat.com>
4841a6
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
4841a6
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
4841a6
4841a6
bdrv_refresh_limits() recurses down to the node's children.  That does
4841a6
not seem necessary: When we refresh limits on some node, and then
4841a6
recurse down and were to change one of its children's BlockLimits, then
4841a6
that would mean we noticed the changed limits by pure chance.  The fact
4841a6
that we refresh the parent's limits has nothing to do with it, so the
4841a6
reason for the change probably happened before this point in time, and
4841a6
we should have refreshed the limits then.
4841a6
4841a6
Consequently, we should actually propagate block limits changes upwards,
4841a6
not downwards.  That is a separate and pre-existing issue, though, and
4841a6
so will not be addressed in this patch.
4841a6
4841a6
The problem with recursing is that bdrv_refresh_limits() is not atomic.
4841a6
It begins with zeroing BDS.bl, and only then sets proper, valid limits.
4841a6
If we do not drain all nodes whose limits are refreshed, then concurrent
4841a6
I/O requests can encounter invalid request_alignment values and crash
4841a6
qemu.  Therefore, a recursing bdrv_refresh_limits() requires the whole
4841a6
subtree to be drained, which is currently not ensured by most callers.
4841a6
4841a6
A non-recursive bdrv_refresh_limits() only requires the node in question
4841a6
to not receive I/O requests, and this is done by most callers in some
4841a6
way or another:
4841a6
- bdrv_open_driver() deals with a new node with no parents yet
4841a6
- bdrv_set_file_or_backing_noperm() acts on a drained node
4841a6
- bdrv_reopen_commit() acts only on drained nodes
4841a6
- bdrv_append() should in theory require the node to be drained; in
4841a6
  practice most callers just lock the AioContext, which should at least
4841a6
  be enough to prevent concurrent I/O requests from accessing invalid
4841a6
  limits
4841a6
4841a6
So we can resolve the bug by making bdrv_refresh_limits() non-recursive.
4841a6
4841a6
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1879437
4841a6
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
4841a6
Reviewed-by: Eric Blake <eblake@redhat.com>
4841a6
Message-Id: <20220216105355.30729-2-hreitz@redhat.com>
4841a6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
4841a6
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4841a6
(cherry picked from commit 4d378bbd831bdd2f6e6adcd4ea5b77b6effaa627)
4841a6
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
4841a6
---
4841a6
 block/io.c | 4 ----
4841a6
 1 file changed, 4 deletions(-)
4841a6
4841a6
diff --git a/block/io.c b/block/io.c
4841a6
index 4e4cb556c5..c3e7301613 100644
4841a6
--- a/block/io.c
4841a6
+++ b/block/io.c
4841a6
@@ -189,10 +189,6 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
4841a6
     QLIST_FOREACH(c, &bs->children, next) {
4841a6
         if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
4841a6
         {
4841a6
-            bdrv_refresh_limits(c->bs, tran, errp);
4841a6
-            if (*errp) {
4841a6
-                return;
4841a6
-            }
4841a6
             bdrv_merge_limits(&bs->bl, &c->bs->bl);
4841a6
             have_limits = true;
4841a6
         }
4841a6
-- 
4841a6
2.35.3
4841a6