Blame SOURCES/kvm-nbd-client-Lower-min_block-for-block-status-unaligne.patch

Pablo Greco e6a3ae
From 569674a3b855f516a8bec22ca365fc7614639ce6 Mon Sep 17 00:00:00 2001
Pablo Greco e6a3ae
From: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Date: Tue, 23 Jul 2019 14:45:42 +0100
Pablo Greco e6a3ae
Subject: [PATCH 04/14] nbd/client: Lower min_block for block-status, unaligned
Pablo Greco e6a3ae
 size
Pablo Greco e6a3ae
Pablo Greco e6a3ae
RH-Author: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Message-id: <20190723144546.23701-4-mreitz@redhat.com>
Pablo Greco e6a3ae
Patchwork-id: 89650
Pablo Greco e6a3ae
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH 3/7] nbd/client: Lower min_block for block-status, unaligned size
Pablo Greco e6a3ae
Bugzilla: 1678979
Pablo Greco e6a3ae
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: John Snow <jsnow@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
From: Eric Blake <eblake@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
We have a latent bug in our NBD client code, tickled by the brand new
Pablo Greco e6a3ae
nbdkit 1.11.10 block status support:
Pablo Greco e6a3ae
Pablo Greco e6a3ae
$ nbdkit --filter=log --filter=truncate -U - \
Pablo Greco e6a3ae
           data data="1" size=511 truncate=64K logfile=/dev/stdout \
Pablo Greco e6a3ae
           --run 'qemu-img convert $nbd /var/tmp/out'
Pablo Greco e6a3ae
...
Pablo Greco e6a3ae
qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
The culprit? Our implementation of .bdrv_co_block_status can return
Pablo Greco e6a3ae
unaligned block status for any server that operates with a lower
Pablo Greco e6a3ae
actual alignment than what we tell the block layer in
Pablo Greco e6a3ae
request_alignment, in violation of the block layer's constraints. To
Pablo Greco e6a3ae
date, we've been unable to trip the bug, because qemu as NBD server
Pablo Greco e6a3ae
always advertises block sizing (at which point it is a server bug if
Pablo Greco e6a3ae
the server sends unaligned status - although qemu 3.1 is such a server
Pablo Greco e6a3ae
and I've sent separate patches for 4.0 both to get the server to obey
Pablo Greco e6a3ae
the spec, and to let the client to tolerate server oddities at EOF).
Pablo Greco e6a3ae
Pablo Greco e6a3ae
But nbdkit does not (yet) advertise block sizing, and therefore is not
Pablo Greco e6a3ae
in violation of the spec for returning block status at whatever
Pablo Greco e6a3ae
boundaries it wants, and those unaligned results can occur anywhere
Pablo Greco e6a3ae
rather than just at EOF. While we are still wise to avoid sending
Pablo Greco e6a3ae
sub-sector read/write requests to a server of unknown origin, we MUST
Pablo Greco e6a3ae
consider that a server telling us block status without an advertised
Pablo Greco e6a3ae
block size is correct.  So, we either have to munge unaligned answers
Pablo Greco e6a3ae
from the server into aligned ones that we hand back to the block
Pablo Greco e6a3ae
layer, or we have to tell the block layer about a smaller alignment.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Similarly, if the server advertises an image size that is not
Pablo Greco e6a3ae
sector-aligned, we might as well assume that the server intends to let
Pablo Greco e6a3ae
us access those tail bytes, and therefore supports a minimum block
Pablo Greco e6a3ae
size of 1, regardless of whether the server supports block status
Pablo Greco e6a3ae
(although we still need more patches to fix the problem that with an
Pablo Greco e6a3ae
unaligned image, we can send read or block status requests that exceed
Pablo Greco e6a3ae
EOF to the server). Again, qemu as server cannot trip this problem
Pablo Greco e6a3ae
(because it rounds images to sector alignment), but nbdkit advertised
Pablo Greco e6a3ae
unaligned size even before it gained block status support.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Solve both alignment problems at once by using better heuristics on
Pablo Greco e6a3ae
what alignment to report to the block layer when the server did not
Pablo Greco e6a3ae
give us something to work with. Note that very few NBD servers
Pablo Greco e6a3ae
implement block status (to date, only qemu and nbdkit are known to do
Pablo Greco e6a3ae
so); and as the NBD spec mentioned block sizing constraints prior to
Pablo Greco e6a3ae
documenting block status, it can be assumed that any future
Pablo Greco e6a3ae
implementations of block status are aware that they must advertise
Pablo Greco e6a3ae
block size if they want a minimum size other than 1.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
We've had a long history of struggles with picking the right alignment
Pablo Greco e6a3ae
to use in the block layer, as evidenced by the commit message of
Pablo Greco e6a3ae
fd8d372d (v2.12) that introduced the current choice of forced 512-byte
Pablo Greco e6a3ae
alignment.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
There is no iotest coverage for this fix, because qemu can't provoke
Pablo Greco e6a3ae
it, and I didn't want to make test 241 dependent on nbdkit.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Fixes: fd8d372d
Pablo Greco e6a3ae
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Pablo Greco e6a3ae
Signed-off-by: Eric Blake <eblake@redhat.com>
Pablo Greco e6a3ae
Message-Id: <20190329042750.14704-3-eblake@redhat.com>
Pablo Greco e6a3ae
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Pablo Greco e6a3ae
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Pablo Greco e6a3ae
(cherry picked from commit 7da537f70d929800ba9c657b8a47a7b827695ccc)
Pablo Greco e6a3ae
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
Pablo Greco e6a3ae
---
Pablo Greco e6a3ae
 block/nbd.c | 19 ++++++++++++++++++-
Pablo Greco e6a3ae
 1 file changed, 18 insertions(+), 1 deletion(-)
Pablo Greco e6a3ae
Pablo Greco e6a3ae
diff --git a/block/nbd.c b/block/nbd.c
Pablo Greco e6a3ae
index f29c10f..3d642cd 100644
Pablo Greco e6a3ae
--- a/block/nbd.c
Pablo Greco e6a3ae
+++ b/block/nbd.c
Pablo Greco e6a3ae
@@ -473,7 +473,24 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
Pablo Greco e6a3ae
     uint32_t min = s->info.min_block;
Pablo Greco e6a3ae
     uint32_t max = MIN_NON_ZERO(NBD_MAX_BUFFER_SIZE, s->info.max_block);
Pablo Greco e6a3ae
 
Pablo Greco e6a3ae
-    bs->bl.request_alignment = min ? min : BDRV_SECTOR_SIZE;
Pablo Greco e6a3ae
+    /*
Pablo Greco e6a3ae
+     * If the server did not advertise an alignment:
Pablo Greco e6a3ae
+     * - a size that is not sector-aligned implies that an alignment
Pablo Greco e6a3ae
+     *   of 1 can be used to access those tail bytes
Pablo Greco e6a3ae
+     * - advertisement of block status requires an alignment of 1, so
Pablo Greco e6a3ae
+     *   that we don't violate block layer constraints that block
Pablo Greco e6a3ae
+     *   status is always aligned (as we can't control whether the
Pablo Greco e6a3ae
+     *   server will report sub-sector extents, such as a hole at EOF
Pablo Greco e6a3ae
+     *   on an unaligned POSIX file)
Pablo Greco e6a3ae
+     * - otherwise, assume the server is so old that we are safer avoiding
Pablo Greco e6a3ae
+     *   sub-sector requests
Pablo Greco e6a3ae
+     */
Pablo Greco e6a3ae
+    if (!min) {
Pablo Greco e6a3ae
+        min = (!QEMU_IS_ALIGNED(s->info.size, BDRV_SECTOR_SIZE) ||
Pablo Greco e6a3ae
+               s->info.base_allocation) ? 1 : BDRV_SECTOR_SIZE;
Pablo Greco e6a3ae
+    }
Pablo Greco e6a3ae
+
Pablo Greco e6a3ae
+    bs->bl.request_alignment = min;
Pablo Greco e6a3ae
     bs->bl.max_pdiscard = max;
Pablo Greco e6a3ae
     bs->bl.max_pwrite_zeroes = max;
Pablo Greco e6a3ae
     bs->bl.max_transfer = max;
Pablo Greco e6a3ae
-- 
Pablo Greco e6a3ae
1.8.3.1
Pablo Greco e6a3ae