|
|
383d26 |
From 7122c1ace9649d525da8670d5e57aaa8b7c6a686 Mon Sep 17 00:00:00 2001
|
|
|
383d26 |
From: John Snow <jsnow@redhat.com>
|
|
|
383d26 |
Date: Mon, 6 May 2019 17:56:18 +0200
|
|
|
383d26 |
Subject: [PATCH 08/53] nbd/client: Lower min_block for block-status, unaligned
|
|
|
383d26 |
size
|
|
|
383d26 |
|
|
|
383d26 |
RH-Author: John Snow <jsnow@redhat.com>
|
|
|
383d26 |
Message-id: <20190506175629.11079-9-jsnow@redhat.com>
|
|
|
383d26 |
Patchwork-id: 87187
|
|
|
383d26 |
O-Subject: [RHEL-7.7 qemu-kvm-rhev PATCH 08/19] nbd/client: Lower min_block for block-status, unaligned size
|
|
|
383d26 |
Bugzilla: 1692018
|
|
|
383d26 |
RH-Acked-by: Max Reitz <mreitz@redhat.com>
|
|
|
383d26 |
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
|
|
|
383d26 |
RH-Acked-by: Thomas Huth <thuth@redhat.com>
|
|
|
383d26 |
|
|
|
383d26 |
From: Eric Blake <eblake@redhat.com>
|
|
|
383d26 |
|
|
|
383d26 |
We have a latent bug in our NBD client code, tickled by the brand new
|
|
|
383d26 |
nbdkit 1.11.10 block status support:
|
|
|
383d26 |
|
|
|
383d26 |
$ nbdkit --filter=log --filter=truncate -U - \
|
|
|
383d26 |
data data="1" size=511 truncate=64K logfile=/dev/stdout \
|
|
|
383d26 |
--run 'qemu-img convert $nbd /var/tmp/out'
|
|
|
383d26 |
...
|
|
|
383d26 |
qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
|
|
|
383d26 |
|
|
|
383d26 |
The culprit? Our implementation of .bdrv_co_block_status can return
|
|
|
383d26 |
unaligned block status for any server that operates with a lower
|
|
|
383d26 |
actual alignment than what we tell the block layer in
|
|
|
383d26 |
request_alignment, in violation of the block layer's constraints. To
|
|
|
383d26 |
date, we've been unable to trip the bug, because qemu as NBD server
|
|
|
383d26 |
always advertises block sizing (at which point it is a server bug if
|
|
|
383d26 |
the server sends unaligned status - although qemu 3.1 is such a server
|
|
|
383d26 |
and I've sent separate patches for 4.0 both to get the server to obey
|
|
|
383d26 |
the spec, and to let the client to tolerate server oddities at EOF).
|
|
|
383d26 |
|
|
|
383d26 |
But nbdkit does not (yet) advertise block sizing, and therefore is not
|
|
|
383d26 |
in violation of the spec for returning block status at whatever
|
|
|
383d26 |
boundaries it wants, and those unaligned results can occur anywhere
|
|
|
383d26 |
rather than just at EOF. While we are still wise to avoid sending
|
|
|
383d26 |
sub-sector read/write requests to a server of unknown origin, we MUST
|
|
|
383d26 |
consider that a server telling us block status without an advertised
|
|
|
383d26 |
block size is correct. So, we either have to munge unaligned answers
|
|
|
383d26 |
from the server into aligned ones that we hand back to the block
|
|
|
383d26 |
layer, or we have to tell the block layer about a smaller alignment.
|
|
|
383d26 |
|
|
|
383d26 |
Similarly, if the server advertises an image size that is not
|
|
|
383d26 |
sector-aligned, we might as well assume that the server intends to let
|
|
|
383d26 |
us access those tail bytes, and therefore supports a minimum block
|
|
|
383d26 |
size of 1, regardless of whether the server supports block status
|
|
|
383d26 |
(although we still need more patches to fix the problem that with an
|
|
|
383d26 |
unaligned image, we can send read or block status requests that exceed
|
|
|
383d26 |
EOF to the server). Again, qemu as server cannot trip this problem
|
|
|
383d26 |
(because it rounds images to sector alignment), but nbdkit advertised
|
|
|
383d26 |
unaligned size even before it gained block status support.
|
|
|
383d26 |
|
|
|
383d26 |
Solve both alignment problems at once by using better heuristics on
|
|
|
383d26 |
what alignment to report to the block layer when the server did not
|
|
|
383d26 |
give us something to work with. Note that very few NBD servers
|
|
|
383d26 |
implement block status (to date, only qemu and nbdkit are known to do
|
|
|
383d26 |
so); and as the NBD spec mentioned block sizing constraints prior to
|
|
|
383d26 |
documenting block status, it can be assumed that any future
|
|
|
383d26 |
implementations of block status are aware that they must advertise
|
|
|
383d26 |
block size if they want a minimum size other than 1.
|
|
|
383d26 |
|
|
|
383d26 |
We've had a long history of struggles with picking the right alignment
|
|
|
383d26 |
to use in the block layer, as evidenced by the commit message of
|
|
|
383d26 |
fd8d372d (v2.12) that introduced the current choice of forced 512-byte
|
|
|
383d26 |
alignment.
|
|
|
383d26 |
|
|
|
383d26 |
There is no iotest coverage for this fix, because qemu can't provoke
|
|
|
383d26 |
it, and I didn't want to make test 241 dependent on nbdkit.
|
|
|
383d26 |
|
|
|
383d26 |
Fixes: fd8d372d
|
|
|
383d26 |
Reported-by: Richard W.M. Jones <rjones@redhat.com>
|
|
|
383d26 |
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
|
383d26 |
Message-Id: <20190329042750.14704-3-eblake@redhat.com>
|
|
|
383d26 |
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|
|
|
383d26 |
Tested-by: Richard W.M. Jones <rjones@redhat.com>
|
|
|
383d26 |
(cherry picked from commit 7da537f70d929800ba9c657b8a47a7b827695ccc)
|
|
|
383d26 |
Signed-off-by: John Snow <jsnow@redhat.com>
|
|
|
383d26 |
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
|
|
|
383d26 |
---
|
|
|
383d26 |
block/nbd.c | 19 ++++++++++++++++++-
|
|
|
383d26 |
1 file changed, 18 insertions(+), 1 deletion(-)
|
|
|
383d26 |
|
|
|
383d26 |
diff --git a/block/nbd.c b/block/nbd.c
|
|
|
383d26 |
index 838a8fe..670b9bd 100644
|
|
|
383d26 |
--- a/block/nbd.c
|
|
|
383d26 |
+++ b/block/nbd.c
|
|
|
383d26 |
@@ -437,7 +437,24 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
|
|
|
383d26 |
uint32_t min = s->info.min_block;
|
|
|
383d26 |
uint32_t max = MIN_NON_ZERO(NBD_MAX_BUFFER_SIZE, s->info.max_block);
|
|
|
383d26 |
|
|
|
383d26 |
- bs->bl.request_alignment = min ? min : BDRV_SECTOR_SIZE;
|
|
|
383d26 |
+ /*
|
|
|
383d26 |
+ * If the server did not advertise an alignment:
|
|
|
383d26 |
+ * - a size that is not sector-aligned implies that an alignment
|
|
|
383d26 |
+ * of 1 can be used to access those tail bytes
|
|
|
383d26 |
+ * - advertisement of block status requires an alignment of 1, so
|
|
|
383d26 |
+ * that we don't violate block layer constraints that block
|
|
|
383d26 |
+ * status is always aligned (as we can't control whether the
|
|
|
383d26 |
+ * server will report sub-sector extents, such as a hole at EOF
|
|
|
383d26 |
+ * on an unaligned POSIX file)
|
|
|
383d26 |
+ * - otherwise, assume the server is so old that we are safer avoiding
|
|
|
383d26 |
+ * sub-sector requests
|
|
|
383d26 |
+ */
|
|
|
383d26 |
+ if (!min) {
|
|
|
383d26 |
+ min = (!QEMU_IS_ALIGNED(s->info.size, BDRV_SECTOR_SIZE) ||
|
|
|
383d26 |
+ s->info.base_allocation) ? 1 : BDRV_SECTOR_SIZE;
|
|
|
383d26 |
+ }
|
|
|
383d26 |
+
|
|
|
383d26 |
+ bs->bl.request_alignment = min;
|
|
|
383d26 |
bs->bl.max_pdiscard = max;
|
|
|
383d26 |
bs->bl.max_pwrite_zeroes = max;
|
|
|
383d26 |
bs->bl.max_transfer = max;
|
|
|
383d26 |
--
|
|
|
383d26 |
1.8.3.1
|
|
|
383d26 |
|