cryptospore / rpms / qemu-kvm

Forked from rpms/qemu-kvm 2 years ago
Clone

Blame SOURCES/kvm-nbd-server-Advertise-actual-minimum-block-size.patch

4ec855
From 1832a90928232cb91a8542613b754079fd1f0f0e Mon Sep 17 00:00:00 2001
4ec855
From: Max Reitz <mreitz@redhat.com>
4ec855
Date: Tue, 23 Jul 2019 14:45:46 +0100
4ec855
Subject: [PATCH 08/14] nbd/server: Advertise actual minimum block size
4ec855
4ec855
RH-Author: Max Reitz <mreitz@redhat.com>
4ec855
Message-id: <20190723144546.23701-8-mreitz@redhat.com>
4ec855
Patchwork-id: 89652
4ec855
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH 7/7] nbd/server: Advertise actual minimum block size
4ec855
Bugzilla: 1678979
4ec855
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
4ec855
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
4ec855
RH-Acked-by: John Snow <jsnow@redhat.com>
4ec855
4ec855
From: Eric Blake <eblake@redhat.com>
4ec855
4ec855
Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
4ec855
reply according to bdrv_block_status() boundaries. If the block device
4ec855
has a request_alignment smaller than 512, but we advertise a block
4ec855
alignment of 512 to the client, then this can result in the server
4ec855
reply violating client expectations by reporting a smaller region of
4ec855
the export than what the client is permitted to address (although this
4ec855
is less of an issue for qemu 4.0 clients, given recent client patches
4ec855
to overlook our non-compliance at EOF).  Since it's always better to
4ec855
be strict in what we send, it is worth advertising the actual minimum
4ec855
block limit rather than blindly rounding it up to 512.
4ec855
4ec855
Note that this patch is not foolproof - it is still possible to
4ec855
provoke non-compliant server behavior using:
4ec855
4ec855
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
4ec855
4ec855
That is arguably a bug in the blkdebug driver (it should never pass
4ec855
back block status smaller than its alignment, even if it has to make
4ec855
multiple bdrv_get_status calls and determine the
4ec855
least-common-denominator status among the group to return). It may
4ec855
also be possible to observe issues with a backing layer with smaller
4ec855
alignment than the active layer, although so far I have been unable to
4ec855
write a reliable iotest for that scenario (but again, an issue like
4ec855
that could be argued to be a bug in the block layer, or something
4ec855
where we need a flag to bdrv_block_status() to state whether the
4ec855
result must be aligned to the current layer's limits or can be
4ec855
subdivided for accuracy when chasing backing files).
4ec855
4ec855
Anyways, as blkdebug is not normally used, and as this patch makes our
4ec855
server more interoperable with qemu 3.1 clients, it is worth applying
4ec855
now, even while we still work on a larger patch series for the 4.1
4ec855
timeframe to have byte-accurate file lengths.
4ec855
4ec855
Note that the iotests output changes - for 223 and 233, we can see the
4ec855
server's better granularity advertisement; and for 241, the three test
4ec855
cases have the following effects:
4ec855
- natural alignment: the server's smaller alignment is now advertised,
4ec855
and the hole reported at EOF is now the right result; we've gotten rid
4ec855
of the server's non-compliance
4ec855
- forced server alignment: the server still advertises 512 bytes, but
4ec855
still sends a mid-sector hole. This is still a server compliance bug,
4ec855
which needs to be fixed in the block layer in a later patch; output
4ec855
does not change because the client is already being tolerant of the
4ec855
non-compliance
4ec855
- forced client alignment: the server's smaller alignment means that
4ec855
the client now sees the server's status change mid-sector without any
4ec855
protocol violations, but the fact that the map shows an unaligned
4ec855
mid-sector hole is evidence of the block layer problems with aligned
4ec855
block status, to be fixed in a later patch
4ec855
4ec855
Signed-off-by: Eric Blake <eblake@redhat.com>
4ec855
Message-Id: <20190329042750.14704-7-eblake@redhat.com>
4ec855
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
4ec855
[eblake: rebase to enhanced iotest 241 coverage]
4ec855
(cherry picked from commit b0245d6478ea5906e3d7a542244d5c015fd47bc7)
4ec855
4ec855
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
4ec855
4ec855
Conflicts:
4ec855
- tests/qemu-iotests/223.out: We are missing
4ec855
  ddd09448fd833d646952c769ae9ce3d39bee989f downstream, which adds
4ec855
  qemu-nbd --list tests to 223.  (qemu-nbd --list does not exist
4ec855
  downstream.)
4ec855
4ec855
- tests/qemu-iotests/233.out: Does not exist downstream.
4ec855
4ec855
- tests/qemu-iotests/241.out: Does not exist downstream, because it
4ec855
  would require qemu-nbd --list.
4ec855
4ec855
Signed-off-by: Max Reitz <mreitz@redhat.com>
4ec855
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
4ec855
---
4ec855
 nbd/server.c | 13 ++++++++-----
4ec855
 1 file changed, 8 insertions(+), 5 deletions(-)
4ec855
4ec855
diff --git a/nbd/server.c b/nbd/server.c
4ec855
index e094300..96b6631 100644
4ec855
--- a/nbd/server.c
4ec855
+++ b/nbd/server.c
4ec855
@@ -608,13 +608,16 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
4ec855
     /* Send NBD_INFO_BLOCK_SIZE always, but tweak the minimum size
4ec855
      * according to whether the client requested it, and according to
4ec855
      * whether this is OPT_INFO or OPT_GO. */
4ec855
-    /* minimum - 1 for back-compat, or 512 if client is new enough.
4ec855
-     * TODO: consult blk_bs(blk)->bl.request_alignment? */
4ec855
-    sizes[0] =
4ec855
-            (client->opt == NBD_OPT_INFO || blocksize) ? BDRV_SECTOR_SIZE : 1;
4ec855
+    /* minimum - 1 for back-compat, or actual if client will obey it. */
4ec855
+    if (client->opt == NBD_OPT_INFO || blocksize) {
4ec855
+        sizes[0] = blk_get_request_alignment(exp->blk);
4ec855
+    } else {
4ec855
+        sizes[0] = 1;
4ec855
+    }
4ec855
+    assert(sizes[0] <= NBD_MAX_BUFFER_SIZE);
4ec855
     /* preferred - Hard-code to 4096 for now.
4ec855
      * TODO: is blk_bs(blk)->bl.opt_transfer appropriate? */
4ec855
-    sizes[1] = 4096;
4ec855
+    sizes[1] = MAX(4096, sizes[0]);
4ec855
     /* maximum - At most 32M, but smaller as appropriate. */
4ec855
     sizes[2] = MIN(blk_get_max_transfer(exp->blk), NBD_MAX_BUFFER_SIZE);
4ec855
     trace_nbd_negotiate_handle_info_block_size(sizes[0], sizes[1], sizes[2]);
4ec855
-- 
4ec855
1.8.3.1
4ec855