Pablo Greco e6a3ae
From 1832a90928232cb91a8542613b754079fd1f0f0e Mon Sep 17 00:00:00 2001
Pablo Greco e6a3ae
From: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Date: Tue, 23 Jul 2019 14:45:46 +0100
Pablo Greco e6a3ae
Subject: [PATCH 08/14] nbd/server: Advertise actual minimum block size
Pablo Greco e6a3ae
Pablo Greco e6a3ae
RH-Author: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Message-id: <20190723144546.23701-8-mreitz@redhat.com>
Pablo Greco e6a3ae
Patchwork-id: 89652
Pablo Greco e6a3ae
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH 7/7] nbd/server: Advertise actual minimum block size
Pablo Greco e6a3ae
Bugzilla: 1678979
Pablo Greco e6a3ae
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: John Snow <jsnow@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
From: Eric Blake <eblake@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
Pablo Greco e6a3ae
reply according to bdrv_block_status() boundaries. If the block device
Pablo Greco e6a3ae
has a request_alignment smaller than 512, but we advertise a block
Pablo Greco e6a3ae
alignment of 512 to the client, then this can result in the server
Pablo Greco e6a3ae
reply violating client expectations by reporting a smaller region of
Pablo Greco e6a3ae
the export than what the client is permitted to address (although this
Pablo Greco e6a3ae
is less of an issue for qemu 4.0 clients, given recent client patches
Pablo Greco e6a3ae
to overlook our non-compliance at EOF).  Since it's always better to
Pablo Greco e6a3ae
be strict in what we send, it is worth advertising the actual minimum
Pablo Greco e6a3ae
block limit rather than blindly rounding it up to 512.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Note that this patch is not foolproof - it is still possible to
Pablo Greco e6a3ae
provoke non-compliant server behavior using:
Pablo Greco e6a3ae
Pablo Greco e6a3ae
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
Pablo Greco e6a3ae
Pablo Greco e6a3ae
That is arguably a bug in the blkdebug driver (it should never pass
Pablo Greco e6a3ae
back block status smaller than its alignment, even if it has to make
Pablo Greco e6a3ae
multiple bdrv_get_status calls and determine the
Pablo Greco e6a3ae
least-common-denominator status among the group to return). It may
Pablo Greco e6a3ae
also be possible to observe issues with a backing layer with smaller
Pablo Greco e6a3ae
alignment than the active layer, although so far I have been unable to
Pablo Greco e6a3ae
write a reliable iotest for that scenario (but again, an issue like
Pablo Greco e6a3ae
that could be argued to be a bug in the block layer, or something
Pablo Greco e6a3ae
where we need a flag to bdrv_block_status() to state whether the
Pablo Greco e6a3ae
result must be aligned to the current layer's limits or can be
Pablo Greco e6a3ae
subdivided for accuracy when chasing backing files).
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Anyways, as blkdebug is not normally used, and as this patch makes our
Pablo Greco e6a3ae
server more interoperable with qemu 3.1 clients, it is worth applying
Pablo Greco e6a3ae
now, even while we still work on a larger patch series for the 4.1
Pablo Greco e6a3ae
timeframe to have byte-accurate file lengths.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Note that the iotests output changes - for 223 and 233, we can see the
Pablo Greco e6a3ae
server's better granularity advertisement; and for 241, the three test
Pablo Greco e6a3ae
cases have the following effects:
Pablo Greco e6a3ae
- natural alignment: the server's smaller alignment is now advertised,
Pablo Greco e6a3ae
and the hole reported at EOF is now the right result; we've gotten rid
Pablo Greco e6a3ae
of the server's non-compliance
Pablo Greco e6a3ae
- forced server alignment: the server still advertises 512 bytes, but
Pablo Greco e6a3ae
still sends a mid-sector hole. This is still a server compliance bug,
Pablo Greco e6a3ae
which needs to be fixed in the block layer in a later patch; output
Pablo Greco e6a3ae
does not change because the client is already being tolerant of the
Pablo Greco e6a3ae
non-compliance
Pablo Greco e6a3ae
- forced client alignment: the server's smaller alignment means that
Pablo Greco e6a3ae
the client now sees the server's status change mid-sector without any
Pablo Greco e6a3ae
protocol violations, but the fact that the map shows an unaligned
Pablo Greco e6a3ae
mid-sector hole is evidence of the block layer problems with aligned
Pablo Greco e6a3ae
block status, to be fixed in a later patch
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Signed-off-by: Eric Blake <eblake@redhat.com>
Pablo Greco e6a3ae
Message-Id: <20190329042750.14704-7-eblake@redhat.com>
Pablo Greco e6a3ae
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Pablo Greco e6a3ae
[eblake: rebase to enhanced iotest 241 coverage]
Pablo Greco e6a3ae
(cherry picked from commit b0245d6478ea5906e3d7a542244d5c015fd47bc7)
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Conflicts:
Pablo Greco e6a3ae
- tests/qemu-iotests/223.out: We are missing
Pablo Greco e6a3ae
  ddd09448fd833d646952c769ae9ce3d39bee989f downstream, which adds
Pablo Greco e6a3ae
  qemu-nbd --list tests to 223.  (qemu-nbd --list does not exist
Pablo Greco e6a3ae
  downstream.)
Pablo Greco e6a3ae
Pablo Greco e6a3ae
- tests/qemu-iotests/233.out: Does not exist downstream.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
- tests/qemu-iotests/241.out: Does not exist downstream, because it
Pablo Greco e6a3ae
  would require qemu-nbd --list.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
Pablo Greco e6a3ae
---
Pablo Greco e6a3ae
 nbd/server.c | 13 ++++++++-----
Pablo Greco e6a3ae
 1 file changed, 8 insertions(+), 5 deletions(-)
Pablo Greco e6a3ae
Pablo Greco e6a3ae
diff --git a/nbd/server.c b/nbd/server.c
Pablo Greco e6a3ae
index e094300..96b6631 100644
Pablo Greco e6a3ae
--- a/nbd/server.c
Pablo Greco e6a3ae
+++ b/nbd/server.c
Pablo Greco e6a3ae
@@ -608,13 +608,16 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
Pablo Greco e6a3ae
     /* Send NBD_INFO_BLOCK_SIZE always, but tweak the minimum size
Pablo Greco e6a3ae
      * according to whether the client requested it, and according to
Pablo Greco e6a3ae
      * whether this is OPT_INFO or OPT_GO. */
Pablo Greco e6a3ae
-    /* minimum - 1 for back-compat, or 512 if client is new enough.
Pablo Greco e6a3ae
-     * TODO: consult blk_bs(blk)->bl.request_alignment? */
Pablo Greco e6a3ae
-    sizes[0] =
Pablo Greco e6a3ae
-            (client->opt == NBD_OPT_INFO || blocksize) ? BDRV_SECTOR_SIZE : 1;
Pablo Greco e6a3ae
+    /* minimum - 1 for back-compat, or actual if client will obey it. */
Pablo Greco e6a3ae
+    if (client->opt == NBD_OPT_INFO || blocksize) {
Pablo Greco e6a3ae
+        sizes[0] = blk_get_request_alignment(exp->blk);
Pablo Greco e6a3ae
+    } else {
Pablo Greco e6a3ae
+        sizes[0] = 1;
Pablo Greco e6a3ae
+    }
Pablo Greco e6a3ae
+    assert(sizes[0] <= NBD_MAX_BUFFER_SIZE);
Pablo Greco e6a3ae
     /* preferred - Hard-code to 4096 for now.
Pablo Greco e6a3ae
      * TODO: is blk_bs(blk)->bl.opt_transfer appropriate? */
Pablo Greco e6a3ae
-    sizes[1] = 4096;
Pablo Greco e6a3ae
+    sizes[1] = MAX(4096, sizes[0]);
Pablo Greco e6a3ae
     /* maximum - At most 32M, but smaller as appropriate. */
Pablo Greco e6a3ae
     sizes[2] = MIN(blk_get_max_transfer(exp->blk), NBD_MAX_BUFFER_SIZE);
Pablo Greco e6a3ae
     trace_nbd_negotiate_handle_info_block_size(sizes[0], sizes[1], sizes[2]);
Pablo Greco e6a3ae
-- 
Pablo Greco e6a3ae
1.8.3.1
Pablo Greco e6a3ae