Pablo Greco e6a3ae
From 29592218d57f1fe49c1254fffd9b0206cfe29ec7 Mon Sep 17 00:00:00 2001
Pablo Greco e6a3ae
From: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Date: Tue, 23 Jul 2019 14:45:40 +0100
Pablo Greco e6a3ae
Subject: [PATCH 02/14] block/file-posix: Unaligned O_DIRECT block-status
Pablo Greco e6a3ae
Pablo Greco e6a3ae
RH-Author: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Message-id: <20190723144546.23701-2-mreitz@redhat.com>
Pablo Greco e6a3ae
Patchwork-id: 89647
Pablo Greco e6a3ae
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH 1/7] block/file-posix: Unaligned O_DIRECT block-status
Pablo Greco e6a3ae
Bugzilla: 1678979
Pablo Greco e6a3ae
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Pablo Greco e6a3ae
RH-Acked-by: John Snow <jsnow@redhat.com>
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Currently, qemu crashes whenever someone queries the block status of an
Pablo Greco e6a3ae
unaligned image tail of an O_DIRECT image:
Pablo Greco e6a3ae
$ echo > foo
Pablo Greco e6a3ae
$ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
Pablo Greco e6a3ae
Offset          Length          Mapped to       File
Pablo Greco e6a3ae
qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
Pablo Greco e6a3ae
QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset'
Pablo Greco e6a3ae
failed.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
This is because bdrv_co_block_status() checks that the result returned
Pablo Greco e6a3ae
by the driver's implementation is aligned to the request_alignment, but
Pablo Greco e6a3ae
file-posix can fail to do so, which is actually mentioned in a comment
Pablo Greco e6a3ae
there: "[...] possibly including a partial sector at EOF".
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Fix this by rounding up those partial sectors.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
There are two possible alternative fixes:
Pablo Greco e6a3ae
(1) We could refuse to open unaligned image files with O_DIRECT
Pablo Greco e6a3ae
    altogether.  That sounds reasonable until you realize that qcow2
Pablo Greco e6a3ae
    does necessarily not fill up its metadata clusters, and that nobody
Pablo Greco e6a3ae
    runs qemu-img create with O_DIRECT.  Therefore, unpreallocated qcow2
Pablo Greco e6a3ae
    files usually have an unaligned image tail.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
(2) bdrv_co_block_status() could ignore unaligned tails.  It actually
Pablo Greco e6a3ae
    throws away everything past the EOF already, so that sounds
Pablo Greco e6a3ae
    reasonable.
Pablo Greco e6a3ae
    Unfortunately, the block layer knows file lengths only with a
Pablo Greco e6a3ae
    granularity of BDRV_SECTOR_SIZE, so bdrv_co_block_status() usually
Pablo Greco e6a3ae
    would have to guess whether its file length information is inexact
Pablo Greco e6a3ae
    or whether the driver is broken.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Fixing what raw_co_block_status() returns is the safest thing to do.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
There seems to be no other block driver that sets request_alignment and
Pablo Greco e6a3ae
does not make sure that it always returns aligned values.
Pablo Greco e6a3ae
Pablo Greco e6a3ae
Cc: qemu-stable@nongnu.org
Pablo Greco e6a3ae
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Reviewed-by: Eric Blake <eblake@redhat.com>
Pablo Greco e6a3ae
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pablo Greco e6a3ae
(cherry picked from commit 9c3db310ff0b7473272ae8dce5e04e2f8a825390)
Pablo Greco e6a3ae
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pablo Greco e6a3ae
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
Pablo Greco e6a3ae
---
Pablo Greco e6a3ae
 block/file-posix.c | 16 ++++++++++++++++
Pablo Greco e6a3ae
 1 file changed, 16 insertions(+)
Pablo Greco e6a3ae
Pablo Greco e6a3ae
diff --git a/block/file-posix.c b/block/file-posix.c
Pablo Greco e6a3ae
index 5fb5a9a..4b404e4 100644
Pablo Greco e6a3ae
--- a/block/file-posix.c
Pablo Greco e6a3ae
+++ b/block/file-posix.c
Pablo Greco e6a3ae
@@ -2413,6 +2413,8 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
Pablo Greco e6a3ae
     off_t data = 0, hole = 0;
Pablo Greco e6a3ae
     int ret;
Pablo Greco e6a3ae
 
Pablo Greco e6a3ae
+    assert(QEMU_IS_ALIGNED(offset | bytes, bs->bl.request_alignment));
Pablo Greco e6a3ae
+
Pablo Greco e6a3ae
     ret = fd_open(bs);
Pablo Greco e6a3ae
     if (ret < 0) {
Pablo Greco e6a3ae
         return ret;
Pablo Greco e6a3ae
@@ -2438,6 +2440,20 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
Pablo Greco e6a3ae
         /* On a data extent, compute bytes to the end of the extent,
Pablo Greco e6a3ae
          * possibly including a partial sector at EOF. */
Pablo Greco e6a3ae
         *pnum = MIN(bytes, hole - offset);
Pablo Greco e6a3ae
+
Pablo Greco e6a3ae
+        /*
Pablo Greco e6a3ae
+         * We are not allowed to return partial sectors, though, so
Pablo Greco e6a3ae
+         * round up if necessary.
Pablo Greco e6a3ae
+         */
Pablo Greco e6a3ae
+        if (!QEMU_IS_ALIGNED(*pnum, bs->bl.request_alignment)) {
Pablo Greco e6a3ae
+            int64_t file_length = raw_getlength(bs);
Pablo Greco e6a3ae
+            if (file_length > 0) {
Pablo Greco e6a3ae
+                /* Ignore errors, this is just a safeguard */
Pablo Greco e6a3ae
+                assert(hole == file_length);
Pablo Greco e6a3ae
+            }
Pablo Greco e6a3ae
+            *pnum = ROUND_UP(*pnum, bs->bl.request_alignment);
Pablo Greco e6a3ae
+        }
Pablo Greco e6a3ae
+
Pablo Greco e6a3ae
         ret = BDRV_BLOCK_DATA;
Pablo Greco e6a3ae
     } else {
Pablo Greco e6a3ae
         /* On a hole, compute bytes to the beginning of the next extent.  */
Pablo Greco e6a3ae
-- 
Pablo Greco e6a3ae
1.8.3.1
Pablo Greco e6a3ae