Blame SOURCES/kvm-qcow2-Give-the-refcount-cache-the-minimum-possible-s.patch

From f9faa15ed2a819c8fcf1eaf3534d7162f9cb8290 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 6 Dec 2018 17:12:26 +0000
Subject: [PATCH 01/15] qcow2: Give the refcount cache the minimum possible
 size by default

RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20181206171240.5674-2-kwolf@redhat.com>
Patchwork-id: 83284
O-Subject: [RHEL-8.0 qemu-kvm PATCH 01/15] qcow2: Give the refcount cache the minimum possible size by default
Bugzilla: 1656507
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: John Snow <jsnow@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

From: Alberto Garcia <berto@igalia.com>

The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size parameters (an
additional parameter named cache-size sets the combined size of both
caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.
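
For illustration, the ratio-based default described above can be
sketched as follows. This is a minimal sketch, not the QEMU source:
the function name split_by_ratio and the constant L2_REFCOUNT_RATIO
are hypothetical, and only the arithmetic mirrors the case where just
a combined cache size is given.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 4:1 default ratio (assumption). */
#define L2_REFCOUNT_RATIO 4

/* Split a combined cache budget (in bytes) so that the L2 cache ends
 * up L2_REFCOUNT_RATIO times larger than the refcount cache. */
static void split_by_ratio(uint64_t combined,
                           uint64_t *l2, uint64_t *refcount)
{
    *refcount = combined / (L2_REFCOUNT_RATIO + 1);
    *l2 = combined - *refcount;
}
```

With a combined budget of 5 MiB this gives 1 MiB to the refcount
cache and 4 MiB to the L2 cache, regardless of the image geometry.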

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
    cluster size, but in the case of a refcount block it depends on
    the cluster size *and* the width of each refcount entry.
    The 4:1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
    guest space (L2 tables map guest clusters to host clusters),
    whereas refcount blocks are used for host clusters (including
    L1/L2 tables and the refcount blocks themselves). On a fully
    populated (and uncompressed) qcow2 file, image size > virtual
    size, so there are more refcount entries than L2 entries.
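
Point (a) can be checked with a little arithmetic. The sketch below
is illustrative only: the helper names are hypothetical, and it
assumes that an L2 table and a refcount block each occupy exactly one
cluster and that L2 entries are 8 bytes wide.

```c
#include <stdint.h>

/* Bytes of guest space mapped by one L2 table
 * (one cluster holding 8-byte entries, one entry per cluster). */
static uint64_t l2_table_coverage(uint64_t cluster_size)
{
    return (cluster_size / 8) * cluster_size;
}

/* Bytes of host space tracked by one refcount block
 * (one cluster holding entries that are refcount_bits wide). */
static uint64_t refcount_block_coverage(uint64_t cluster_size,
                                        unsigned refcount_bits)
{
    return (cluster_size * 8 / refcount_bits) * cluster_size;
}
```

With 64 KiB clusters, a refcount block covers 4 times as much as an
L2 table for 16-bit entries, the same amount for 64-bit entries, and
64 times as much for 1-bit entries, so a fixed ratio cannot be right
for every image.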

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
slightly increasing the refcount cache size to account for the
clusters used for qcow2 metadata.

However, this patch takes a completely different approach: instead
of keeping a ratio between both cache sizes, it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.
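
The new split can be sketched in isolation as follows. This is a
simplified standalone rendering of the logic in the diff, not the
patched function itself: the name split_l2_first and the
min_refcount_clusters parameter are hypothetical (the patch uses the
MIN_REFCOUNT_CACHE_SIZE constant instead).

```c
#include <stdint.h>

/* Give the L2 cache as much of the combined budget as it can use and
 * leave the remainder (at least the minimum, if available) to the
 * refcount cache. All sizes are in bytes. */
static void split_l2_first(uint64_t combined, uint64_t virtual_disk_size,
                           uint64_t cluster_size,
                           uint64_t min_refcount_clusters,
                           uint64_t *l2, uint64_t *refcount)
{
    /* L2 cache size that maps the whole virtual disk (8-byte entries) */
    uint64_t max_l2 = virtual_disk_size / (cluster_size / 8);
    uint64_t min_refcount = min_refcount_clusters * cluster_size;

    if (combined >= max_l2 + min_refcount) {
        *l2 = max_l2;
        *refcount = combined - *l2;
    } else {
        *refcount = combined < min_refcount ? combined : min_refcount;
        *l2 = combined - *refcount;
    }
}
```

For an 8 GiB disk with 64 KiB clusters, 1 MiB of L2 cache maps the
whole disk. Assuming a minimum of 4 refcount clusters, a 2 MiB budget
is split 1 MiB / 1 MiB, while a 1 MiB budget gives 768 KiB to the L2
cache and 256 KiB to the refcount cache.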

The reason is that L2 tables are used for every single I/O request
from the guest, and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks, however, are only used for
cluster allocation and internal snapshots, and in practice are
accessed sequentially in most cases, so the effect of increasing the
cache is negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 52253998ec3e523c9e20ae81e2a6431d8ff733ba)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
 block/qcow2.c              | 31 +++++++++++++++++++------------
 block/qcow2.h              |  4 ----
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 36d1152..4b65e4c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -809,23 +809,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         } else if (refcount_cache_size_set) {
             *l2_cache_size = combined_cache_size - *refcount_cache_size;
         } else {
-            *refcount_cache_size = combined_cache_size
-                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-            *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+            uint64_t min_refcount_cache =
+                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+            /* Assign as much memory as possible to the L2 cache, and
+             * use the remainder for the refcount cache */
+            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+                *l2_cache_size = max_l2_cache;
+                *refcount_cache_size = combined_cache_size - *l2_cache_size;
+            } else {
+                *refcount_cache_size =
+                    MIN(combined_cache_size, min_refcount_cache);
+                *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            }
         }
     } else {
-        if (!l2_cache_size_set && !refcount_cache_size_set) {
+        if (!l2_cache_size_set) {
             *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
                                  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
                                  * s->cluster_size);
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!l2_cache_size_set) {
-            *l2_cache_size = *refcount_cache_size
-                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!refcount_cache_size_set) {
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+        }
+        if (!refcount_cache_size_set) {
+            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
         }
     }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 43163b2..3d92cdb 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,10 +77,6 @@
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */
-#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
-
 #define DEFAULT_CLUSTER_SIZE 65536
 
 
diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
index e28e1ea..96724a6 100644
--- a/tests/qemu-iotests/137.out
+++ b/tests/qemu-iotests/137.out
@@ -22,7 +22,7 @@ refcount-cache-size may not exceed cache-size
 L2 cache size too big
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
-L2 cache size too big
+Refcount cache size too big
 Conflicting values for qcow2 options 'overlap-check' ('constant') and 'overlap-check.template' ('all')
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
-- 
1.8.3.1