d2787b
From be3448ed5d9d59752cff4df8325ee67eb7d41531 Mon Sep 17 00:00:00 2001
d2787b
From: Xavi Hernandez <xhernandez@redhat.com>
d2787b
Date: Mon, 19 Jul 2021 06:56:18 +0200
d2787b
Subject: [PATCH 592/610] md-cache: fix integer signedness mismatch
d2787b
d2787b
md-cache uses a mechanism based on a generation number to detect
d2787b
modifications made by other clients to the entries and invalidate
d2787b
the cached data.
d2787b
d2787b
This generation number is a 32 bit integer. When it overflows,
d2787b
special management is done to avoid problems. This overflow condition
d2787b
is tracked with a single bit.
d2787b
d2787b
For many fops, when they are received, the overflow bit and the
d2787b
current generation number are recorded in a single 64-bit value
d2787b
which is used later in the cbk.
d2787b
d2787b
This is the problematic function:
d2787b
d2787b
    uint64_t
d2787b
    __mdc_get_generation(xlator_t *this, struct md_cache *mdc)
d2787b
    {
d2787b
        uint64_t gen = 0, rollover;
d2787b
        struct mdc_conf *conf = NULL;
d2787b
d2787b
        conf = this->private;
d2787b
d2787b
        gen = GF_ATOMIC_INC(conf->generation);
d2787b
        if (gen == 0) {
d2787b
            gf_log("MDC", GF_LOG_NOTICE, "%p Reset 1", mdc);
d2787b
            mdc->gen_rollover = !mdc->gen_rollover;
d2787b
            gen = GF_ATOMIC_INC(conf->generation);
d2787b
            mdc->ia_time = 0;
d2787b
            mdc->generation = 0;
d2787b
            mdc->invalidation_time = gen - 1;
d2787b
        }
d2787b
d2787b
        rollover = mdc->gen_rollover;
d2787b
        gen |= (rollover << 32);
d2787b
        return gen;
d2787b
    }
d2787b
d2787b
'conf->generation' is declared as an atomic signed 32-bit integer,
d2787b
and 'gen' is an unsigned 64-bit value. When 'gen' is assigned from
d2787b
a signed int, the sign bit is extended to fill the high 32 bits of
d2787b
'gen'. If the counter has overflown the maximum signed positive
d2787b
value, it will become negative (sign bit = 1).
d2787b
d2787b
In this case, when 'rollover' is later combined with 'gen', all the
d2787b
high bits remain at '1'.
d2787b
d2787b
This value is used later in 'mdc_inode_iatt_set_validate' during
d2787b
callback processing. The overflow condition and generation numbers
d2787b
from when the operation was received are recovered this way:
d2787b
d2787b
    rollover = incident_time >> 32;
d2787b
    incident_time = (incident_time & 0xffffffff);
d2787b
d2787b
('incident_time' is the saved value from '__mdc_get_generation').
d2787b
d2787b
So here rollover will be 0xffffffff, when it's expected to be 0
d2787b
or 1 only. When this is compared later with the cached overflow
d2787b
bit, it doesn't match, which prevents updating the cached info.
d2787b
d2787b
This is bad in general, but it's even worse when an entry is not
d2787b
cached and 'rollover' is 0xffffffff the first time. When md-cache
d2787b
doesn't have cached data it assumes it's everything 0. This causes
d2787b
a mismatch, which sends an invalidation request to the kernel, but
d2787b
since the 'rollover' doesn't match, the cached data is not updated.
d2787b
So the next time the cached data is checked, it will also send an
d2787b
invalidation to the kernel, indefinitely.
d2787b
d2787b
This patch fixes two things:
d2787b
d2787b
1. The 'generation' field is made unsigned to avoid sign extension.
d2787b
2. Invalidation requests are only sent if we already had valid cached
d2787b
   data. Otherwise it doesn't make sense to send an invalidation.
d2787b
d2787b
Upstream patch:
d2787b
> Upstream-patch-link: https://github.com/gluster/glusterfs/pull/2619
d2787b
> Fixes: #2617
d2787b
> Change-Id: Ie40e68288cf143e1bc1a40f46da98f51bb2d6864
d2787b
> Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
d2787b
d2787b
BUG: 1904137
d2787b
Change-Id: Ie40e68288cf143e1bc1a40f46da98f51bb2d6864
d2787b
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
d2787b
Reviewed-on: https://code.engineering.redhat.com/gerrit/c/rhs-glusterfs/+/279188
d2787b
Tested-by: RHGS Build Bot <nigelb@redhat.com>
d2787b
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya <sheggodu@redhat.com>
d2787b
---
d2787b
 xlators/performance/md-cache/src/md-cache.c | 4 ++--
d2787b
 1 file changed, 2 insertions(+), 2 deletions(-)
d2787b
d2787b
diff --git a/xlators/performance/md-cache/src/md-cache.c b/xlators/performance/md-cache/src/md-cache.c
d2787b
index bbbee3b..e0256d6 100644
d2787b
--- a/xlators/performance/md-cache/src/md-cache.c
d2787b
+++ b/xlators/performance/md-cache/src/md-cache.c
d2787b
@@ -79,7 +79,7 @@ struct mdc_conf {
d2787b
     gf_boolean_t cache_statfs;
d2787b
     struct mdc_statfs_cache statfs_cache;
d2787b
     char *mdc_xattr_str;
d2787b
-    gf_atomic_int32_t generation;
d2787b
+    gf_atomic_uint32_t generation;
d2787b
 };
d2787b
 
d2787b
 struct mdc_local;
d2787b
@@ -537,7 +537,7 @@ mdc_inode_iatt_set_validate(xlator_t *this, inode_t *inode, struct iatt *prebuf,
d2787b
             (iatt->ia_mtime_nsec != mdc->md_mtime_nsec) ||
d2787b
             (iatt->ia_ctime != mdc->md_ctime) ||
d2787b
             (iatt->ia_ctime_nsec != mdc->md_ctime_nsec)) {
d2787b
-            if (conf->global_invalidation &&
d2787b
+            if (conf->global_invalidation && mdc->valid &&
d2787b
                 (!prebuf || (prebuf->ia_mtime != mdc->md_mtime) ||
d2787b
                  (prebuf->ia_mtime_nsec != mdc->md_mtime_nsec) ||
d2787b
                  (prebuf->ia_ctime != mdc->md_ctime) ||
d2787b
-- 
d2787b
1.8.3.1
d2787b