From be3448ed5d9d59752cff4df8325ee67eb7d41531 Mon Sep 17 00:00:00 2001 From: Xavi Hernandez Date: Mon, 19 Jul 2021 06:56:18 +0200 Subject: [PATCH 592/610] md-cache: fix integer signedness mismatch md-cache uses a mechanism based on a generation number to detect modifications made by other clients to the entries and invalidate the cached data. This generation number is a 32 bit integer. When it overflows, special management is done to avoid problems. This overflow condition is tracked with a single bit. For many fops, when they are received, the overflow bit and the current generation number are recorded in a single 64-bit value which is used later in the cbk. This is the problematic function: uint64_t __mdc_get_generation(xlator_t *this, struct md_cache *mdc) { uint64_t gen = 0, rollover; struct mdc_conf *conf = NULL; conf = this->private; gen = GF_ATOMIC_INC(conf->generation); if (gen == 0) { gf_log("MDC", GF_LOG_NOTICE, "%p Reset 1", mdc); mdc->gen_rollover = !mdc->gen_rollover; gen = GF_ATOMIC_INC(conf->generation); mdc->ia_time = 0; mdc->generation = 0; mdc->invalidation_time = gen - 1; } rollover = mdc->gen_rollover; gen |= (rollover << 32); return gen; } 'conf->generation' is declared as an atomic signed 32-bit integer, and 'gen' is an unsigned 64-bit value. When 'gen' is assigned from a signed int, the sign bit is extended to fill the high 32 bits of 'gen'. If the counter has overflown the maximum signed positive value, it will become negative (sign bit = 1). In this case, when 'rollover' is later combined with 'gen', all the high bits remain at '1'. This value is used later in 'mdc_inode_iatt_set_validate' during callback processing. The overflow condition and generation numbers from when the operation was received are recovered this way: rollover = incident_time >> 32; incident_time = (incident_time & 0xffffffff); ('incident_time' is the saved value from '__mdc_get_generation'). So here rollover will be 0xffffffff, when it's expected to be 0 or 1 only. When this is compared later with the cached overflow bit, it doesn't match, which prevents updating the cached info. This is bad in general, but it's even worse when an entry is not cached and 'rollover' is 0xffffffff the first time. When md-cache doesn't have cached data it assumes it's everything 0. This causes a mismatch, which sends an invalidation request to the kernel, but since the 'rollover' doesn't match, the cached data is not updated. So the next time the cached data is checked, it will also send an invalidation to the kernel, indefinitely. This patch fixes two things: 1. The 'generation' field is made unsigned to avoid sign extension. 2. Invalidation requests are only sent if we already had valid cached data. Otherwise it doesn't make sense to send an invalidation. Upstream patch: > Upstream-patch-link: https://github.com/gluster/glusterfs/pull/2619 > Fixes: #2617 > Change-Id: Ie40e68288cf143e1bc1a40f46da98f51bb2d6864 > Signed-off-by: Xavi Hernandez BUG: 1904137 Change-Id: Ie40e68288cf143e1bc1a40f46da98f51bb2d6864 Signed-off-by: Xavi Hernandez Reviewed-on: https://code.engineering.redhat.com/gerrit/c/rhs-glusterfs/+/279188 Tested-by: RHGS Build Bot Reviewed-by: Sunil Kumar Heggodu Gopala Acharya --- xlators/performance/md-cache/src/md-cache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xlators/performance/md-cache/src/md-cache.c b/xlators/performance/md-cache/src/md-cache.c index bbbee3b..e0256d6 100644 --- a/xlators/performance/md-cache/src/md-cache.c +++ b/xlators/performance/md-cache/src/md-cache.c @@ -79,7 +79,7 @@ struct mdc_conf { gf_boolean_t cache_statfs; struct mdc_statfs_cache statfs_cache; char *mdc_xattr_str; - gf_atomic_int32_t generation; + gf_atomic_uint32_t generation; }; struct mdc_local; @@ -537,7 +537,7 @@ mdc_inode_iatt_set_validate(xlator_t *this, inode_t *inode, struct iatt *prebuf, (iatt->ia_mtime_nsec != mdc->md_mtime_nsec) || (iatt->ia_ctime != mdc->md_ctime) || (iatt->ia_ctime_nsec != mdc->md_ctime_nsec)) { - if (conf->global_invalidation && + if (conf->global_invalidation && mdc->valid && (!prebuf || (prebuf->ia_mtime != mdc->md_mtime) || (prebuf->ia_mtime_nsec != mdc->md_mtime_nsec) || (prebuf->ia_ctime != mdc->md_ctime) || -- 1.8.3.1