From 8217d00a0a54457961e7ec7d3afb24e953923c7d Mon Sep 17 00:00:00 2001
From: Ashish Pandey
Date: Tue, 13 Mar 2018 14:03:20 +0530
Subject: [PATCH 198/201] cluster/ec: Change default read policy to gfid-hash

Problem:
Whenever we read data from a file over NFS, NFS reads more data than
requested and caches it. Based on the stat information, it decides
whether the cached/pre-read data is still valid.

Consider a 4 + 2 EC volume with all the bricks on different nodes.

In EC, with the round-robin read policy, reads are sent to different
sets of data bricks. This balances the read fops across all the bricks
and avoids heating up (overloading) the same set of bricks.

Due to small differences in clock speed, atime, mtime or ctime can
differ slightly between bricks. The stat returned to NFS can therefore
differ from one read to the next, and NFS will discard cached/pre-read
data that has not actually changed and could still be used.

Solution:
Change the default read policy for EC to gfid-hash. That forces all
reads of a file to go to the same set of bricks.
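
Illustration (not part of the change): the difference between the two
policies can be sketched as below. This is a minimal, standalone sketch
and not the ec.c implementation. The hash (FNV-1a), the rule "k
consecutive bricks starting at an index", and every name in it
(gfid_hash, pick_bricks, read_mask_round_robin, read_mask_gfid_hash)
are assumptions made for the example only.

```c
#include <stddef.h>
#include <stdint.h>

#define N_BRICKS 6 /* n = k + m, the 4 + 2 volume from the example */
#define K_DATA   4 /* k bricks are needed to serve a read */

/* Hypothetical hash over the 16-byte gfid (FNV-1a, chosen for brevity). */
static uint32_t gfid_hash(const unsigned char gfid[16])
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < 16; i++) {
        h ^= gfid[i];
        h *= 16777619u;
    }
    return h;
}

/* Bitmask of K_DATA consecutive bricks starting at 'first', wrapping. */
static unsigned pick_bricks(unsigned first)
{
    unsigned mask = 0;
    for (unsigned i = 0; i < K_DATA; i++)
        mask |= 1u << ((first + i) % N_BRICKS);
    return mask;
}

/* Round-robin: the starting brick advances on every read, so two reads
 * of the same file can be answered by different sets of bricks. */
static unsigned rr_counter;

static unsigned read_mask_round_robin(void)
{
    return pick_bricks(rr_counter++ % N_BRICKS);
}

/* gfid-hash: the starting brick depends only on the gfid, so every read
 * of a given file is answered by the same set of bricks. */
static unsigned read_mask_gfid_hash(const unsigned char gfid[16])
{
    return pick_bricks(gfid_hash(gfid) % N_BRICKS);
}
```

In this sketch, repeated calls to read_mask_gfid_hash() with the same
gfid always return the same mask, so the stat comes from the same
bricks each time, while consecutive calls to read_mask_round_robin()
return shifted masks.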

>Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
>BUG: 1554743
>Signed-off-by: Ashish Pandey

upstream patch: https://review.gluster.org/#/c/19703/

Change-Id: I43e95717980ca52c228fdcb7863c58bd4d14151c
BUG: 1559084
Signed-off-by: Ashish Pandey
Reviewed-on: https://code.engineering.redhat.com/gerrit/133746
Tested-by: RHGS Build Bot
Reviewed-by: Sunil Kumar Heggodu Gopala Acharya
---
 tests/basic/ec/ec-read-policy.t | 7 +++----
 xlators/cluster/ec/src/ec.c     | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tests/basic/ec/ec-read-policy.t b/tests/basic/ec/ec-read-policy.t
index e4390aa..fe6fe65 100644
--- a/tests/basic/ec/ec-read-policy.t
+++ b/tests/basic/ec/ec-read-policy.t
@@ -20,10 +20,9 @@ TEST $CLI volume start $V0
 TEST glusterfs --direct-io-mode=yes --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
 EXPECT_WITHIN $CHILD_UP_TIMEOUT "6" ec_child_up_count $V0 0
 #TEST volume operations work fine
-EXPECT "round-robin" mount_get_option_value $M0 $V0-disperse-0 read-policy
-TEST $CLI volume set $V0 disperse.read-policy gfid-hash
-EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "gfid-hash" mount_get_option_value $M0 $V0-disperse-0 read-policy
-TEST $CLI volume reset $V0 disperse.read-policy
+
+EXPECT "gfid-hash" mount_get_option_value $M0 $V0-disperse-0 read-policy
+TEST $CLI volume set $V0 disperse.read-policy round-robin
 EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "round-robin" mount_get_option_value $M0 $V0-disperse-0 read-policy
 
 #TEST if the option gives the intended behavior. The way we perform this test
diff --git a/xlators/cluster/ec/src/ec.c b/xlators/cluster/ec/src/ec.c
index 13ce7fb..bfdca64 100644
--- a/xlators/cluster/ec/src/ec.c
+++ b/xlators/cluster/ec/src/ec.c
@@ -1447,7 +1447,7 @@ struct volume_options options[] = {
         .key = {"read-policy" },
         .type = GF_OPTION_TYPE_STR,
         .value = {"round-robin", "gfid-hash"},
-        .default_value = "round-robin",
+        .default_value = "gfid-hash",
         .description = "inode-read fops happen only on 'k' number of bricks in"
                        " n=k+m disperse subvolume. 'round-robin' selects the read"
                        " subvolume using round-robin algo. 'gfid-hash' selects read"
-- 
1.8.3.1