From 9f78a1dc55745042ecaed019ad0fc58d9e80d85e Mon Sep 17 00:00:00 2001
From: Ravishankar N <ravishankar@redhat.com>
Date: Fri, 14 Oct 2016 16:09:08 +0530
Subject: [PATCH 102/141] afr: Take full locks in arbiter only for data transactions

Patch in master: http://review.gluster.org/#/c/15641/
Patch in release-3.8: http://review.gluster.org/#/c/15647/
Patch in release-3.9: http://review.gluster.org/#/c/15648/

Problem:
Sharding exposed a bug in arbiter configurations where `dd` throughput
was extremely slow. The shard xlator sends an fxattrop to update the
file size immediately after a writev. The arbiter code path was
incorrectly overriding the LLONG_MAX-1 start offset (used for metadata
domain locks) for this fxattrop, causing the inodelk to be taken on the
data domain. Since the preceding writev had not yet released its lock
(afr does a 'lazy' unlock if the write succeeds on all bricks), the
request degraded to a blocking lock, causing extra lock/unlock calls
and delays.

Fix:
In arbiter configurations, set flock.l_len and flock.l_start to 0 (a
full-file lock) only for data transactions.

Change-Id: If9ca04b7a27e42ddedf185c4c36689ab53b39d54
BUG: 1380276
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/87216
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
---
 xlators/cluster/afr/src/afr-transaction.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)
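
Note: the effect of the changed condition can be sketched in isolation.
The standalone C below is only an illustration, not the afr code. The
names sketch_txn_type_t, sketch_range_t, sketch_range_before and
sketch_range_after are hypothetical, and it assumes that when the
full-lock branch is not taken the transaction's own start and length
are used, which this hunk does not show.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for illustration; not the GlusterFS definitions. */
typedef enum {
    SKETCH_DATA_TRANSACTION,
    SKETCH_METADATA_TRANSACTION
} sketch_txn_type_t;

typedef struct {
    long long l_start; /* first byte of the locked range */
    long long l_len;   /* 0 means "through end of file", as in struct flock */
} sketch_range_t;

/* Old condition: any transaction on an arbiter volume gets a full lock. */
static sketch_range_t
sketch_range_before(int arbiter_count, sketch_txn_type_t type,
                    long long start, long long len)
{
    sketch_range_t r = { start, len };

    assert(start >= 0 && len >= 0);
    (void)type; /* the old check ignored the transaction type */
    if (arbiter_count) {
        r.l_start = 0;
        r.l_len = 0;
    }
    return r;
}

/* New condition: only data transactions get the full lock. */
static sketch_range_t
sketch_range_after(int arbiter_count, sketch_txn_type_t type,
                   long long start, long long len)
{
    sketch_range_t r = { start, len };

    assert(start >= 0 && len >= 0);
    if (arbiter_count && type == SKETCH_DATA_TRANSACTION) {
        r.l_start = 0;
        r.l_len = 0;
    }
    return r;
}
```

Under these assumptions, a metadata request starting at LLONG_MAX - 1
is rewritten to start at 0 by sketch_range_before but keeps its offset
with sketch_range_after, while a data write on an arbiter volume is
widened to the whole file in both cases.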

diff --git a/xlators/cluster/afr/src/afr-transaction.c b/xlators/cluster/afr/src/afr-transaction.c
index 27be045..856ea6e 100644
--- a/xlators/cluster/afr/src/afr-transaction.c
+++ b/xlators/cluster/afr/src/afr-transaction.c
@@ -1900,7 +1900,8 @@ afr_set_transaction_flock (xlator_t *this, afr_local_t *local)
         inodelk = afr_get_inodelk (int_lock, int_lock->domain);
         priv = this->private;
 
-        if (priv->arbiter_count) {
+        if (priv->arbiter_count &&
+            local->transaction.type == AFR_DATA_TRANSACTION) {
                 /*Lock entire file to avoid network split brains.*/
                 inodelk->flock.l_len   = 0;
                 inodelk->flock.l_start = 0;
-- 
1.7.1