From 6d6ec6231b2064653725516ee3afa12a0a28f035 Mon Sep 17 00:00:00 2001
From: Pranith Kumar K <pkarampu@redhat.com>
Date: Wed, 19 Oct 2016 15:50:50 +0530
Subject: [PATCH 142/157] rpc: Fix the race between notification and reconnection

Problem:
A hang occurred because an unlock on an entry failed with
ENOTCONN.
The client thinks the connection is down, whereas the server
thinks the connection is up.

This is the race we are seeing:
1) Connection from client to the brick disconnects.
2) Saved frames unwind is called, which unwinds all
   frames that were wound before the disconnect.
3) The client reconnects to the brick and setvolume
   happens.
4) The disconnect notification for the connection in 1)
   arrives now and calls client_rpc_notify(), which
   marks the connection offline even though the
   connection is up.

This happens because I/O can retrigger the connection
before the disconnect notification is sent to the higher
layers in rpc.

Fix:
Notify the higher layers that a disconnect happened first, and
only then go ahead with the reconnect logic.
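
The effect of the ordering can be illustrated with a small
single-threaded model. This is only a sketch: every sim_* name is
hypothetical and is not part of rpc-clnt.c, and the reconnect that
I/O triggers concurrently in the real code is modelled here as a
plain function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rpc layer; not taken from rpc-clnt.c. */
enum sim_event { SIM_CONNECT, SIM_DISCONNECT };

struct sim_client {
        bool transport_up;   /* what the transport (and the server) sees */
        bool upper_online;   /* what the higher layer believes */
};

/* Higher-layer callback, playing the role of client_rpc_notify(). */
void
sim_notify (struct sim_client *c, enum sim_event ev)
{
        assert (c != NULL);
        c->upper_online = (ev == SIM_CONNECT);
}

/* Models I/O retriggering the connection, followed by setvolume. */
void
sim_reconnect (struct sim_client *c)
{
        c->transport_up = true;
        sim_notify (c, SIM_CONNECT);
}

/* Handle one transport disconnect; notify_first selects the ordering.
 * Returns true when the higher layer's view matches the transport. */
bool
sim_handle_disconnect (bool notify_first)
{
        struct sim_client c = { .transport_up = false, .upper_online = true };

        if (notify_first) {
                /* Ordering after this patch: notify, then reconnect. */
                sim_notify (&c, SIM_DISCONNECT);
                sim_reconnect (&c);
        } else {
                /* Ordering before this patch: the stale disconnect
                 * notification lands after the new connection is up. */
                sim_reconnect (&c);
                sim_notify (&c, SIM_DISCONNECT);
        }
        return c.upper_online == c.transport_up;
}
```

In this model, sim_handle_disconnect (false) leaves the higher layer
offline while the transport is up, which corresponds to step 4 above;
sim_handle_disconnect (true) leaves both views in agreement.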

For the logs that support the analysis above, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1

Thanks to Raghavendra G for suggesting the correct fix.

 >BUG: 1386626
 >Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9
 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
 >Reviewed-on: http://review.gluster.org/15681
 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
 >Reviewed-by: Niels de Vos <ndevos@redhat.com>
 >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
 >Smoke: Gluster Build System <jenkins@build.gluster.org>
 >Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
 >(cherry picked from commit a6b63e11b7758cf1bfcb67985e25ec02845f0995)

BUG: 1385605
Change-Id: Ifa721193c26b70e26b47b7698c077da0ad5f2e1d
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/88109
---
 rpc/rpc-lib/src/rpc-clnt.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/rpc/rpc-lib/src/rpc-clnt.c b/rpc/rpc-lib/src/rpc-clnt.c
index a9e43eb..d946cfc 100644
--- a/rpc/rpc-lib/src/rpc-clnt.c
+++ b/rpc/rpc-lib/src/rpc-clnt.c
@@ -899,6 +899,10 @@ rpc_clnt_notify (rpc_transport_t *trans, void *mydata,
         switch (event) {
         case RPC_TRANSPORT_DISCONNECT:
         {
+                if (clnt->notifyfn)
+                        ret = clnt->notifyfn (clnt, clnt->mydata,
+                                              RPC_CLNT_DISCONNECT, NULL);
+
                 rpc_clnt_connection_cleanup (conn);
 
                 pthread_mutex_lock (&conn->lock);
@@ -922,9 +926,6 @@ rpc_clnt_notify (rpc_transport_t *trans, void *mydata,
                 }
                 pthread_mutex_unlock (&conn->lock);
 
-                if (clnt->notifyfn)
-                        ret = clnt->notifyfn (clnt, clnt->mydata,
-                                              RPC_CLNT_DISCONNECT, NULL);
                 if (unref_clnt)
                         rpc_clnt_ref (clnt);
 
-- 
1.7.1