From 6059352bd38a85ed6823991c598f6f02043d7210 Mon Sep 17 00:00:00 2001
From: Raghavendra G <rgowdapp@redhat.com>
Date: Tue, 27 Sep 2016 16:35:08 +0530
Subject: [PATCH 212/227] performance/write-behind: remove the request from
 liability queue in wb_fulfill_request

Before this patch, a request was removed from the liability queue only
when its ref count dropped to 0. Although wb_fulfill_request does an
unref, it need not be the last unref, and hence the request may survive
in the liability queue until the last unref. Let,

T1: the time at which wb_fulfill_request is invoked
T2: the time at which last unref is done on request

Consider the case T2 > T1. In the time window between T1 and T2, any
other request (the waiter) conflicting with a request in the liability
queue (the blocker, basically a write which has been lied, i.e. already
unwound to the caller) is blocked from winding. If T2 happens to be
when wb_do_unwinds is invoked, no further processing of the request
list happens and the waiter stays blocked forever. A hypothetical
sequence of events is given below:

1. A write request w1 is picked up for unwinding in __wb_pick_unwinds
   (but the unwind is not done _yet_ and hence a reference
   remains). However, w1 is moved to the liability queue. Let's call
   this invocation of wb_process_queue by wb_writev PQ1.

2. A flush request (f1) hits write-behind. Since the liability queue
   of the inode is not empty, f1 is not picked for unwinding. Let's
   call the invocation of wb_process_queue by wb_flush PQ2.

3. PQ2 continues, picks w1 for fulfilling and invokes
   wb_fulfill. As part of a successful wb_fulfill_cbk,
   wb_fulfill_request (w1) is invoked. But w1 is not freed (and hence
   not removed from the liability queue), as w1 is not unwound _yet_
   and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_).

4. wb_fulfill_cbk (triggered by PQ2) invokes wb_process_queue (let's
   say PQ3). f1 is not resumed in PQ3 as w1 is still in the liability
   queue. At this point, PQ2 and PQ3 are complete.

5. PQ1 continues, unwinds w1 and does the last unref on w1, so w1 is
   freed (and removed from the liability queue). Since PQ1 didn't
   invoke wb_fulfill on any other write request, no future codepath
   will invoke wb_process_queue, and f1 is stuck forever.

With this fix, w1 is removed from the liability queue in step 3 above,
and PQ3 resumes f1 in step 4 (as no request conflicting with f1 remains
in the liability queue during execution of PQ3).
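
The sequence above can be replayed with a small standalone model. This
is only a sketch and not the translator code: struct toy_req, the
toy_* helpers and the initial ref count of 2 (one ref for the pending
unwind, one for the pending fulfill) are assumptions made for
illustration, and w1 is assumed to always conflict with f1.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for wb_request_t (illustrative fields only). */
struct toy_req {
        bool lied;         /* write was unwound before being fulfilled     */
        bool fulfilled;    /* backend write has completed                  */
        bool in_liability; /* still linked on the inode's liability queue  */
        int  refcount;     /* hypothetical: unwind ref + fulfill ref       */
};

/* Drop one reference; the last unref unlinks the request from the queue. */
static void toy_unref(struct toy_req *r)
{
        assert(r->refcount > 0);
        if (--r->refcount == 0)
                r->in_liability = false;
}

/* Fulfill step: the patched variant unlinks a lied request right away. */
static void toy_fulfill(struct toy_req *r, bool patched)
{
        r->fulfilled = true;
        if (patched && r->lied)
                r->in_liability = false;
        toy_unref(r);
}

/* Conflict check: the patched variant ignores fulfilled requests. */
static bool toy_has_conflict(const struct toy_req *blocker, bool patched)
{
        if (!blocker->in_liability)
                return false;
        return patched ? !blocker->fulfilled : true;
}

/* Replays steps 1-5; returns true if f1 can be resumed by PQ3 in step 4. */
bool toy_f1_resumed(bool patched)
{
        /* Step 1: w1 is lied and sits on the liability queue. */
        struct toy_req w1 = { .lied = true, .fulfilled = false,
                              .in_liability = true, .refcount = 2 };

        toy_fulfill(&w1, patched);                      /* step 3 */
        bool resumed = !toy_has_conflict(&w1, patched); /* step 4 */
        toy_unref(&w1);                                 /* step 5 */
        return resumed;
}
```

In this model, toy_f1_resumed(false) corresponds to the behaviour
before the patch, where w1 is still queued when PQ3 runs, and
toy_f1_resumed(true) corresponds to the behaviour with the fix.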

> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> BUG: 1379655
> Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb
> Reviewed-on: http://review.gluster.org/15579
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>

Change-Id: I1e71d3b6a2dfdbace31b6ee108a4e042e699dca8
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1378131
Reviewed-on: https://code.engineering.redhat.com/gerrit/91956
---
 xlators/performance/write-behind/src/write-behind.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/xlators/performance/write-behind/src/write-behind.c b/xlators/performance/write-behind/src/write-behind.c
index 0cba578..7f5719b 100644
--- a/xlators/performance/write-behind/src/write-behind.c
+++ b/xlators/performance/write-behind/src/write-behind.c
@@ -306,7 +306,11 @@ wb_liability_has_conflict (wb_inode_t *wb_inode, wb_request_t *req)
         wb_request_t *each     = NULL;
 
         list_for_each_entry (each, &wb_inode->liability, lie) {
-		if (wb_requests_conflict (each, req))
+		if (wb_requests_conflict (each, req)
+                    && (!each->ordering.fulfilled))
+                        /* A fulfilled request shouldn't block another
+                         * request (even a dependent one) from winding.
+                         */
 			return each;
         }
 
@@ -667,7 +671,14 @@ __wb_fulfill_request (wb_request_t *req)
 	wb_inode->window_current -= req->total_size;
 	wb_inode->transit -= req->total_size;
 
-	if (!req->ordering.lied) {
+        if (req->ordering.lied) {
+                /* 1. If yes, request is in liability queue and hence can be
+                      safely removed from list.
+                   2. If no, request is in temptation queue and hence should be
+                      left in the queue so that wb_pick_unwinds picks it up
+                */
+                list_del_init (&req->lie);
+        } else {
 		/* TODO: fail the req->frame with error if
 		   necessary
 		*/
-- 
2.9.3