From 295fd9d5fa4087fc20b2f3edc343f5fd1da261e4 Mon Sep 17 00:00:00 2001
From: Ryan Ding <ryan.ding@open-fs.com>
Date: Thu, 1 Sep 2016 15:40:35 +0800
Subject: [PATCH 168/206] performance/write-behind: fix flush stuck by former
 failed writes

The issue happens in this case:
assume a file is opened with fd1 and fd2.
1. Some WRITE ops to fd1 got errors; they were added back to the 'todo'
   queue because of those errors.
2. fd2 is closed, and a FLUSH op is sent to write-behind.
3. The FLUSH cannot be unwound because it is not a legal waiter for
   those failed writes (as the function __wb_request_waiting_on() says),
   and those failed WRITEs cannot be ended while fd1 is not closed. fd2
   is stuck in the close syscall.

To resolve this issue, we can change the way we determine whether two
requests 'conflict': a flush/fsync does not conflict with writes that do
not belong to its fd, so __wb_pick_winds() can wind the FLUSH op.
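
As a rough illustration of the rule above, here is a simplified,
self-contained model of the conflict check. The names toy_request,
toy_requests_overlap() and toy_requests_conflict() are hypothetical
stand-ins, not the real write-behind structures; an integer stands in
for the fd pointer, the overlap rule is an assumption of this sketch,
and other checks in the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for wb_request_t. */
struct toy_request {
        int      fd;     /* stand-in for the fd pointer; 0 means "no fd" */
        uint64_t gen;    /* generation number: lower means queued earlier */
        uint64_t off;    /* start offset of the request */
        uint64_t size;   /* assumption: 0 means "covers the whole file" */
        bool     append; /* request is an append */
};

/* Assumed overlap rule for this sketch: byte ranges intersect, and a
   request with size 0 (e.g. a FLUSH) extends to the end of the file. */
static bool
toy_requests_overlap (const struct toy_request *a,
                      const struct toy_request *b)
{
        uint64_t a_end = a->size ? a->off + a->size - 1 : UINT64_MAX;
        uint64_t b_end = b->size ? b->off + b->size - 1 : UINT64_MAX;

        return (a_end >= b->off) && (b_end >= a->off);
}

/* Does 'req' have to wait for the earlier, still pending 'lie'? */
static bool
toy_requests_conflict (const struct toy_request *lie,
                       const struct toy_request *req)
{
        if (lie == req)
                /* a request cannot conflict with itself */
                return false;

        if (lie->gen >= req->gen)
                /* 'lie' was queued behind 'req' */
                return false;

        /* the rule introduced here: requests from different fds do
           not conflict with each other */
        if (req->fd && (req->fd != lie->fd))
                return false;

        if (lie->append)
                /* everything waits for an outstanding append */
                return true;

        return toy_requests_overlap (lie, req);
}
```

In this model, with a failed WRITE on fd1 at generation 4, a FLUSH on
fd2 at generation 6 is reported as non-conflicting and can be wound,
while a later FLUSH on fd1 still conflicts with that WRITE.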

Below is some information captured when the hang happened:
glusterdump logs:
[xlator.performance.write-behind.wb_inode]
path=/ltp-F9eG0ZSOME/rw-buffered-16436
inode=0x7fdbe8039b9c
window_conf=1048576
window_current=249856
transit-size=0
dontsync=0

[.WRITE]
request-ptr=0x7fdbe8020200
refcount=1
wound=no
generation-number=4
req->op_ret=-1
req->op_errno=116
sync-attempts=3
sync-in-progress=no
size=131072
offset=1220608
lied=-1
append=0
fulfilled=0
go=0

[.WRITE]
request-ptr=0x7fdbe8068c30
refcount=1
wound=no
generation-number=5
req->op_ret=-1
req->op_errno=116
sync-attempts=2
sync-in-progress=no
size=118784
offset=1351680
lied=-1
append=0
fulfilled=0
go=0

[.FLUSH]
request-ptr=0x7fdbe8021cd0
refcount=1
wound=no
generation-number=6
req->op_ret=0
req->op_errno=0
sync-attempts=0

gdb details for the three requests above:
(gdb) print *((wb_request_t *)0x7fdbe8021cd0)
$2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next
= 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0,
    prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev =
0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev =
0x7fdbe8021d10}, wip = {
    next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub =
0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret =
0, op_errno = 0,
  refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner
= {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>},
  iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering =
{size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0,
    go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8020200)
$3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next
= 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50,
    prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev =
0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev =
0x7fdbe8020240}, wip = {
    next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub =
0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0,
op_ret = -1,
  op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
  iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3,
ordering = {size = 131072, off = 1220608, append = 0, tempted = -1,
    lied = -1, fulfilled = 0, go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8068c30)
$4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next
= 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628,
    prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev =
0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev =
0x7fdbe8068c70}, wip = {
    next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub =
0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0,
op_ret = -1,
  op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
  iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2,
ordering = {size = 118784, off = 1351680, append = 0, tempted = -1,
    lied = -1, fulfilled = 0, go = 0}}

You can see that they are all on the 'todo' queue, and that the FLUSH
op's fd is not the same as the WRITE ops' fd.

> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab
> BUG: 1372211
> Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>

>Reviewed-on: http://review.gluster.org/15761
>Tested-by: Raghavendra G <rgowdapp@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab
BUG: 1390843
Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/90346
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Atin Mukherjee <amukherj@redhat.com>
---
 xlators/performance/write-behind/src/write-behind.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xlators/performance/write-behind/src/write-behind.c b/xlators/performance/write-behind/src/write-behind.c
index c47b537..0cba578 100644
--- a/xlators/performance/write-behind/src/write-behind.c
+++ b/xlators/performance/write-behind/src/write-behind.c
@@ -280,6 +280,10 @@ wb_requests_conflict (wb_request_t *lie, wb_request_t *req)
 		   us in the todo list */
 		return _gf_false;
 
+        /* requests from different fd do not conflict with each other. */
+        if (req->fd && (req->fd != lie->fd))
+                return _gf_false;
+
 	if (lie->ordering.append)
 		/* all modifications wait for the completion
 		   of outstanding append */
-- 
2.9.3