From a47d863ea4501d3d0daceacb194c9f900cefe1a7 Mon Sep 17 00:00:00 2001
From: Kotresh HR <khiremat@redhat.com>
Date: Mon, 13 Nov 2017 05:27:50 -0500
Subject: [PATCH 119/128] geo-rep: Fix data sync issue during hardlink, rename

Problem:
Data is not synced to the slave if the master sees
the following I/O sequence:

1. echo "test_data" > f1
2. ln f1 f2
3. mv f2 f3
4. unlink f1

On master, 'f3' exists with data "test_data", but on
slave, 'f3' exists only as a zero-byte file without
a backend gfid link.

Cause:
On master, since 'f2' no longer exists, the hardlink
is skipped during processing. Later, when syncing the
rename, the source ('f2') doesn't exist on slave, so
dst ('f3') is created with the same gfid. In this use
case the create succeeds, but the backend gfid is not
linked to 'f3' because 'f1' already exists with the
same gfid. Once 'f1' is unlinked, rsync fails with
ENOENT because the backend gfid is not linked with 'f3'.

Fix:
On processing rename, if src doesn't exist on slave,
don't blindly create dst with the same gfid. Check
whether the gfid already exists; if it does, create
a hardlink instead of doing mknod.

Thanks Aravinda for helping in RCA :)

Upstream Reference:
> Patch: https://review.gluster.org/18731
> BUG: 1512483

Change-Id: I5af4f99798ed1bcb297598a4bc796b701d1e0130
BUG: 1512496
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://code.engineering.redhat.com/gerrit/126728
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
---
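Note: the decision described under "Fix:" can be sketched outside of
gsyncd as below. This is only an illustration on a local filesystem,
not the code in resource.py. The helper name link_or_create, the demo
function, and the plain file standing in for the gfid path
(os.path.join(pfx, gfid) in the patch) are all assumptions made for
the example, and an empty-file create stands in for the mknod path.

```python
import os
import tempfile


def link_or_create(backend_link, dst):
    """Hypothetical helper: hardlink dst to backend_link when an entry
    with that gfid already exists, otherwise create a new empty file.

    Returns "link" or "create" so the caller can see which path ran.
    """
    if os.path.lexists(backend_link):
        # An entry with this gfid exists: add another name for it
        # instead of creating a second, unrelated file.
        os.link(backend_link, dst)
        return "link"
    # No entry with this gfid yet: stand-in for the mknod path.
    with open(dst, "x"):
        pass
    return "create"


def demo():
    """Replay the slave side of the sequence from the commit message."""
    with tempfile.TemporaryDirectory() as d:
        backend = os.path.join(d, "gfid-backend")  # stand-in for the gfid path
        f1 = os.path.join(d, "f1")
        f3 = os.path.join(d, "f3")
        with open(backend, "w") as fh:
            fh.write("test_data\n")
        os.link(backend, f1)                   # state after step 1
        action = link_or_create(backend, f3)   # rename f2 -> f3, f2 missing
        os.unlink(f1)                          # step 4
        with open(f3) as fh:
            return action, fh.read()
```

Calling demo() takes the hardlink path, so 'f3' still carries the data
after 'f1' is unlinked. With a backend path that does not exist, the
helper falls back to creating an empty file, which corresponds to the
old behaviour.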
 geo-replication/syncdaemon/resource.py | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/geo-replication/syncdaemon/resource.py b/geo-replication/syncdaemon/resource.py
index 22aaf85..5ad5b97 100644
--- a/geo-replication/syncdaemon/resource.py
+++ b/geo-replication/syncdaemon/resource.py
@@ -814,8 +814,17 @@ class Server(object):
                             elif not matching_disk_gfid(gfid, en):
                                 collect_failure(e, EEXIST, True)
                         else:
-                            (pg, bname) = entry2pb(en)
-                            blob = entry_pack_reg_stat(gfid, bname, e['stat'])
+                            slink = os.path.join(pfx, gfid)
+                            st = lstat(slink)
+                            # don't create multiple entries with same gfid
+                            if isinstance(st, int):
+                                (pg, bname) = entry2pb(en)
+                                blob = entry_pack_reg_stat(gfid, bname,
+                                                           e['stat'])
+                            else:
+                                cmd_ret = errno_wrap(os.link, [slink, en],
+                                                    [ENOENT, EEXIST], [ESTALE])
+                                collect_failure(e, cmd_ret)
                 else:
                     st1 = lstat(en)
                     if isinstance(st1, int):
-- 
1.8.3.1