SOURCES/bz2024658-1-totemsrp-Switch-totempg-buffers-at-the-right-time.patch

From e7a82370a7b5d3ca342d5e42e25763fa2c938739 Mon Sep 17 00:00:00 2001
From: Jan Friesse <jfriesse@redhat.com>
Date: Tue, 26 Oct 2021 18:17:59 +0200
Subject: [PATCH] totemsrp: Switch totempg buffers at the right time

Commit 92e0f9c7bb9b4b6a0da8d64bdf3b2e47ae55b1cc added switching of
totempg buffers in the sync phase. But because the buffers were switched
too early, there was a problem when delivering recovered messages
(messages got corrupted and/or lost). The solution is to switch the
buffers after the recovered messages have been delivered.

I think it is worth describing the complete history with reproducers so
it doesn't get lost.

It all started with 402638929e5045ef520a7339696c687fbed0b31b (more info
about the original problem is described in
https://bugzilla.redhat.com/show_bug.cgi?id=820821). That patch solves a
problem which can be reproduced with the following reproducer:
- 2 nodes
- Both nodes running corosync and testcpg
- Pause node 1 (SIGSTOP of corosync)
- On node 1, send some messages by testcpg (it's not responding, but
  this doesn't matter; simply hitting the ENTER key a few times is
  enough)
- Wait till node 2 detects that node 1 left
- Unpause node 1 (SIGCONT of corosync)

and on node 1 the newly mcasted cpg messages got sent before the sync
barrier, so node 2 logs "Unknown node -> we will not deliver message".

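For reference, a minimal shell sketch of this reproducer (assuming
corosync and testcpg are already running on both nodes; the pidof-based
commands are just one way to deliver the signals):

```
# node 1: pause corosync (SIGSTOP) so it stops taking part in the ring
kill -STOP $(pidof corosync)

# node 1: in the running testcpg, hit ENTER a few times so some cpg
# messages get queued while corosync is paused

# node 2: wait until corosync logs that node 1 left the membership

# node 1: resume corosync (SIGCONT)
kill -CONT $(pidof corosync)

# node 2 (without this fix): the log then shows
# "Unknown node -> we will not deliver message"
```
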
The solution was to add a switch of the totemsrp new messages buffer.

This patch was not enough, so a new one
(92e0f9c7bb9b4b6a0da8d64bdf3b2e47ae55b1cc) was created. The reproducer
of the problem was similar, just cpgverify was used instead of testcpg.
Occasionally, when node 1 was unpaused, it hung in the sync phase
because there was a partial message in the totempg buffers. The new sync
message had a different fragment continuation (frag cont), so it was
thrown away and never delivered.

After many years a problem was found which is solved by this patch
(the original issue is described in
https://github.com/corosync/corosync/issues/660).
The reproducer is more complex:
- 2 nodes
- Node 1 is rate-limited (script used on the hypervisor side):
  ```
  iface=tapXXXX
  # ~0.1MB/s in bit/s
  rate=838856
  # 1mb/s
  burst=1048576
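  # shape traffic leaving $iface (egress) to the configured rate/burst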
  tc qdisc add dev $iface root handle 1: htb default 1
  tc class add dev $iface parent 1: classid 1:1 htb rate ${rate}bps \
    burst ${burst}b
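  # police traffic arriving on $iface (ingress) to the same rate/burst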
  tc qdisc add dev $iface handle ffff: ingress
  tc filter add dev $iface parent ffff: prio 50 basic police rate \
    ${rate}bps burst ${burst}b mtu 64kb "drop"
  ```
- Node 2 is running corosync and cpgverify
- Node 1 keeps restarting corosync and running cpgverify in a cycle:
  - Console 1: while true; do corosync; sleep 20; \
      kill $(pidof corosync); sleep 20; done
  - Console 2: while true; do ./cpgverify;done

And from time to time (usually reproduced in less than 5 minutes)
cpgverify reports a corrupted message.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
---
 exec/totemsrp.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index d24b11fa..fd71771b 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -1989,13 +1989,27 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance)
 		trans_memb_list_totemip, instance->my_trans_memb_entries,
 		left_list, instance->my_left_memb_entries,
 		0, 0, &instance->my_ring_id);
+	/*
+	 * Switch new totemsrp messages queue. Messages sent from now on are stored
+	 * in different queue so synchronization messages are delivered first. Totempg
+	 * buffers will be switched later.
+	 */
 	instance->waiting_trans_ack = 1;
-	instance->totemsrp_waiting_trans_ack_cb_fn (1);
 
 // TODO we need to filter to ensure we only deliver those
 // messages which are part of instance->my_deliver_memb
 	messages_deliver_to_app (instance, 1, instance->old_ring_state_high_seq_received);
 
+	/*
+	 * Switch totempg buffers. This used to be right after
+	 *   instance->waiting_trans_ack = 1;
+	 * line. This was causing problem, because there may be not yet
+	 * processed parts of messages in totempg buffers.
+	 * So when buffers were switched and recovered messages
+	 * got delivered it was not possible to assemble them.
+	 */
+	instance->totemsrp_waiting_trans_ack_cb_fn (1);
+
 	instance->my_aru = aru_save;
 
 	/*
-- 
2.27.0