From ca320beac25f82c0c555799e647a47975a333c28 Mon Sep 17 00:00:00 2001 From: Jan Friesse Date: Tue, 10 Mar 2020 17:49:27 +0100 Subject: [PATCH] votequorum: set wfa status only on startup Previously reload of configuration with enabled wait_for_all result in set of wait_for_all_status which set cluster_is_quorate to 0 but didn't inform the quorum service so votequorum and quorum information may get out of sync. Example is 1 node cluster, which is extended to 3 nodes. Quorum service reports cluster as a quorate (incorrect) and votequorum as not-quorate (correct). Similar behavior happens when extending cluster in general, but some configurations are less incorrect (3->4). Discussed solution was to inform quorum service but that would mean every reload would cause loss of quorum until all nodes would be seen again. Such behaviour is consistent but seems to be a bit too strict. Proposed solution sets wait_for_all_status only on startup and doesn't touch it during reload. This solution fulfills requirement of "cluster will be quorate for the first time only after all nodes have been visible at least once at the same time." because node clears wait_for_all_status only after it sees all other nodes or joins cluster which is quorate. It also solves problem with extending cluster, because when cluster becomes unquorate (1->3) wait_for_all_status is set. Added assert is only for ensure that I haven't missed any case when quorate cluster may become unquorate. Signed-off-by: Jan Friesse Reviewed-by: Christine Caulfield --- exec/votequorum.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/exec/votequorum.c b/exec/votequorum.c index b152425..fb9f1cd 100644 --- a/exec/votequorum.c +++ b/exec/votequorum.c @@ -1009,7 +1009,7 @@ static void are_we_quorate(unsigned int total_votes) "Waiting for all cluster members. " "Current votes: %d expected_votes: %d", total_votes, us->expected_votes); - cluster_is_quorate = 0; + assert(!cluster_is_quorate); return; } update_wait_for_all_status(0); @@ -1547,7 +1547,9 @@ static char *votequorum_readconfig(int runtime) update_ev_barrier(us->expected_votes); update_two_node(); if (wait_for_all) { - update_wait_for_all_status(1); + if (!runtime) { + update_wait_for_all_status(1); + } } else if (wait_for_all_autoset && wait_for_all_status) { /* * Reset wait for all status for consistency when wfa is auto-unset by 2node. -- 1.8.3.1