Blob Blame History Raw
From 721a55126f8143f86c73868bde460a3304c85c81 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Thu, 27 Jul 2017 12:06:55 +0200
Subject: [PATCH 07/17] migration/rdma: Fix race on source

RH-Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: <20170727120659.8640-2-dgilbert@redhat.com>
Patchwork-id: 75859
O-Subject: [Pegas-1.0 qemu-kvm PATCH 1/5] migration/rdma: Fix race on source
Bugzilla: 1475751
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Fix a race where the destination might try and send the source a
WRID_READY before the source has done a post-recv for it.

rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init that calls
qemu_alloc_qp.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044

The race can be triggered by adding a few ms wait before this
post_recv_control (which was originally due to me turning on loads of
debug).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20170717110936.23314-2-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9cf2bab2edca1e651eef49f2417f8f67bdfe49bb)
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
 migration/rdma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 674ccab..0f6669e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2363,6 +2363,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     caps_to_network(&cap);
 
+    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
+    if (ret) {
+        ERROR(errp, "posting second control recv");
+        goto err_rdma_source_connect;
+    }
+
     ret = rdma_connect(rdma->cm_id, &conn_param);
     if (ret) {
         perror("rdma_connect");
@@ -2403,12 +2409,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     rdma_ack_cm_event(cm_event);
 
-    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
-    if (ret) {
-        ERROR(errp, "posting second control recv!");
-        goto err_rdma_source_connect;
-    }
-
     rdma->control_ready_expected = 1;
     rdma->nb_sent = 0;
     return 0;
-- 
1.8.3.1