From 12642839a58a1e6240b5c3376ee87f783c475f3d Mon Sep 17 00:00:00 2001
From: Honggang Li
Date: Tue, 16 Mar 2021 21:12:09 -0400
Subject: [PATCH] rdma-ndd: fix udev racy issue for system with multiple InfiniBand HCAs

[ Upstream commit 649d6b8c58fcc8afa809cf874b65b03a5607143c ]

After reading the system hostname, the function `monitor` calls
`set_rdma_node_desc` to initialize the node description for the HCAs
that the kernel has already detected. On a system with multiple
InfiniBand HCAs, only the first HCA is guaranteed to have been detected
by the kernel at this point. The systemd udev "add" events for the
remaining HCAs may be emitted before rdma-ndd starts listening for udev
events via `get_udev_fd`. Those "add" events are then never delivered to
the rdma-ndd service, because rdma-ndd is not yet ready for udev events.
As a result, the node descriptions of those HCAs may not be updated by
rdma-ndd during system boot.

With this patch, rdma-ndd initializes the node descriptions after it
starts listening to udev. InfiniBand HCAs detected after the
initialization are handled through udev events.

Reported-by: Georg Sauthoff
Tested-by: Georg Sauthoff
Signed-off-by: Honggang Li
Signed-off-by: Nicolas Morey-Chaisemartin
---
 rdma-ndd/rdma-ndd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/rdma-ndd/rdma-ndd.c b/rdma-ndd/rdma-ndd.c
index 418d1de9456b..03d0b79dd7f8 100644
--- a/rdma-ndd/rdma-ndd.c
+++ b/rdma-ndd/rdma-ndd.c
@@ -254,7 +254,6 @@ static void monitor(bool systemd)
 	}
 
 	read_hostname(hn_fd, hostname, sizeof(hostname));
-	set_rdma_node_desc((const char *)hostname, 1);
 
 	fds[0].fd = hn_fd;
 	fds[0].events = 0;
@@ -269,6 +268,8 @@ static void monitor(bool systemd)
 	if (systemd)
 		sd_notify(0, "READY=1");
 
+	set_rdma_node_desc((const char *)hostname, 1);
+
 	while (1) {
 		if (poll(fds, numfds, -1) <= 0) {
 			syslog(LOG_ERR, "Poll %s failed; exiting\n", SYS_HOSTNAME);
-- 
2.25.4