From 48136328150bc587178091b5766bda382158cb6c Mon Sep 17 00:00:00 2001
From: Nir Soffer <nsoffer@redhat.com>
Date: Sat, 23 Oct 2021 00:08:31 +0300
Subject: [PATCH] lib/poll.c: Retry poll after EINTR
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

I see a rare random failure when calling BlockStatus via the Go binding:

    block_status: nbd_block_status: poll: Interrupted system call

I could not reproduce this with "nbdinfo --map", even after modifying it
to call nbd_block_status() for every 128 MiB.

Fixing this in nbd_unlocked_poll() avoids this issue in the entire
library, whenever we wait for command completion. This seems more useful
than fixing it in all libnbd clients.

Tested using a Go client listing all extents in an image, calling
BlockStatus for every 128 MiB with a Fedora 34 qcow2 image. Without this
fix, this test was always failing.

    $ hyperfine -r1000 --show-output "./client nbd+unix://?socket=/tmp/nbd.sock > /dev/null"
    Benchmark 1: ./client nbd+unix://?socket=/tmp/nbd.sock > /dev/null
      Time (mean ± σ):      31.6 ms ±   3.1 ms    [User: 8.8 ms, System: 7.2 ms]
      Range (min … max):    26.1 ms …  52.3 ms    1000 runs
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
(cherry picked from commit b3440853cdeca0e44ad9c526e71faaa6cf344bfc)
---
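The retry pattern used by this patch can be sketched outside libnbd as a
small standalone helper. This is only an illustration under assumptions:
poll_retry_eintr() is a hypothetical name, not a libnbd function, and
unlike nbd_unlocked_poll() it takes no handle and does no debug logging
or set_error() call.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper (not part of libnbd): call poll(2) again
 * whenever it fails only because a signal interrupted it.
 * Returns what poll returns: the number of ready descriptors,
 * 0 on timeout, or -1 with errno set to an error other than EINTR.
 */
static int
poll_retry_eintr (struct pollfd *fds, nfds_t nfds, int timeout)
{
  int r;

  do {
    r = poll (fds, nfds, timeout);
  } while (r == -1 && errno == EINTR);

  return r;
}
```

Each retry passes the original timeout again, so a call that keeps being
interrupted may wait longer than the requested timeout in total. With
timeout == -1 (wait forever) this makes no difference.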
lib/poll.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/lib/poll.c b/lib/poll.c
index edfcc59..df01d94 100644
--- a/lib/poll.c
+++ b/lib/poll.c
@@ -57,8 +57,11 @@ nbd_unlocked_poll (struct nbd_handle *h, int timeout)
    * would allow other threads to close file descriptors which we have
    * passed to poll.
    */
-  r = poll (fds, 1, timeout);
-  debug (h, "poll end: r=%d revents=%x", r, fds[0].revents);
+  do {
+    r = poll (fds, 1, timeout);
+    debug (h, "poll end: r=%d revents=%x", r, fds[0].revents);
+  } while (r == -1 && errno == EINTR);
+
   if (r == -1) {
     set_error (errno, "poll");
     return -1;
--
2.31.1