Blob Blame History Raw
commit 164c03d33bd9d58844921888560baf3f156a1f05
Author: Christophe Fergeau <cfergeau@redhat.com>
Date:   Mon Sep 10 12:17:07 2012 +0200

    Fix unwanted closing of libvirt client connection
    
    e5a1bee07 introduced a regression in Boxes: when Boxes is left idle
    (it's still doing some libvirt calls in the background), the
    libvirt connection gets closed after a few minutes. What happens is
    that this code in virNetClientIOHandleOutput gets triggered:
    
    if (!thecall)
        return -1; /* Shouldn't happen, but you never know... */
    
    and after the changes in e5a1bee07, this causes the libvirt connection
    to be closed.
    
    Upon further investigation, what happens is that
    virNetClientIOHandleOutput is called from gvir_event_handle_dispatch
    in libvirt-glib, which is triggered because the client fd became
    writable. However, between the times gvir_event_handle_dispatch
    is called, and the time the client lock is grabbed and
    virNetClientIOHandleOutput is called, another thread runs and
    completes the current call. 'thecall' is then NULL when the first
    thread gets to run virNetClientIOHandleOutput.
    
    After describing this situation on IRC, danpb suggested this:
    
    11:37 < danpb> In that case I think the correct thing would be to change
                   'return -1' above to 'return 0' since that's not actually an
                   error - its a rare, but expected event
    
    which is what this patch is doing. I've tested it against master
    libvirt, and I didn't get disconnected in ~10 minutes while this
    happens in less than 5 minutes without this patch.

diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
index 43a9814..727ed67 100644
--- a/src/rpc/virnetclient.c
+++ b/src/rpc/virnetclient.c
@@ -1205,7 +1205,10 @@ virNetClientIOHandleOutput(virNetClientPtr client)
         thecall = thecall->next;
 
     if (!thecall)
-        return -1; /* Shouldn't happen, but you never know... */
+        return 0; /* This can happen if another thread raced with us and
+                   * completed the call between the time this thread woke
+                   * up from poll()ing and the time we locked the client
+                   */
 
     while (thecall) {
         ssize_t ret = virNetClientIOWriteMessage(client, thecall);