From a8a37a6d612e710f4c63c21e83529e6c9ebd7806 Mon Sep 17 00:00:00 2001
Message-Id:
From: Michal Privoznik
Date: Fri, 11 Oct 2013 17:18:10 +0200
Subject: [PATCH] qemu_migration: Avoid crashing if domain dies too quickly

https://bugzilla.redhat.com/show_bug.cgi?id=1018267

I've noticed a SIGSEGV-ing libvirtd on the destination when the qemu
died too quickly in Prepare phase. What is happening here is:

1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling
qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor()
and qemuConnectMonitor(). So far so good. The qemuMonitorOpen()
succeeds, however switching monitor to QMP mode fails as qemu died
meanwhile. That is qemuMonitorSetCapabilities() returns -1.

2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 : mon=0x14a53da0
2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1
2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 : EVENT_POLL_UPDATE_HANDLE: watch=17 events=13
...
2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 : QEMU_MONITOR_SEND_MSG: mon=0x14a53da0 msg={"execute":"qmp_capabilities","id":"libvirt-1"}
 fd=-1
2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll got 1 event(s)

2) [Thread 3262] The event loop is trying to do the talking to monitor.
However, qemu is dead already, remember?

2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to read from monitor: Connection reset by peer
2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25
...
2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send command resulted in error internal error: early end of file from monitor: possible problem:

3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the
'endjob' label and subsequently to the 'cleanup'.
Since the domain is not persistent and ret is -1, the
qemuDomainRemoveInactive() is called. This has an (unpleasant) effect of
virObjectUnref()-ing the @vm object. Unpleasant because the event loop
which is about to trigger EOF callback still holds a pointer to the @vm
(not the reference). See the valgrind output below.

4) [Thread 3262] So the event loop starts triggering EOF:

2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering EOF callback
2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : Received EOF on 0x14549110 'migt10'

And the monitor is cleaned up. This results in calling
qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is
kept in qemuMonitor struct.

==3262== Thread 1:
==3262== Invalid read of size 4
==3262==    at 0x77ECCAA: pthread_mutex_lock (in /lib64/libpthread-2.15.so)
==3262==    by 0x52FAA06: virMutexLock (virthreadpthread.c:85)
==3262==    by 0x52E3891: virObjectLock (virobject.c:320)
==3262==    by 0x11626743: qemuProcessHandleMonitorEOF (qemu_process.c:296)
==3262==    by 0x11642593: qemuMonitorIO (qemu_monitor.c:730)
==3262==    by 0x52BD526: virEventPollDispatchHandles (vireventpoll.c:501)
==3262==    by 0x52BDD49: virEventPollRunOnce (vireventpoll.c:648)
==3262==    by 0x52BBC68: virEventRunDefaultImpl (virevent.c:274)
==3262==    by 0x542D3D9: virNetServerRun (virnetserver.c:1112)
==3262==    by 0x11F368: main (libvirtd.c:1513)
==3262==  Address 0x14549128 is 24 bytes inside a block of size 136 free'd
==3262==    at 0x4C2AF5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3262==    by 0x529B1FF: virFree (viralloc.c:580)
==3262==    by 0x52E3703: virObjectUnref (virobject.c:270)
==3262==    by 0x531557E: virDomainObjListRemove (domain_conf.c:2355)
==3262==    by 0x1160E899: qemuDomainRemoveInactive (qemu_domain.c:2061)
==3262==    by 0x1163A0C6: qemuMigrationPrepareAny (qemu_migration.c:2450)
==3262==    by 0x1163A923: qemuMigrationPrepareDirect (qemu_migration.c:2626)
==3262==    by 0x11682D71: qemuDomainMigratePrepare3Params (qemu_driver.c:10309)
==3262==    by 0x53B0976: virDomainMigratePrepare3Params (libvirt.c:7266)
==3262==    by 0x1502D3: remoteDispatchDomainMigratePrepare3Params (remote.c:4797)
==3262==    by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==3262==    by 0x54322EB: virNetServerProgramDispatchCall (virnetserverprogram.c:435)

The mon->vm is set in qemuMonitorOpenInternal() which is the correct
place to increase @vm ref counter. The correct place to decrease the ref
counter is then qemuMonitorDispose().

Signed-off-by: Michal Privoznik
(cherry picked from commit c7ac2519b7fe8b4f9c0ecc05cf7e46dea64b5e15)

Conflicts:
	src/qemu/qemu_capabilities.c: Context
	src/qemu/qemu_monitor.c: Context

Both due to 809ee6ba not being backported yet.

Signed-off-by: Jiri Denemark
---
 src/qemu/qemu_capabilities.c | 14 ++++++++++----
 src/qemu/qemu_monitor.c      |  4 +++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 9b1d9f5..b95a984 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2482,7 +2482,8 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps,
     char *monpath = NULL;
     char *pidfile = NULL;
     pid_t pid = 0;
-    virDomainObj vm;
+    virDomainObjPtr vm = NULL;
+    virDomainXMLOptionPtr xmlopt = NULL;
 
     /* the ".sock" sufix is important to avoid a possible clash with a qemu
      * domain called "capabilities"
@@ -2545,10 +2546,13 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCaps,
         goto cleanup;
     }
 
-    memset(&vm, 0, sizeof(vm));
-    vm.pid = pid;
+    if (!(xmlopt = virDomainXMLOptionNew(NULL, NULL, NULL)) ||
+        !(vm = virDomainObjNew(xmlopt)))
+        goto cleanup;
+
+    vm->pid = pid;
 
-    if (!(mon = qemuMonitorOpen(&vm, &config, true, &callbacks))) {
+    if (!(mon = qemuMonitorOpen(vm, &config, true, &callbacks))) {
         ret = 0;
         goto cleanup;
     }
@@ -2629,6 +2633,8 @@ cleanup:
     VIR_FREE(monarg);
     VIR_FREE(monpath);
     VIR_FREE(package);
+    virObjectUnref(vm);
+    virObjectUnref(xmlopt);
 
     if (pid != 0) {
         char ebuf[1024];
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 5b2fb04..e22a2b2 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -249,6 +249,8 @@ static void qemuMonitorDispose(void *obj)
     VIR_DEBUG("mon=%p", mon);
     if (mon->cb && mon->cb->destroy)
         (mon->cb->destroy)(mon, mon->vm);
+    virObjectUnref(mon->vm);
+
     virCondDestroy(&mon->notify);
     VIR_FREE(mon->buffer);
     virJSONValueFree(mon->options);
@@ -722,7 +724,7 @@ qemuMonitorOpenInternal(virDomainObjPtr vm,
     }
     mon->fd = fd;
     mon->hasSendFD = hasSendFD;
-    mon->vm = vm;
+    mon->vm = virObjectRef(vm);
     mon->json = json;
     if (json)
         mon->waitGreeting = true;
-- 
1.8.3.2