cryptospore / rpms / qemu-kvm

Forked from rpms/qemu-kvm 2 years ago
Clone
Blob Blame History Raw
From 8727e4904e7a6588e39f231d837f4527f265e47e Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Tue, 5 May 2020 16:35:59 +0100
Subject: [PATCH 8/9] virtiofsd: only retain file system capabilities

RH-Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: <20200505163600.22956-7-dgilbert@redhat.com>
Patchwork-id: 96272
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH 6/7] virtiofsd: only retain file system capabilities
Bugzilla: 1817445
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Michael S. Tsirkin <mst@redhat.com>

From: Stefan Hajnoczi <stefanha@redhat.com>

virtiofsd runs as root but only needs a subset of root's Linux
capabilities(7).  As a file server its purpose is to create and access
files on behalf of a client.  It needs to be able to access files with
arbitrary uid/gid owners.  It also needs to be create device nodes.

Introduce a Linux capabilities(7) whitelist and drop all capabilities
that we don't need, making the virtiofsd process less powerful than a
regular uid root process.

  # cat /proc/PID/status
  ...
          Before           After
  CapInh: 0000000000000000 0000000000000000
  CapPrm: 0000003fffffffff 00000000880000df
  CapEff: 0000003fffffffff 00000000880000df
  CapBnd: 0000003fffffffff 0000000000000000
  CapAmb: 0000000000000000 0000000000000000

Note that file capabilities cannot be used to achieve the same effect on
the virtiofsd executable because mount is used during sandbox setup.
Therefore we drop capabilities programmatically at the right point
during startup.

This patch only affects the sandboxed child process.  The parent process
that sits in waitpid(2) still has full root capabilities and will be
addressed in the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200416164907.244868-2-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit a59feb483b8fae24d043569ccfcc97ea23d54a02)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
 tools/virtiofsd/passthrough_ll.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 614ba55..6358874 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -2723,6 +2723,43 @@ static void setup_mounts(const char *source)
 }
 
 /*
+ * Only keep whitelisted capabilities that are needed for file system operation
+ */
+static void setup_capabilities(void)
+{
+    pthread_mutex_lock(&cap.mutex);
+    capng_restore_state(&cap.saved);
+
+    /*
+     * Whitelist file system-related capabilities that are needed for a file
+     * server to act like root.  Drop everything else like networking and
+     * sysadmin capabilities.
+     *
+     * Exclusions:
+     * 1. CAP_LINUX_IMMUTABLE is not included because it's only used via ioctl
+     *    and we don't support that.
+     * 2. CAP_MAC_OVERRIDE is not included because it only seems to be
+     *    used by the Smack LSM.  Omit it until there is demand for it.
+     */
+    capng_setpid(syscall(SYS_gettid));
+    capng_clear(CAPNG_SELECT_BOTH);
+    capng_updatev(CAPNG_ADD, CAPNG_PERMITTED | CAPNG_EFFECTIVE,
+            CAP_CHOWN,
+            CAP_DAC_OVERRIDE,
+            CAP_DAC_READ_SEARCH,
+            CAP_FOWNER,
+            CAP_FSETID,
+            CAP_SETGID,
+            CAP_SETUID,
+            CAP_MKNOD,
+            CAP_SETFCAP);
+    capng_apply(CAPNG_SELECT_BOTH);
+
+    cap.saved = capng_save_state();
+    pthread_mutex_unlock(&cap.mutex);
+}
+
+/*
  * Lock down this process to prevent access to other processes or files outside
  * source directory.  This reduces the impact of arbitrary code execution bugs.
  */
@@ -2732,6 +2769,7 @@ static void setup_sandbox(struct lo_data *lo, struct fuse_session *se,
     setup_namespaces(lo, se);
     setup_mounts(lo->source);
     setup_seccomp(enable_syslog);
+    setup_capabilities();
 }
 
 /* Set the maximum number of open file descriptors */
-- 
1.8.3.1