From 517d5c245c9805b56f73c7fa0e23e8853fe22da6 Mon Sep 17 00:00:00 2001
From: Artem Savkov <asavkov@redhat.com>
Date: Fri, 21 May 2021 14:20:32 +0200
Subject: [RHEL7.9 KPATCH] CVE-2021-3347 Use after free via PI futex state
Kernels:
3.10.0-1160.el7
3.10.0-1160.2.1.el7
3.10.0-1160.2.2.el7
3.10.0-1160.6.1.el7
3.10.0-1160.11.1.el7
3.10.0-1160.15.2.el7
3.10.0-1160.21.1.el7
3.10.0-1160.24.1.el7
3.10.0-1160.25.1.el7
Changes since last build:
[x86_64]:
futex.o: changed function: do_futex
futex.o: changed function: fixup_owner
futex.o: changed function: fixup_pi_state_owner.isra.16
futex.o: changed function: free_pi_state
futex.o: changed function: futex_lock_pi.isra.20
futex.o: changed function: futex_wait_requeue_pi.constprop.22
futex.o: new function: pi_state_update_owner
[ppc64le]:
futex.o: changed function: do_futex
futex.o: changed function: fixup_owner
futex.o: changed function: fixup_pi_state_owner.isra.9
futex.o: changed function: free_pi_state
futex.o: changed function: futex_lock_pi.isra.16
futex.o: changed function: futex_wait_requeue_pi.constprop.17
futex.o: changed function: unqueue_me_pi
futex.o: new function: pi_state_update_owner
---------------------------
Modifications: added -fno-optimize-sibling-calls to fixup_owner()
commit d2fb2a9cf682bdba4b66103fb079c13a04039430
Author: Donghai Qiao <dqiao@redhat.com>
Date: Thu May 20 16:35:49 2021 -0400
futex: Handle faults correctly for PI futexes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1935108
Upstream status: 34b1a1ce1458f50ef27c54e28eb9b1947012907a
CVE: CVE-2021-3347
Conflicts:
The original patch is intent to make the state of rtmutex and pi_state consistent
if the kernel is unable to update the user space futex word, rather than unlocking
the rtmutex and leaving pi_state out of synched. As a result, this original fix
removed part of the code which was introduced by 16ffa12d7 ("futex: Pull
rt_mutex_futex_unlock() out from under hb->lock") to the functions futex_lock_pi()
and futex_wait_requeue_pi() to avoid the inconsistency. So the conflicts are related
to the following two commits, though git blame displayed a much longer list which
shows the chain of dependency in the history.
16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb->lock")
c236c8e95a3d ("futex: Fix potential use-after-free in FUTEX_REQUEUE_PI")
commit 34b1a1ce1458f50ef27c54e28eb9b1947012907a
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Jan 2021 19:01:21 +0100
futex: Handle faults correctly for PI futexes
fixup_pi_state_owner() tries to ensure that the state of the rtmutex,
pi_state and the user space value related to the PI futex are consistent
before returning to user space. In case that the user space value update
faults and the fault cannot be resolved by faulting the page in via
fault_in_user_writeable() the function returns with -EFAULT and leaves
the rtmutex and pi_state owner state inconsistent.
A subsequent futex_unlock_pi() operates on the inconsistent pi_state and
releases the rtmutex despite not owning it which can corrupt the RB tree of
the rtmutex and cause a subsequent kernel stack use after free.
It was suggested to loop forever in fixup_pi_state_owner() if the fault
cannot be resolved, but that results in runaway tasks which is especially
undesired when the problem happens due to a programming error and not due
to malice.
As the user space value cannot be fixed up, the proper solution is to make
the rtmutex and the pi_state consistent so both have the same owner. This
leaves the user space value out of sync. Any subsequent operation on the
futex will fail because the 10th rule of PI futexes (pi_state owner and
user space value are consistent) has been violated.
As a consequence this removes the inept attempts of 'fixing' the situation
in case that the current task owns the rtmutex when returning with an
unresolvable fault by unlocking the rtmutex which left pi_state::owner and
rtmutex::owner out of sync in a different and only slightly less dangerous
way.
Fixes: 1b7558e457ed ("futexes: fix fault handling in futex_lock_pi")
Reported-by: gzobqq@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Donghai Qiao <dqiao@redhat.com>
commit 25077b49b47c1cdf224b54c837172ff820e8be88
Author: Donghai Qiao <dqiao@redhat.com>
Date: Thu May 20 16:30:16 2021 -0400
futex: Provide and use pi_state_update_owner()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1935108
Upstream status: c5cade200ab9a2a3be9e7f32a752c8d86b502ec7
CVE: CVE-2021-3347
Conflicts:
Updating the owner of pi_state requires that we remove the pi_state structure from
the old owner's pi_state_list then add it to the new owner's pi_state_list. Because
this action takes place in multiple occassions in the current upstream futex.c, so
the similar code is duplicated in all these places. The purpose of this patch is to
eliminate these code duplications with a new routine pi_state_update_owner().
The conflicts in 7.9.z are caused by the differences in places where updating owner
takes place. After sorting out the details, the relevant commit IDs as below :
734009e96d19 ("futex: Change locking rules")
b4abf91047cf ("rtmutex: Make wait_lock irq safe")
commit c5cade200ab9a2a3be9e7f32a752c8d86b502ec7
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 19 Jan 2021 15:21:35 +0100
futex: Provide and use pi_state_update_owner()
Updating pi_state::owner is done at several places with the same
code. Provide a function for it and use that at the obvious places.
This is also a preparation for a bug fix to avoid yet another copy of the
same code or alternatively introducing a completely unpenetratable mess of
gotos.
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Donghai Qiao <dqiao@redhat.com>
commit 69414a50f8bad2063b89981110fb374733209d9d
Author: Donghai Qiao <dqiao@redhat.com>
Date: Wed May 19 14:24:04 2021 -0400
futex: Replace pointless printk in fixup_owner()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1935108
Upstream status: 04b79c55201f02ffd675e1231d731365e335c307
CVE: CVE-2021-3347
commit 04b79c55201f02ffd675e1231d731365e335c307
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 19 Jan 2021 16:06:10 +0100
futex: Replace pointless printk in fixup_owner()
If that unexpected case of inconsistent arguments ever happens then the
futex state is left completely inconsistent and the printk is not really
helpful. Replace it with a warning and make the state consistent.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Donghai Qiao <dqiao@redhat.com>
commit 7e96fb06469c95628039ead2591f82e88af5da10
Author: Donghai Qiao <dqiao@redhat.com>
Date: Wed May 19 14:19:05 2021 -0400
futex: Ensure the correct return value from futex_lock_pi()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1935108
Upstream status: 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9
CVE: CVE-2021-3347
Conflicts:
This original upstream patch relies heavily on c1e2f0eaf015 ("futex: Avoid
violating the 10th rule of futex") which is one of the upstream commits listed
below. But the backport for c1e2f0eaf015 requires we resolve very complex chain
of dependencies across multiple critical kernel source files therefore the risk
is considered too high for 7.9.z.
Instead of pulling together tons of the relevant commits in to 7.9.z, we just
want to take a light risk approach by digesting the fix 12bb3f7f1b03 ("futex:
Ensure the correct return value from futex_lock_pi()") for 7.9.z. All we need
to do is to make the changed functions fixup_owner() and fixup_pi_state_owner()
of 7.9.z return the required values as this upstream fix suggests in every
circumstance. This way, we can cleanly cut this CVE patch set with merely four
patches, without having to backport tons of patches in the chain of dependency.
Besides, an extra change made to fixup_owner() (see HUNK -2063,13 +2062,11 in
this backport patch) is to eliminate a mistake made by upstream, where the
specification of a local variable "ret" was removed from that function, but
there was still a dereference to "ret" as shown by that HUNK.
16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb->lock")
c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
734009e96d19 ("futex: Change locking rules")
d7c5ed73b19c ("futex: Remove needless goto's")
6b4f4bc9cb22 ("locking/futex: Allow low-level atomic operations to return -EAGAIN")
commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 20 Jan 2021 16:00:24 +0100
futex: Ensure the correct return value from futex_lock_pi()
In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.
Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Donghai Qiao <dqiao@redhat.com>
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Yannick Cote <ycote@redhat.com>
---
kernel/futex.c | 123 +++++++++++++++++++++++++------------------------
1 file changed, 63 insertions(+), 60 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index 877831775d7aa..8ec57c357ca58 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -640,6 +640,29 @@ static struct futex_pi_state * alloc_pi_state(void)
return pi_state;
}
+static void pi_state_update_owner(struct futex_pi_state *pi_state,
+ struct task_struct *new_owner)
+{
+ struct task_struct *old_owner = pi_state->owner;
+
+ lockdep_assert_held(&pi_state->pi_mutex.wait_lock);
+
+ if (old_owner) {
+ raw_spin_lock_irq(&old_owner->pi_lock);
+ WARN_ON(list_empty(&pi_state->list));
+ list_del_init(&pi_state->list);
+ raw_spin_unlock_irq(&old_owner->pi_lock);
+ }
+
+ if (new_owner) {
+ raw_spin_lock_irq(&new_owner->pi_lock);
+ WARN_ON(!list_empty(&pi_state->list));
+ list_add(&pi_state->list, &new_owner->pi_state_list);
+ pi_state->owner = new_owner;
+ raw_spin_unlock_irq(&new_owner->pi_lock);
+ }
+}
+
static void free_pi_state(struct futex_pi_state *pi_state)
{
if (!atomic_dec_and_test(&pi_state->refcount))
@@ -650,10 +673,7 @@ static void free_pi_state(struct futex_pi_state *pi_state)
* and has cleaned up the pi_state already
*/
if (pi_state->owner) {
- raw_spin_lock_irq(&pi_state->owner->pi_lock);
- list_del_init(&pi_state->list);
- raw_spin_unlock_irq(&pi_state->owner->pi_lock);
-
+ pi_state_update_owner(pi_state, NULL);
rt_mutex_proxy_unlock(&pi_state->pi_mutex, pi_state->owner);
}
@@ -791,7 +811,8 @@ void exit_pi_state_list(struct task_struct *curr)
* FUTEX_OWNER_DIED bit. See [4]
*
* [10] There is no transient state which leaves owner and user space
- * TID out of sync.
+ * TID out of sync. Except one error case where the kernel is denied
+ * write access to the user address, see fixup_pi_state_owner().
*/
static int
lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
@@ -1168,16 +1189,7 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this)
return ret;
}
- raw_spin_lock_irq(&pi_state->owner->pi_lock);
- WARN_ON(list_empty(&pi_state->list));
- list_del_init(&pi_state->list);
- raw_spin_unlock_irq(&pi_state->owner->pi_lock);
-
- raw_spin_lock_irq(&new_owner->pi_lock);
- WARN_ON(!list_empty(&pi_state->list));
- list_add(&pi_state->list, &new_owner->pi_state_list);
- pi_state->owner = new_owner;
- raw_spin_unlock_irq(&new_owner->pi_lock);
+ pi_state_update_owner(pi_state, new_owner);
raw_spin_unlock(&pi_state->pi_mutex.wait_lock);
rt_mutex_unlock(&pi_state->pi_mutex);
@@ -1953,20 +1965,9 @@ retry:
* We fixed up user space. Now we need to fix the pi_state
* itself.
*/
- if (pi_state->owner != NULL) {
- raw_spin_lock_irq(&pi_state->owner->pi_lock);
- WARN_ON(list_empty(&pi_state->list));
- list_del_init(&pi_state->list);
- raw_spin_unlock_irq(&pi_state->owner->pi_lock);
- }
+ pi_state_update_owner(pi_state, newowner);
- pi_state->owner = newowner;
-
- raw_spin_lock_irq(&newowner->pi_lock);
- WARN_ON(!list_empty(&pi_state->list));
- list_add(&pi_state->list, &newowner->pi_state_list);
- raw_spin_unlock_irq(&newowner->pi_lock);
- return 0;
+ return newowner == current;
/*
* To handle the page fault we need to drop the hash bucket
@@ -1989,10 +1990,26 @@ handle_fault:
* Check if someone else fixed it for us:
*/
if (pi_state->owner != oldowner)
- return 0;
+ return newowner == current;
+
+ if (ret) {
+ /*
+ * fault_in_user_writeable() failed so user state is immutable. At
+ * best we can make the kernel state consistent but user state will
+ * be most likely hosed and any subsequent unlock operation will be
+ * rejected due to PI futex rule [10].
+ *
+ * Ensure that the rtmutex owner is also the pi_state owner despite
+ * the user space value claiming something different. There is no
+ * point in unlocking the rtmutex if current is the owner as it
+ * would need to wait until the next waiter has taken the rtmutex
+ * to guarantee consistent state. Keep it simple. Userspace asked
+ * for this wreckaged state.
+ */
+ pi_state_update_owner(pi_state, rt_mutex_owner(&pi_state->pi_mutex));
- if (ret)
return ret;
+ }
goto retry;
}
@@ -2014,10 +2031,10 @@ static long futex_wait_restart(struct restart_block *restart);
* 0 - success, lock not taken;
* <0 - on error (-EFAULT)
*/
+__attribute__((optimize("-fno-optimize-sibling-calls")))
static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
{
struct task_struct *owner;
- int ret = 0;
if (locked) {
/*
@@ -2025,8 +2042,8 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
* did a lock-steal - fix up the PI-state in that case:
*/
if (q->pi_state->owner != current)
- ret = fixup_pi_state_owner(uaddr, q, current);
- goto out;
+ return fixup_pi_state_owner(uaddr, q, current);
+ return 1;
}
/*
@@ -2040,8 +2057,7 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
* rt_mutex waiters list.
*/
if (rt_mutex_trylock(&q->pi_state->pi_mutex)) {
- locked = 1;
- goto out;
+ return 1;
}
/*
@@ -2054,22 +2070,18 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
if (!owner)
owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
raw_spin_unlock(&q->pi_state->pi_mutex.wait_lock);
- ret = fixup_pi_state_owner(uaddr, q, owner);
- goto out;
+
+ return fixup_pi_state_owner(uaddr, q, owner);
}
/*
* Paranoia check. If we did not take the lock, then we should not be
- * the owner of the rt_mutex.
+ * the owner of the rt_mutex. Warn and establish consistent state.
*/
- if (rt_mutex_owner(&q->pi_state->pi_mutex) == current)
- printk(KERN_ERR "fixup_owner: ret = %d pi-mutex: %p "
- "pi-state %p\n", ret,
- q->pi_state->pi_mutex.owner,
- q->pi_state->owner);
+ if (WARN_ON_ONCE(rt_mutex_owner(&q->pi_state->pi_mutex) == current))
+ return fixup_pi_state_owner(uaddr, q, current);
-out:
- return ret ? ret : locked;
+ return 0;
}
/**
@@ -2363,13 +2375,6 @@ retry_private:
if (res)
ret = (res < 0) ? res : 0;
- /*
- * If fixup_owner() faulted and was unable to handle the fault, unlock
- * it and return the fault to userspace.
- */
- if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current))
- rt_mutex_unlock(&q.pi_state->pi_mutex);
-
/* Unqueue and drop the lock */
unqueue_me_pi(&q);
@@ -2666,6 +2671,11 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
spin_unlock(q.lock_ptr);
+ /*
+ * Adjust the return value. It's either -EFAULT or
+ * success (1) but the caller expects 0 for success.
+ */
+ ret = ret < 0 ? ret : 0;
}
} else {
/*
@@ -2695,14 +2705,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
unqueue_me_pi(&q);
}
- /*
- * If fixup_pi_state_owner() faulted and was unable to handle the
- * fault, unlock the rt_mutex and return the fault to userspace.
- */
- if (ret == -EFAULT) {
- if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
- rt_mutex_unlock(pi_mutex);
- } else if (ret == -EINTR) {
+ if (ret == -EINTR) {
/*
* We've already been requeued, but cannot restart by calling
* futex_lock_pi() directly. We could restart this syscall, but
--
2.26.3