From d899fb6f8a4b6370576e3a009e959bc98ee03c16 Mon Sep 17 00:00:00 2001
From: Ryan Sullivan
Date: Mon, 16 Oct 2023 14:08:36 -0400
Subject: [KPATCH CVE-2023-3611] kpatch fixes for CVE-2023-3611

Kernels:
3.10.0-1160.90.1.el7
3.10.0-1160.92.1.el7
3.10.0-1160.95.1.el7
3.10.0-1160.99.1.el7
3.10.0-1160.102.1.el7

Kpatch-MR: https://gitlab.com/redhat/prdsc/rhel/src/kpatch/rhel-7/-/merge_requests/60
Approved-by: Joe Lawrence (@joe.lawrence)
Approved-by: Yannick Cote (@ycote1)

Changes since last build:
arches: x86_64 ppc64le
cls_fw.o: changed function: fw_change
cls_fw.o: changed function: fw_set_parms
cls_route.o: changed function: route4_change
cls_u32.o: changed function: u32_change
sch_qfq.o: changed function: qfq_enqueue

---------------------------

Modifications: none

commit 726e9f3d88c729cdae09768c94e588deebdb9d52
Author: Marcelo Tosatti
Date:   Mon Jan 23 17:17:17 2023 -0300

    KVM: x86: rename argument to kvm_set_tsc_khz

    commit 4941b8cb3746f09bb102f7a5d64d878e96a0c6cd
    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2152838
    JIRA: https://issues.redhat.com/browse/RHELPLAN-141963
    Testing: Tested by QE

    This refers to the desired (scaled) frequency, which is called
    user_tsc_khz in the rest of the file.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Marcelo Tosatti

commit 866faa0e99083ee93d04d3c37065cf8dbfc51a34
Author: Marcelo Tosatti
Date:   Mon Jan 23 17:24:19 2023 -0300

    KVM: x86: rewrite handling of scaled TSC for kvmclock

    commit 78db6a5037965429c04d708281f35a6e5562d31b
    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2152838
    Testing: Tested by QE
    JIRA: https://issues.redhat.com/browse/RHELPLAN-141963

    This is the same as before:

        kvm_scale_tsc(tgt_tsc_khz)
            = tgt_tsc_khz * ratio
            = tgt_tsc_khz * user_tsc_khz / tsc_khz   (see set_tsc_khz)
            = user_tsc_khz                           (see kvm_guest_time_update)
            = vcpu->arch.virtual_tsc_khz             (see kvm_set_tsc_khz)

    However, computing it through kvm_scale_tsc will make it possible
    to include the NTP correction in tgt_tsc_khz.
    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Marcelo Tosatti

commit bde6eebb5708ecd38db0023e657d38058e0d962f
Author: Marcelo Tosatti
Date:   Wed Jan 25 16:07:18 2023 -0300

    KVM: x86: add bit to indicate correct tsc_shift

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2152838
    Testing: Tested by QE
    Upstream Status: RHEL7 only
    JIRA: https://issues.redhat.com/browse/RHELPLAN-141963

    This changeset is unique to RHEL-7, since it was decided it is not
    necessary upstream:

    "I don't think it's justifiable to further complicate the userspace
    API for a bug that's been fixed six years ago. I'd be very surprised
    if any combination of modern upstream {QEMU,kernel} is going to do a
    successful migration from such an old {QEMU,kernel}. RHEL/CentOS are
    able to do so because *specific pairs* have been tested, but as far
    as upstream is concerned this adds complexity that absolutely no one
    will use."

    Before commit 78db6a5037965429c04d708281f35a6e5562d31b,
    kvm_guest_time_update() would use vcpu->virtual_tsc_khz to calculate
    the tsc_shift value in the vcpu's pvclock structure written to guest
    memory.

    For those kernels, if vcpu->virtual_tsc_khz != tsc_khz (which can be
    the case when guest state is restored via migration, or if the
    tsc-khz option is passed to QEMU), and TSC scaling is not enabled
    (which happens when the difference between the frequency requested
    via KVM_SET_TSC_KHZ and the host TSC kHz is smaller than 250 ppm),
    then there can be a difference between what KVM_GET_CLOCK returns
    and what the guest reads as the kvmclock value. When
    KVM_SET_CLOCK'ing what was read with KVM_GET_CLOCK, the guest can
    observe a forward or backward time jump.

    Advertise to userspace that the current kernel contains this fix, so
    that QEMU can otherwise work around the problem by reading pvclock
    via guest memory directly.
    Signed-off-by: Marcelo Tosatti

commit 9dbd3713d82f45c9781f2dc6dd49dc3ee07ba980
Author: Davide Caratti
Date:   Tue Aug 8 12:55:43 2023 +0200

    net/sched: sch_qfq: account for stab overhead in qfq_enqueue

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2225555
    CVE: CVE-2023-3611
    Upstream Status: net.git commit 3e337087c3b5
    Conflicts:
    - we don't have QFQ_MAX_LMAX defined in rhel-7 because of missing
      upstream commit 25369891fcef ("net/sched: sch_qfq: refactor
      parsing of netlink parameters"): use its value in the test inside
      qfq_change_agg()

    commit 3e337087c3b5805fe0b8a46ba622a962880b5d64
    Author: Pedro Tammela
    Date:   Tue Jul 11 18:01:02 2023 -0300

    net/sched: sch_qfq: account for stab overhead in qfq_enqueue

    Lion says:
    -------
    In the QFQ scheduler a similar issue to CVE-2023-31436 persists.

    Consider the following code in net/sched/sch_qfq.c:

    static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                           struct sk_buff **to_free)
    {
            unsigned int len = qdisc_pkt_len(skb), gso_segs;
            // ...
            if (unlikely(cl->agg->lmax < len)) {
                    pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
                             cl->agg->lmax, len, cl->common.classid);
                    err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
                    if (err) {
                            cl->qstats.drops++;
                            return qdisc_drop(skb, sch, to_free);
                    }
            // ...
            }

    Similarly to CVE-2023-31436, "lmax" is increased according to the
    packet length "len" without any bounds checks. Usually this would
    not pose a problem, because packet sizes are naturally limited.

    This is, however, not the actual packet length but rather the
    "qdisc_pkt_len(skb)", which may apply size transformations according
    to "struct qdisc_size_table" as created by "qdisc_get_stab()" in
    net/sched/sch_api.c if the TCA_STAB option was set when modifying
    the qdisc.

    A user may choose virtually any size using such a table. As a
    result, the same issue as in CVE-2023-31436 can occur, allowing heap
    out-of-bounds reads/writes in the kmalloc-8192 cache.
    -------

    We can create the issue with the following commands:

    tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
       overhead 999999999 linklayer ethernet qfq
    tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
    tc filter add dev $DEV parent 1: matchall classid 1:1
    ping -I $DEV 1.1.1.2

    This is caused by incorrectly assuming that qdisc_pkt_len() returns
    a length within the QFQ_MIN_LMAX < len < QFQ_MAX_LMAX range.

    Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
    Reported-by: Lion
    Reviewed-by: Eric Dumazet
    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: Pedro Tammela
    Reviewed-by: Simon Horman
    Signed-off-by: Paolo Abeni
    Signed-off-by: Davide Caratti

Signed-off-by: Ryan Sullivan
---
 net/sched/sch_qfq.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index a36b3ec3271a..ca8c79456c80 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -387,8 +387,13 @@ static int qfq_change_agg(struct Qdisc *sch, struct qfq_class *cl, u32 weight,
 		       u32 lmax)
 {
 	struct qfq_sched *q = qdisc_priv(sch);
-	struct qfq_aggregate *new_agg = qfq_find_agg(q, lmax, weight);
+	struct qfq_aggregate *new_agg;
 
+	/* 'lmax' can range from [QFQ_MIN_LMAX, pktlen + stab overhead] */
+	if (lmax > (1UL << QFQ_MTU_SHIFT))
+		return -EINVAL;
+
+	new_agg = qfq_find_agg(q, lmax, weight);
 	if (new_agg == NULL) { /* create new aggregate */
 		new_agg = kzalloc(sizeof(*new_agg), GFP_ATOMIC);
 		if (new_agg == NULL)
-- 
2.41.0