SOURCES/net-packet-make-tp_drops-atomic.patch

From 21d5e92b26cc9b90b522ef7dd03e5cf09167f1cc Mon Sep 17 00:00:00 2001
From: Artem Savkov <asavkov@redhat.com>
Date: Tue, 22 Sep 2020 15:48:56 +0200
Subject: [RHEL8.2 KPATCH v2] [net] packet: make tp_drops atomic

Kernels:
4.18.0-193.el8
4.18.0-193.1.2.el8_2
4.18.0-193.6.3.el8_2
4.18.0-193.13.2.el8_2
4.18.0-193.14.3.el8_2
4.18.0-193.19.1.el8_2

Changes since last build:
[x86_64]:
af_packet.o: changed function: packet_create
af_packet.o: changed function: packet_getsockopt
af_packet.o: changed function: packet_rcv
af_packet.o: changed function: packet_sock_destruct
af_packet.o: changed function: prb_retire_current_block
af_packet.o: changed function: tpacket_rcv

[ppc64le]:
af_packet.o: changed function: packet_create
af_packet.o: changed function: packet_getsockopt
af_packet.o: changed function: packet_rcv
af_packet.o: changed function: packet_sock_destruct
af_packet.o: changed function: prb_retire_current_block
af_packet.o: changed function: run_filter
af_packet.o: changed function: tpacket_rcv

---------------------------

Modifications:
 - bpf calls altered to avoid issues with jump labels
 - tp_drops as shadow variable

Testing: reproducer from bz

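The "tp_drops as shadow variable" modification relies on the livepatch shadow-variable API (`klp_shadow_get()`, `klp_shadow_alloc()`, `klp_shadow_free()`) to attach a new `atomic_t` counter to live `packet_sock` objects without changing the struct layout. A minimal userspace analogue of the get-or-alloc pattern — the helper names here are hypothetical; the real API lives in kernel/livepatch/shadow.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace sketch of the livepatch shadow-variable pattern: attach extra
 * per-object data keyed by (object pointer, id) without touching the object
 * itself.  Hypothetical names; the kernel uses klp_shadow_get()/_alloc(). */
struct shadow_var {
	const void *obj;          /* object the data is attached to */
	unsigned long id;         /* variable id, like KLP_SHADOW_TP_DROPS */
	void *data;
	struct shadow_var *next;
};

static struct shadow_var *shadow_list;

/* Look up existing shadow data; NULL if this object has none yet. */
static void *shadow_get(const void *obj, unsigned long id)
{
	for (struct shadow_var *s = shadow_list; s; s = s->next)
		if (s->obj == obj && s->id == id)
			return s->data;
	return NULL;
}

/* Return existing shadow data or allocate a zeroed block and register it. */
static void *shadow_get_or_alloc(const void *obj, unsigned long id, size_t size)
{
	void *data = shadow_get(obj, id);
	if (data)
		return data;

	struct shadow_var *s = malloc(sizeof(*s));
	if (!s)
		return NULL;
	s->data = calloc(1, size);
	if (!s->data) {
		free(s);
		return NULL;
	}
	s->obj = obj;
	s->id = id;
	s->next = shadow_list;
	shadow_list = s;
	return s->data;
}
```

In the patch below the same pattern shows up at every entry point that touches the counter: look up the shadow variable, and if this socket predates the patch, allocate it on first use and seed it from the old `po->stats.stats1.tp_drops` field.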
commit 1513be1efa2a836cb0f4309fcf1956df3faad34c
Author: Hangbin Liu <haliu@redhat.com>
Date:   Fri Sep 11 04:19:13 2020 -0400

    [net] packet: fix overflow in tpacket_rcv

    Message-id: <20200911041913.2808606-3-haliu@redhat.com>
    Patchwork-id: 326146
    Patchwork-instance: patchwork
    O-Subject: [CVE-2020-14386 RHEL8.3 net PATCH 2/2] net/packet: fix overflow in tpacket_rcv
    Bugzilla: 1876224
    Z-Bugzilla: 1876223
    CVE: CVE-2020-14386
    RH-Acked-by: Davide Caratti <dcaratti@redhat.com>
    RH-Acked-by: Marcelo Leitner <mleitner@redhat.com>
    RH-Acked-by: Jarod Wilson <jarod@redhat.com>
    RH-Acked-by: Paolo Abeni <pabeni@redhat.com>
    RH-Acked-by: Ivan Vecera <ivecera@redhat.com>

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1876224
    Brew: https://brewweb.devel.redhat.com/taskinfo?taskID=31276277
    Upstream Status: net.git commit acf69c946233
    CVE: CVE-2020-14386

    commit acf69c946233259ab4d64f8869d4037a198c7f06
    Author: Or Cohen <orcohen@paloaltonetworks.com>
    Date:   Thu Sep 3 21:05:28 2020 -0700

        net/packet: fix overflow in tpacket_rcv

        Using tp_reserve to calculate netoff can overflow as
        tp_reserve is unsigned int and netoff is unsigned short.

        This may lead to macoff receiving a smaller value than
        sizeof(struct virtio_net_hdr), and if po->has_vnet_hdr
        is set, an out-of-bounds write will occur when
        calling virtio_net_hdr_from_skb.

        The bug is fixed by converting netoff to unsigned int
        and checking if it exceeds USHRT_MAX.

        This addresses CVE-2020-14386

        Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
        Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com>
        Signed-off-by: Eric Dumazet <edumazet@google.com>
        Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Signed-off-by: Hangbin Liu <haliu@redhat.com>
    Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
    Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>

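The overflow described in the commit above is easy to reproduce in isolation: `po->tp_reserve` is a user-controlled `unsigned int` (set via `PACKET_RESERVE`), and before the fix the sum was stored into an `unsigned short` `netoff`, wrapping mod 65536. A standalone sketch of the before/after behavior — the helper names and the simplified offset arithmetic are illustrative, not the exact kernel code:

```c
#include <assert.h>
#include <limits.h>

/* Before the fix: netoff is unsigned short, so a large tp_reserve makes
 * the sum wrap mod 65536 and netoff (hence macoff) ends up tiny. */
static unsigned short netoff_buggy(unsigned int hdrlen, unsigned int tp_reserve)
{
	return (unsigned short)(hdrlen + tp_reserve);  /* silent truncation */
}

/* After the fix: compute in unsigned int and reject anything above
 * USHRT_MAX instead of truncating (the kernel drops the packet). */
static int netoff_fixed(unsigned int hdrlen, unsigned int tp_reserve,
			unsigned int *netoff)
{
	*netoff = hdrlen + tp_reserve;
	if (*netoff > USHRT_MAX)
		return -1;  /* caller takes the drop path */
	return 0;
}
```

With `tp_reserve = 65536`, the buggy version returns just `hdrlen` back — smaller than `sizeof(struct virtio_net_hdr)` — which is what made the later `virtio_net_hdr_from_skb()` write go out of bounds.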
commit 5d07c2093eec0b75b60f6087a6c1b1f79c46e20c
Author: Hangbin Liu <haliu@redhat.com>
Date:   Fri Sep 11 04:19:12 2020 -0400

    [net] packet: make tp_drops atomic

    Message-id: <20200911041913.2808606-2-haliu@redhat.com>
    Patchwork-id: 326145
    Patchwork-instance: patchwork
    O-Subject: [CVE-2020-14386 RHEL8.3 net PATCH 1/2] net/packet: make tp_drops atomic
    Bugzilla: 1876224
    Z-Bugzilla: 1876223
    CVE: CVE-2020-14386
    RH-Acked-by: Davide Caratti <dcaratti@redhat.com>
    RH-Acked-by: Marcelo Leitner <mleitner@redhat.com>
    RH-Acked-by: Jarod Wilson <jarod@redhat.com>
    RH-Acked-by: Paolo Abeni <pabeni@redhat.com>
    RH-Acked-by: Ivan Vecera <ivecera@redhat.com>

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1876224
    Brew: https://brewweb.devel.redhat.com/taskinfo?taskID=31276277
    Upstream Status: net.git commit 8e8e2951e309

    commit 8e8e2951e3095732d7e780c241f61ea130955a57
    Author: Eric Dumazet <edumazet@google.com>
    Date:   Wed Jun 12 09:52:30 2019 -0700

        net/packet: make tp_drops atomic

        Under DDOS, we want to be able to increment tp_drops without
        touching the spinlock. This will help readers to drain
        the receive queue slightly faster :/

        Signed-off-by: Eric Dumazet <edumazet@google.com>
        Signed-off-by: David S. Miller <davem@davemloft.net>

    Signed-off-by: Hangbin Liu <haliu@redhat.com>
    Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
    Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>

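Dumazet's change replaces a spinlock-protected `tp_drops++` with a lock-free atomic increment, and the `PACKET_STATISTICS` getsockopt path reads and resets the counter in a single step with `atomic_xchg()` — the same shape the diff below implements via the shadow variable. A C11 userspace analogue of the pattern (kernel `atomic_t` swapped for `stdatomic`; names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>

/* Lock-free drop accounting: any context can bump the counter without
 * taking sk_receive_queue.lock, mirroring atomic_inc(tp_drops). */
static atomic_int tp_drops;

static void record_drop(void)
{
	atomic_fetch_add_explicit(&tp_drops, 1, memory_order_relaxed);
}

/* PACKET_STATISTICS-style read: return the current count and zero it
 * atomically, mirroring drops = atomic_xchg(tp_drops, 0). */
static int read_and_reset_drops(void)
{
	return atomic_exchange(&tp_drops, 0);
}
```

The single-step exchange matters: a separate read followed by a store to zero could lose drops recorded in between, whereas `atomic_exchange` hands each drop to exactly one statistics reader.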
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Yannick Cote <ycote@redhat.com>
Signed-off-by: Artem Savkov <asavkov@redhat.com>

---
 net/packet/af_packet.c | 118 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 106 insertions(+), 12 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d69fb2077196..4c67f7156a17 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -185,6 +185,8 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
 #define BLOCK_O2PRIV(x)	((x)->offset_to_priv)
 #define BLOCK_PRIV(x)		((void *)((char *)(x) + BLOCK_O2PRIV(x)))
 
+#define KLP_SHADOW_TP_DROPS 0x2020143860000000
+
 struct packet_sock;
 static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 		       struct packet_type *pt, struct net_device *orig_dev);
@@ -747,6 +749,8 @@ static void prb_flush_block(struct tpacket_kbdq_core *pkc1,
 #endif
 }
 
+#include "kpatch-macros.h"
+
 /*
  * Side effect:
  *
@@ -765,8 +769,9 @@ static void prb_close_block(struct tpacket_kbdq_core *pkc1,
 	struct tpacket3_hdr *last_pkt;
 	struct tpacket_hdr_v1 *h1 = &pbd1->hdr.bh1;
 	struct sock *sk = &po->sk;
+	atomic_t *tp_drops = klp_shadow_get(po, KLP_SHADOW_TP_DROPS);
 
-	if (po->stats.stats3.tp_drops)
+	if (tp_drops && atomic_read(tp_drops))
 		status |= TP_STATUS_LOSING;
 
 	last_pkt = (struct tpacket3_hdr *)pkc1->prev;
@@ -1281,6 +1286,8 @@ static int packet_rcv_has_room(struct packet_sock *po, struct sk_buff *skb)
 
 static void packet_sock_destruct(struct sock *sk)
 {
+	struct packet_sock *po = pkt_sk(sk);
+
 	skb_queue_purge(&sk->sk_error_queue);
 
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
@@ -1291,6 +1298,8 @@ static void packet_sock_destruct(struct sock *sk)
 		return;
 	}
 
+	klp_shadow_free(po, KLP_SHADOW_TP_DROPS, NULL);
+
 	sk_refcnt_debug_dec(sk);
 }
 
@@ -1994,6 +2003,38 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
 	return err;
 }
 
+#define BPF_PROG_RUN_KPATCH(prog, ctx)	({				\
+	u32 ret;						\
+	cant_sleep();						\
+	if (static_key_enabled(&bpf_stats_enabled_key)) {	\
+		struct bpf_prog_stats *stats;			\
+		u64 start = sched_clock();			\
+		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\
+		stats = this_cpu_ptr(prog->aux->stats);		\
+		u64_stats_update_begin(&stats->syncp);		\
+		stats->cnt++;					\
+		stats->nsecs += sched_clock() - start;		\
+		u64_stats_update_end(&stats->syncp);		\
+	} else {						\
+		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi);	\
+	}							\
+	ret; })
+
+static inline u32 bpf_prog_run_clear_cb_kpatch(const struct bpf_prog *prog,
+					struct sk_buff *skb)
+{
+	u8 *cb_data = bpf_skb_cb(skb);
+	u32 res;
+
+	if (unlikely(prog->cb_access))
+		memset(cb_data, 0, BPF_SKB_CB_LEN);
+
+	preempt_disable();
+	res = BPF_PROG_RUN_KPATCH(prog, skb);
+	preempt_enable();
+	return res;
+}
+
 static unsigned int run_filter(struct sk_buff *skb,
 			       const struct sock *sk,
 			       unsigned int res)
@@ -2003,7 +2044,7 @@ static unsigned int run_filter(struct sk_buff *skb,
 	rcu_read_lock();
 	filter = rcu_dereference(sk->sk_filter);
 	if (filter != NULL)
-		res = bpf_prog_run_clear_cb(filter->prog, skb);
+		res = bpf_prog_run_clear_cb_kpatch(filter->prog, skb);
 	rcu_read_unlock();
 
 	return res;
@@ -2046,6 +2087,7 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 	int skb_len = skb->len;
 	unsigned int snaplen, res;
 	bool is_drop_n_account = false;
+	atomic_t *tp_drops;
 
 	if (skb->pkt_type == PACKET_LOOPBACK)
 		goto drop;
@@ -2053,6 +2095,17 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 	sk = pt->af_packet_priv;
 	po = pkt_sk(sk);
 
+	tp_drops = klp_shadow_get(po, KLP_SHADOW_TP_DROPS);
+	if (!tp_drops) {
+		tp_drops = klp_shadow_alloc(po, KLP_SHADOW_TP_DROPS,
+					    sizeof(atomic_t*), GFP_ATOMIC,
+					    NULL, NULL);
+		if (!tp_drops)
+			goto drop;
+
+		atomic_set(tp_drops, po->stats.stats1.tp_drops);
+	}
+
 	if (!net_eq(dev_net(dev), sock_net(sk)))
 		goto drop;
 
@@ -2135,10 +2188,8 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 
 drop_n_acct:
 	is_drop_n_account = true;
-	spin_lock(&sk->sk_receive_queue.lock);
-	po->stats.stats1.tp_drops++;
+	atomic_inc(tp_drops);
 	atomic_inc(&sk->sk_drops);
-	spin_unlock(&sk->sk_receive_queue.lock);
 
 drop_n_restore:
 	if (skb_head != skb->data && skb_shared(skb)) {
@@ -2164,12 +2215,14 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	int skb_len = skb->len;
 	unsigned int snaplen, res;
 	unsigned long status = TP_STATUS_USER;
-	unsigned short macoff, netoff, hdrlen;
+	unsigned short macoff, hdrlen;
+	unsigned int netoff;
 	struct sk_buff *copy_skb = NULL;
 	struct timespec ts;
 	__u32 ts_status;
 	bool is_drop_n_account = false;
 	bool do_vnet = false;
+	atomic_t *tp_drops;
 
 	/* struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT.
 	 * We may add members to them until current aligned size without forcing
@@ -2184,6 +2237,17 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	sk = pt->af_packet_priv;
 	po = pkt_sk(sk);
 
+	tp_drops = klp_shadow_get(po, KLP_SHADOW_TP_DROPS);
+	if (!tp_drops) {
+		tp_drops = klp_shadow_alloc(po, KLP_SHADOW_TP_DROPS,
+					    sizeof(atomic_t*), GFP_ATOMIC,
+					    NULL, NULL);
+		if (!tp_drops)
+			goto drop;
+
+		atomic_set(tp_drops, po->stats.stats1.tp_drops);
+	}
+
 	if (!net_eq(dev_net(dev), sock_net(sk)))
 		goto drop;
 
@@ -2226,6 +2290,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 		}
 		macoff = netoff - maclen;
 	}
+	if (netoff > USHRT_MAX) {
+		atomic_inc(tp_drops);
+		goto drop_n_restore;
+	}
 	if (po->tp_version <= TPACKET_V2) {
 		if (macoff + snaplen > po->rx_ring.frame_size) {
 			if (po->copy_thresh &&
@@ -2272,7 +2340,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	 * Anyways, moving it for V1/V2 only as V3 doesn't need this
 	 * at packet level.
 	 */
-		if (po->stats.stats1.tp_drops)
+		if (atomic_read(tp_drops))
 			status |= TP_STATUS_LOSING;
 	}
 
@@ -2388,9 +2456,9 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 
 drop_n_account:
-	is_drop_n_account = true;
-	po->stats.stats1.tp_drops++;
 	spin_unlock(&sk->sk_receive_queue.lock);
+	atomic_inc(tp_drops);
+	is_drop_n_account = true;
 
 	sk->sk_data_ready(sk);
 	kfree_skb(copy_skb);
@@ -3195,6 +3263,7 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
 	struct sock *sk;
 	struct packet_sock *po;
 	__be16 proto = (__force __be16)protocol; /* weird, but documented */
+	atomic_t *tp_drops;
 	int err;
 
 	if (!ns_capable(net->user_ns, CAP_NET_RAW))
@@ -3221,9 +3290,16 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
 	po->num = proto;
 	po->xmit = dev_queue_xmit;
 
+	tp_drops = klp_shadow_get_or_alloc(po, KLP_SHADOW_TP_DROPS,
+					   sizeof(atomic_t*), GFP_KERNEL,
+					   NULL, NULL);
+
+	if (!tp_drops)
+		goto out2;
+
 	err = packet_alloc_pending(po);
 	if (err)
-		goto out2;
+		goto out3;
 
 	packet_cached_dev_reset(po);
 
@@ -3258,6 +3334,8 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
 	preempt_enable();
 
 	return 0;
+out3:
+	klp_shadow_free(po, KLP_SHADOW_TP_DROPS, NULL);
 out2:
 	sk_free(sk);
 out:
@@ -3873,6 +3951,8 @@ static int packet_getsockopt(struct socket *sock, int level, int optname,
 	void *data = &val;
 	union tpacket_stats_u st;
 	struct tpacket_rollover_stats rstats;
+	int drops;
+	atomic_t *tp_drops;
 
 	if (level != SOL_PACKET)
 		return -ENOPROTOOPT;
@@ -3883,20 +3963,34 @@ static int packet_getsockopt(struct socket *sock, int level, int optname,
 	if (len < 0)
 		return -EINVAL;
 
+	tp_drops = klp_shadow_get(po, KLP_SHADOW_TP_DROPS);
+	if (!tp_drops) {
+		tp_drops = klp_shadow_alloc(po, KLP_SHADOW_TP_DROPS,
+					    sizeof(atomic_t*), GFP_ATOMIC,
+					    NULL, NULL);
+		if (!tp_drops)
+			return -ENOMEM;
+
+		atomic_set(tp_drops, po->stats.stats1.tp_drops);
+	}
+
 	switch (optname) {
 	case PACKET_STATISTICS:
 		spin_lock_bh(&sk->sk_receive_queue.lock);
 		memcpy(&st, &po->stats, sizeof(st));
 		memset(&po->stats, 0, sizeof(po->stats));
 		spin_unlock_bh(&sk->sk_receive_queue.lock);
+		drops = atomic_xchg(tp_drops, 0);
 
 		if (po->tp_version == TPACKET_V3) {
 			lv = sizeof(struct tpacket_stats_v3);
-			st.stats3.tp_packets += st.stats3.tp_drops;
+			st.stats3.tp_drops = drops;
+			st.stats3.tp_packets += drops;
 			data = &st.stats3;
 		} else {
 			lv = sizeof(struct tpacket_stats);
-			st.stats1.tp_packets += st.stats1.tp_drops;
+			st.stats1.tp_drops = drops;
+			st.stats1.tp_packets += drops;
 			data = &st.stats1;
 		}
 
-- 
2.26.2