Blob Blame History Raw
From 5247dc98b37bb4fea63e99a2e43372f73f6ba4b8 Mon Sep 17 00:00:00 2001
From: Alaa Hleihel <ahleihel@redhat.com>
Date: Tue, 12 May 2020 10:55:12 -0400
Subject: [PATCH 196/312] [netdrv] net/mlx5: Fix frequent ioread PCI access
 during recovery

Message-id: <20200512105530.4207-107-ahleihel@redhat.com>
Patchwork-id: 306979
Patchwork-instance: patchwork
O-Subject: [RHEL8.3 BZ 1789382 106/124] net/mlx5: Fix frequent ioread PCI access during recovery
Bugzilla: 1789382
RH-Acked-by: Tony Camuso <tcamuso@redhat.com>
RH-Acked-by: Kamal Heib <kheib@redhat.com>
RH-Acked-by: Jarod Wilson <jarod@redhat.com>

Bugzilla: http://bugzilla.redhat.com/1789382
Upstream: v5.7-rc2

commit 8c702a53bb0a79bfa203ba21ef1caba43673c5b7
Author: Moshe Shemesh <moshe@mellanox.com>
Date:   Mon Mar 30 10:21:49 2020 +0300

    net/mlx5: Fix frequent ioread PCI access during recovery

    High frequency of PCI ioread calls during recovery flow may cause the
    following trace on powerpc:

    [ 248.670288] EEH: 2100000 reads ignored for recovering device at
    location=Slot1 driver=mlx5_core pci addr=0000:01:00.1
    [ 248.670331] EEH: Might be infinite loop in mlx5_core driver
    [ 248.670361] CPU: 2 PID: 35247 Comm: kworker/u192:11 Kdump: loaded
    Tainted: G OE ------------ 4.14.0-115.14.1.el7a.ppc64le #1
    [ 248.670425] Workqueue: mlx5_health0000:01:00.1 health_recover_work
    [mlx5_core]
    [ 248.670471] Call Trace:
    [ 248.670492] [c00020391c11b960] [c000000000c217ac] dump_stack+0xb0/0xf4
    (unreliable)
    [ 248.670548] [c00020391c11b9a0] [c000000000045818]
    eeh_check_failure+0x5c8/0x630
    [ 248.670631] [c00020391c11ba50] [c00000000068fce4]
    ioread32be+0x114/0x1c0
    [ 248.670692] [c00020391c11bac0] [c00800000dd8b400]
    mlx5_error_sw_reset+0x160/0x510 [mlx5_core]
    [ 248.670752] [c00020391c11bb60] [c00800000dd75824]
    mlx5_disable_device+0x34/0x1d0 [mlx5_core]
    [ 248.670822] [c00020391c11bbe0] [c00800000dd8affc]
    health_recover_work+0x11c/0x3c0 [mlx5_core]
    [ 248.670891] [c00020391c11bc80] [c000000000164fcc]
    process_one_work+0x1bc/0x5f0
    [ 248.670955] [c00020391c11bd20] [c000000000167f8c]
    worker_thread+0xac/0x6b0
    [ 248.671015] [c00020391c11bdc0] [c000000000171618] kthread+0x168/0x1b0
    [ 248.671067] [c00020391c11be30] [c00000000000b65c]
    ret_from_kernel_thread+0x5c/0x80

    Reduce the PCI ioread frequency during recovery by using msleep()
    instead of cond_resched()

    Fixes: 3e5b72ac2f29 ("net/mlx5: Issue SW reset on FW assert")
    Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
    Reviewed-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Signed-off-by: Alaa Hleihel <ahleihel@redhat.com>
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index d6b0a4ef9daf..fd985c3b4147 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -243,7 +243,7 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
 		if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
 			break;
 
-		cond_resched();
+		msleep(20);
 	} while (!time_after(jiffies, end));
 
 	if (mlx5_get_nic_state(dev) != MLX5_NIC_IFC_DISABLED) {
-- 
2.13.6