870ec2 ppc64: tackle SRCU hang issue

Authored and Committed by Pingfan Liu 2 years ago
    ppc64: tackle SRCU hang issue
    
    Resolves: bz2158296
    Upstream: RHEL-only
    
    On the PowerPC platform, the following hang is observed:
    
    Welcome to Red Hat Enterprise Linux 9.2 Beta (Plow) dracut-057-13.git20220816.el9 (Initramfs)!
    
    [    1.631210] systemd[1]: Hostname set to <ibm-p9z-18-lp11.virt.pnr.lab.eng.rdu2.redhat.com>.
    [-- MARK -- Mon Sep 26 01:45:00 2022]
    [  243.681283] INFO: task systemd:1 blocked for more than 122 seconds.
    [  243.681303]       Not tainted 5.14.0-167.el9.ppc64le #1
    [  243.681315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  243.681329] task:systemd         state:D stack:    0 pid:    1 ppid:     0 flags:0x00042000
    [  243.681349] Call Trace:
    [  243.681356] [c00000001a603640] [c00000004f990100] 0xc00000004f990100 (unreliable)
    [  243.681378] [c00000001a603830] [c00000001001e9cc] __switch_to+0x12c/0x220
    [  243.681400] [c00000001a603890] [c000000010ec5b40] __schedule+0x230/0x720
    [  243.681418] [c00000001a603950] [c000000010ec6090] schedule+0x60/0x110
    [  243.681435] [c00000001a603980] [c000000010ecd948] schedule_timeout+0x168/0x1c0
    [  243.681454] [c00000001a603a60] [c000000010ec7214] __wait_for_common+0x134/0x360
    [  243.681473] [c00000001a603b00] [c00000001017c98c] __flush_work.isra.0+0x1dc/0x3d0
    [  243.681493] [c00000001a603ba0] [c0000000105cbd88] fsnotify_wait_marks_destroyed+0x28/0x40
    [  243.681512] [c00000001a603bc0] [c0000000105cb800] fsnotify_destroy_group+0x60/0x150
    [  243.681531] [c00000001a603c30] [c0000000105cf640] inotify_release+0x30/0xa0
    [  243.681548] [c00000001a603ca0] [c00000001054fad8] __fput+0xc8/0x350
    [  243.681565] [c00000001a603cf0] [c000000010183174] task_work_run+0xe4/0x160
    [  243.681583] [c00000001a603d40] [c000000010021874] do_notify_resume+0x134/0x140
    [  243.681602] [c00000001a603d70] [c000000010030168] interrupt_exit_user_prepare_main+0x198/0x270
    [  243.681622] [c00000001a603de0] [c0000000100305ac] syscall_exit_prepare+0x6c/0x180
    [  243.681641] [c00000001a603e10] [c00000001000bff4] system_call_vectored_common+0xf4/0x278
    [  243.681661] --- interrupt: 3000 at 0x7fffb3015ba4
    [  243.681673] NIP:  00007fffb3015ba4 LR: 0000000000000000 CTR: 0000000000000000
    [  243.681687] REGS: c00000001a603e80 TRAP: 3000   Not tainted  (5.14.0-167.el9.ppc64le)
    [  243.681703] MSR:  800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE>  CR: 42044440  XER: 00000000
    [  243.681737] IRQMASK: 0
    [  243.681737] GPR00: 0000000000000006 00007fffd24a31a0 00007fffb3127200 0000000000000000
    [  243.681737] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000000
    [  243.681737] GPR08: 0000010009ea2d40 0000000000000000 0000000000000000 0000000000000000
    [  243.681737] GPR12: 0000000000000000 00007fffb3834bc0 0000000000000000 0000000000000000
    [  243.681737] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    [  243.681737] GPR20: 000000012c74ddf0 000000000000000e 000000000017cd3f 0000000000000000
    [  243.681737] GPR24: 00007fffd24a3570 0000000000000005 0000010009eb5490 0000010009ea24e0
    [  243.681737] GPR28: 0000010009ea2900 0000010009eb4850 0000010009ea2d70 00007fffb382dd98
    [  243.681896] NIP [00007fffb3015ba4] 0x7fffb3015ba4
    [  243.681907] LR [0000000000000000] 0x0
    [  243.681917] --- interrupt: 3000
    [  243.681928] INFO: task kworker/u16:1:34 blocked for more than 122 seconds.
    [  243.681941]       Not tainted 5.14.0-167.el9.ppc64le #1
    [  243.681951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  243.681964] task:kworker/u16:1   state:D stack:    0 pid:   34 ppid:     2 flags:0x00000800
    [  243.681982] Workqueue: events_unbound fsnotify_mark_destroy_workfn
    [  243.681998] Call Trace:
    [  243.682005] [c00000001a9336d0] [c00000004f990100] 0xc00000004f990100 (unreliable)
    [  243.682023] [c00000001a9338c0] [c00000001001e9cc] __switch_to+0x12c/0x220
    [  243.682042] [c00000001a933920] [c000000010ec5b40] __schedule+0x230/0x720
    [  243.682059] [c00000001a9339e0] [c000000010ec6090] schedule+0x60/0x110
    [  243.682075] [c00000001a933a10] [c000000010ecd948] schedule_timeout+0x168/0x1c0
    [  243.682094] [c00000001a933af0] [c000000010ec7214] __wait_for_common+0x134/0x360
    [  243.682113] [c00000001a933b90] [c000000010213370] __synchronize_srcu.part.0+0xa0/0xe0
    [  243.682132] [c00000001a933c00] [c0000000105cc154] fsnotify_mark_destroy_workfn+0xc4/0x1a0
    [  243.682151] [c00000001a933c70] [c00000001017acb8] process_one_work+0x298/0x580
    [  243.682169] [c00000001a933d10] [c00000001017b048] worker_thread+0xa8/0x630
    [  243.682185] [c00000001a933da0] [c000000010188348] kthread+0x1b8/0x1c0
    [  243.682203] [c00000001a933e10] [c00000001000cd64] ret_from_kernel_thread+0x5c/0x64
    [  366.561279] INFO: task systemd:1 blocked for more than 245 seconds.
    
    The proper fix belongs in the kernel, but the SRCU patch [1] will not be
    merged into mainline in the near future, so a userspace workaround is
    needed to get past this test blocker.
    
    The workaround is to pass the kernel parameter "srcutree.big_cpu_lim=0",
    so that SRCU always uses the srcu_node array.
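
    As an illustration only, the following sketch shows one way a tool could
    add the parameter to a kernel command line string without duplicating it.
    The helper name ensure_kernel_param is hypothetical and is not taken from
    this commit; it assumes a whitespace-separated command line with no quoted
    values containing spaces.

```python
def ensure_kernel_param(cmdline: str, param: str) -> str:
    """Return cmdline with `param` present exactly once.

    Hypothetical helper: any existing token with the same key
    (the text before '=') is dropped, then `param` is appended.
    Assumes no quoted values containing whitespace.
    """
    key = param.split("=", 1)[0]
    # Keep every token whose key differs from the one being set.
    kept = [tok for tok in cmdline.split() if tok.split("=", 1)[0] != key]
    kept.append(param)
    return " ".join(kept)


if __name__ == "__main__":
    # Example command line; the existing tokens are illustrative.
    print(ensure_kernel_param("irqpoll nr_cpus=1", "srcutree.big_cpu_lim=0"))
```

    Replacing any earlier value for the same key keeps the result
    deterministic if the parameter was already set to something else.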
    
    [1]: https://lore.kernel.org/rcu/20221026032716.78674-1-kernelfans@gmail.com/T/#m6534975507c2abca497a94d81c7abbfea1d0978d
    
    Signed-off-by: Pingfan Liu <piliu@redhat.com>
    
        
file modified
+1 -1
file modified
+1 -1