TDA4VE-Q1: TDA4VE rcu_preempt self-detected stall on CPU error

Part Number: TDA4VE-Q1
Other Parts Discussed in Thread: TDA4VM, TDA4VM-Q1

Hello,

I am running power-cycling tests on the TDA4 platform and have encountered an "rcu_preempt self-detected stall" issue. It occurs after approximately 1000 power cycles: the CPU becomes unresponsive for an extended period, and the logs consistently show the "rcu_preempt self-detected stall" error.

I found that patches are available for TDA4VM for this issue, but I have been unable to locate a corresponding patch for TDA4 Echo. Could you please provide the appropriate patch for the TDA4 Echo platform?

Attachments: image.png, dmesg.26.txt

  • Hello, we have received your case and the investigation will take some time. Thank you for your patience.

  • are patches available for TDA4VM

    Can you share the links? 

  • Here is the link to the TDA4VM-Q1 patch discussion for the "rcu_preempt self-detected stall" issue: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1493726/tda4vm-q1-patch-for-rcu_preempt-self-detected-stall-on-cpu?keyMatch=rcu%20preempt%20self-detected%20stall%20on%20CPU&tisearch=universal_search

    Notably, within the same thread, there is a mention of the TDA4 Echo platform encountering the exact same RCU stall issue. Unfortunately, the discussion did not progress to a resolved solution for the TDA4 Echo variant.

    Thank you for your assistance.

    Best regards

  • Hello.

    Those are driver fixes, and they apply to TDA4VE just as they do to TDA4VM. There is nothing SoC-specific in the patches.
    The customer can use the same patches.

  • I have already applied the patch, but after a full day of power cycling tests, the same issue still occurred. Currently, I have no leads on how to proceed and require assistance to resolve this.

    [    5.216388] remoteproc remoteproc2: Booting fw image j721s2-main-r5f0_0-fw, size 822668
    [    5.241389] rproc-virtio rproc-virtio.8.auto: assigned reserved memory node vision-apps-r5f-dma-memory@a7000000
    [    5.262354] virtio_rpmsg_bus virtio2: rpmsg host is online
    [    5.274642] rproc-virtio rproc-virtio.8.auto: registered virtio2 (type 7)
    [    5.290606] remoteproc remoteproc2: remote processor 5c00000.r5f is now up
    [   24.678547] rcu: INFO: rcu_preempt self-detected stall on CPU
    [   24.684291] rcu:     0-....: (1 GPs behind) idle=0334/1/0x4000000000000000 softirq=2581/2591 fqs=2617
    [   24.693230]  (t=5252 jiffies g=-199 q=20152 ncpus=2)
    [   24.698184] CPU: 0 PID: 217 Comm: dbus-daemon Tainted: G           O       6.1.46+ #1
    [   24.705993] Hardware name: Texas Instruments J721S2 EVM (DT)
    [   24.711634] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [   24.718578] pc : _stext+0xa4/0x2a4
    [   24.721972] lr : _stext+0x6c/0x2a4
    [   24.725361] sp : ffff800008003f60
    [   24.728662] x29: ffff800008003f60 x28: ffff00000269d580 x27: 0000000000000002
    [   24.735782] x26: ffff00000014a478 x25: 00000000000000c0 x24: ffff8000090fd000
    [   24.742899] x23: 0000000040000005 x22: ffff800009103000 x21: ffff8000090fd9c8
    [   24.750017] x20: ffff8000092660c0 x19: ffff00000269d580 x18: 0000000000000000
    [   24.757135] x17: ffff800066b4a000 x16: ffff800008000000 x15: 0000000000000000
    [   24.764251] x14: 0000000000000231 x13: 0000000000000001 x12: 0000000000000000
    [   24.771368] x11: ffff00006fc4e340 x10: ffff80000927fe28 x9 : ffff800009269d38
    [   24.778485] x8 : 00000000d1cef000 x7 : 7fffffffffffffff x6 : 000000000ebcfa73
    [   24.785602] x5 : 03ffffffffffffff x4 : 0000000000000015 x3 : ffff00000269d580
    [   24.792720] x2 : ffff800066b4a000 x1 : 00000000000000e0 x0 : ffff800009103c80
    [   24.799837] Call trace:
    [   24.802271]  _stext+0xa4/0x2a4
    [   24.805313]  ____do_softirq+0x10/0x20
    [   24.808964]  call_on_irq_stack+0x24/0x4c
    [   24.812873]  do_softirq_own_stack+0x1c/0x30
    [   24.817041]  __irq_exit_rcu+0xcc/0xf4
    [   24.820691]  irq_exit_rcu+0x10/0x20
    [   24.824167]  el1_interrupt+0x38/0x70
    [   24.827732]  el1h_64_irq_handler+0x18/0x2c
    [   24.831815]  el1h_64_irq+0x64/0x68
    [   24.835204]  __get_obj_cgroup_from_memcg+0x5c/0x130
    [   24.840068]  get_obj_cgroup_from_current+0xc4/0x1dc
    [   24.844930]  kmem_cache_alloc_lru+0x70/0x510
    [   24.849186]  __d_alloc+0x34/0x204
    [   24.852489]  d_alloc_pseudo+0x10/0x30
    [   24.856138]  alloc_file_pseudo+0x64/0x120
    [   24.860135]  anon_inode_getfile+0x74/0xfc
    [   24.864133]  __arm64_sys_epoll_create1+0x80/0x110
    [   24.868824]  invoke_syscall+0x48/0x114
    [   24.872560]  el0_svc_common.constprop.0+0xd4/0xfc
    [   24.877250]  do_el0_svc+0x30/0xd0
    [   24.880554]  el0_svc+0x2c/0x84
    [   24.883597]  el0t_64_sync_handler+0xbc/0x140
    [   24.887853]  el0t_64_sync+0x18c/0x190
    [   87.710548] rcu: INFO: rcu_preempt self-detected stall on CPU
    [   87.716289] rcu:     0-....: (1 GPs behind) idle=0334/1/0x4000000000000000 softirq=2581/2591 fqs=10472
    [   87.725315]  (t=21011 jiffies g=-199 q=20232 ncpus=2)
    [   87.730354] CPU: 0 PID: 217 Comm: dbus-daemon Tainted: G           O       6.1.46+ #1
    [   87.738164] Hardware name: Texas Instruments J721S2 EVM (DT)
    [   87.743805] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [   87.750746] pc : _stext+0xa4/0x2a4
    [   87.754140] lr : _stext+0x6c/0x2a4
    [   87.757527] sp : ffff800008003f60
    [   87.760826] x29: ffff800008003f60 x28: ffff00000269d580 x27: 0000000000000002
    [   87.767944] x26: ffff00000014a478 x25: 00000000000000c0 x24: ffff8000090fd000
    [   87.775061] x23: 0000000040000005 x22: ffff800009103000 x21: ffff8000090fd9c8
    [   87.782178] x20: ffff8000092660c0 x19: ffff00000269d580 x18: 0000000000000000
    [   87.789295] x17: ffff800066b4a000 x16: ffff800008000000 x15: 0000000000000000
    [   87.796412] x14: 0000000000000231 x13: 0000000000000001 x12: 0000000000000000
    [   87.803529] x11: ffff00006fc4e340 x10: ffff80000927fe28 x9 : ffff800009269d38
    [   87.810646] x8 : 00000000d1cef000 x7 : 7fffffffffffffff x6 : 000000000ebcfa73
    [   87.817763] x5 : 03ffffffffffffff x4 : 0000000000000015 x3 : ffff00000269d580
    [   87.824880] x2 : ffff800066b4a000 x1 : 00000000000000e0 x0 : ffff800009103c80
    [   87.831997] Call trace:
    [   87.834431]  _stext+0xa4/0x2a4
    [   87.837472]  ____do_softirq+0x10/0x20
    [   87.841123]  call_on_irq_stack+0x24/0x4c
    [   87.845031]  do_softirq_own_stack+0x1c/0x30
    [   87.849200]  __irq_exit_rcu+0xcc/0xf4
    [   87.852850]  irq_exit_rcu+0x10/0x20
    [   87.856326]  el1_interrupt+0x38/0x70
    [   87.859891]  el1h_64_irq_handler+0x18/0x2c
    [   87.863974]  el1h_64_irq+0x64/0x68
    [   87.867361]  __get_obj_cgroup_from_memcg+0x5c/0x130
    [   87.872225]  get_obj_cgroup_from_current+0xc4/0x1dc
    [   87.877088]  kmem_cache_alloc_lru+0x70/0x510
    [   87.881343]  __d_alloc+0x34/0x204
    [   87.884647]  d_alloc_pseudo+0x10/0x30
    [   87.888295]  alloc_file_pseudo+0x64/0x120
    [   87.892291]  anon_inode_getfile+0x74/0xfc
    [   87.896290]  __arm64_sys_epoll_create1+0x80/0x110
    [   87.900981]  invoke_syscall+0x48/0x114
    [   87.904718]  el0_svc_common.constprop.0+0xd4/0xfc
    [   87.909408]  do_el0_svc+0x30/0xd0
    [   87.912709]  el0_svc+0x2c/0x84
    [   87.915752]  el0t_64_sync_handler+0xbc/0x140
    [   87.920008]  el0t_64_sync+0x18c/0x190
    [   87.923770] virtio_rpmsg_bus virtio0: creating channel rpmsg_chrdev addr 0xd
    [   87.931118] virtio_rpmsg_bus virtio1: creating channel rpmsg_chrdev addr 0xd
    [   87.938413] virtio_rpmsg_bus virtio0: creating channel rpmsg_chrdev addr 0x15
    [   87.946020] virtio_rpmsg_bus virtio2: creating channel rpmsg_chrdev addr 0xd
    [   87.953286] virtio_rpmsg_bus virtio0: creating channel rpmsg_chrdev addr 0x54
    [   87.960748] virtio_rpmsg_bus virtio1: creating channel rpmsg_chrdev addr 0x15
    [   87.968272] virtio_rpmsg_bus virtio0: creating channel ti.ipc4.ping-pong addr 0xe
    [   87.975993] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   87.982385] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   87.988754] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   87.995174] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.001541] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.007893] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.014253] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.020635] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.027002] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.033364] virtio_rpmsg_bus virtio0: msg received with no recipient
    [   88.040925] virtio_rpmsg_bus virtio2: creating channel rpmsg_chrdev addr 0x15
    [   88.048231] virtio_rpmsg_bus virtio1: creating channel rpmsg_chrdev addr 0x55
    [   88.055927] virtio_rpmsg_bus virtio2: creating channel rpmsg_chrdev addr 0x4e
    [   88.063224] virtio_rpmsg_bus virtio1: creating channel ti.ipc4.ping-pong addr 0xe
    [   88.071960] 8021q: 802.1Q VLAN Support v1.8
    [   88.076397] virtio_rpmsg_bus virtio2: creating channel ti.ipc4.ping-pong addr 0xe
    start /usr/bin/start-dra ...
    [   88.198044] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Quota mode: none.
    [   88.211909] am65-cpsw-nuss 46000000.ethernet eth0: PHY [46000f00.mdio:04] driver [JLSemi JL31xx T1 PHY] (irq=POLL)
    [   88.225592] am65-cpsw-nuss 46000000.ethernet: Adding vlan 4 to vlan filter
    [   88.233376] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 2
    [   88.240631] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 1
    [   88.240746] omap_rng 4e10000.rng: Random Number Generator ver. 241b34c
    [   88.248431] am65-cpsw-nuss 46000000.ethernet: Adding vlan 6 to vlan filter
    [   88.260820] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 2
    [   88.267280] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 1
    [   88.273883] am65-cpsw-nuss 46000000.ethernet: Adding vlan 8 to vlan filter
    [   88.280862] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 2
    [   88.287329] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 1
    [   88.293877] am65-cpsw-nuss 46000000.ethernet: Adding vlan 9 to vlan filter
    [   88.300857] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 2
    [   88.307310] am65-cpsw-nuss 46000000.ethernet: get: wrong ale fld id 1
    [   88.313856] am65-cpsw-nuss 46000000.ethernet eth0: configuring for phy/rgmii-rxid link mode
    [   88.323232] 8021q: adding VLAN 0 to HW filter on device eth0
    [   89.008473] EXT4-fs (mmcblk0p8): recovery complete
    [   89.013327] EXT4-fs (mmcblk0p8): mounted filesystem with ordered data mode. Quota mode: none.
    [   89.052138] EXT4-fs (mmcblk0p9): mounted filesystem with ordered data mode. Quota mode: none.
    [   89.103317] EXT4-fs (mmcblk0p10): recovery complete
    [   89.108764] EXT4-fs (mmcblk0p10): mounted filesystem with ordered data mode. Quota mode: none.
    [   89.136339] emmc region is locked
    net.core.wmem_max = 4194304
    net.core.wmem_default = 1048576
    start_idrive ...
    start_idrive finish !!!
    [   89.415630] am65-cpsw-nuss 46000000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
    [   89.424719] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
    [   90.582195] EXT4-fs (mmcblk0p9): re-mounted. Quota mode: none.
    [   90.700582] EXT4-fs (mmcblk0p9): re-mounted. Quota mode: none.
    [   90.717377] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
    [   90.720551] EXT4-fs (mmcblk0p9): re-mounted. Quota mode: none.
    [   90.762864] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
    [   90.881493] EXT4-fs (mmcblk0p9): re-mounted. Quota mode: none.
    [   90.897253] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
    [   90.952163] EXT4-fs (mmcblk0p2): re-mounted. Quota mode: none.
    [   91.010383] PVR_K:  678: RGX Firmware image 'rgx.fw.36.53.104.796' loaded
    [   91.031694] PVR_K:  678: Shader binary image 'rgx.sh.36.53.104.796' loaded
    [   91.371907] systemd-journald[155]: Oldest entry in /run/log/journal/fedeecf1c73a40ddb1e2a625c487722d/system.journal is older than the configured file retention duration (1month), suggesting rotation.
    [   91.394755] systemd-journald[155]: /run/log/journal/fedeecf1c73a40ddb1e2a625c487722d/system.journal: Journal header limits reached or header out-of-date, rotating.
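
    When a failure shows up only once in roughly a thousand power cycles, it can help to scan every captured boot log automatically rather than reading each by hand. The following is a minimal sketch, assuming each boot's console output has been saved as text. The function name `find_rcu_stalls` and the regular expressions are illustrative only and are derived solely from the log format shown above; other kernel versions may print these lines differently.

```python
import re

# Header line printed when a CPU detects its own RCU stall.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu: INFO: (?P<flavor>\w+) self-detected stall on CPU"
)
# Summary line, e.g. "(t=5252 jiffies g=-199 q=20152 ncpus=2)".
DETAIL_RE = re.compile(r"\(t=(?P<jiffies>\d+) jiffies g=(?P<gp>-?\d+) q=(?P<q>\d+)")
# Task line, e.g. "CPU: 0 PID: 217 Comm: dbus-daemon Tainted: ...".
TASK_RE = re.compile(r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def find_rcu_stalls(log_text):
    """Return one dict per stall report found in a captured console log."""
    stalls = []
    current = None
    for line in log_text.splitlines():
        header = STALL_RE.search(line)
        if header:
            # A new stall report starts here; later lines fill in details.
            current = {
                "timestamp": float(header.group("ts")),
                "flavor": header.group("flavor"),
            }
            stalls.append(current)
            continue
        if current is None:
            continue
        detail = DETAIL_RE.search(line)
        if detail and "jiffies" not in current:
            current["jiffies"] = int(detail.group("jiffies"))
            continue
        task = TASK_RE.search(line)
        if task and "pid" not in current:
            current["cpu"] = int(task.group("cpu"))
            current["pid"] = int(task.group("pid"))
            current["comm"] = task.group("comm")
    return stalls
```

    Running this over each saved log and recording the timestamp, stalled task, and stall duration in jiffies for every hit gives a quick way to check whether the stall always appears at the same point in boot, as it does in the two reports above.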

  • Hello,

    Is this reproducible on the TI EVM? If so, which SDK version are you using?