[Reference translation] AM625: Issue about tidss rcu_preempt self-detected stall on CPU

Other Parts Discussed in Thread: AM625, TFP410
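Please note: this content is a machine translation and may contain grammatical or other translation errors; it is provided for reference only. For the accurate content, see the English original at the link below or translate it yourself.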

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1394222/am625-issue-about-tidss-rcu_preempt-self-detected-stall-on-cpu

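Part Number: AM625
Other Parts Discussed in Thread: TFP410

Tool/software:

Hello TI experts,


We are using an AM625 custom board connected to an LCD monitor through VGA.

We get an error log containing "rcu_preempt self-detected stall on CPU", and the system hangs.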

 

[   34.281580] rcu: INFO: rcu_preempt self-detected stall on CPU
[   34.281611] rcu: 	0-....: (2 GPs behind) idle=9104/1/0x4000000000000000 softirq=0/0 fqs=4819 rcuc=21107 jiffies(starved)
[   34.281624] 	(t=21000 jiffies g=6881 q=1364 ncpus=4)
[   34.281637] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[   34.281643] Hardware name: Texas Instruments AM625 SK (DT)
[   34.281648] pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   34.281654] pc : _raw_spin_unlock_irq+0x18/0x60
[   34.281669] lr : irq_finalize_oneshot.part.0+0x64/0x100
[   34.281687] sp : ffff000000eb3d90
[   34.281689] x29: ffff000000eb3d90 x28: ffff800008089000 x27: ffff000001ccfb10
[   34.281700] x26: ffff000001ccfadc x25: ffff800008089ee0 x24: ffff000001daee00
[   34.281708] x23: ffff000001ccfa00 x22: ffff000001ccfa60 x21: ffff000001ccfadc
[   34.281715] x20: ffff000001daee00 x19: ffff000001ccfa00 x18: ffff8000091ee000
[   34.281723] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000003c
[   34.281730] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000000
[   34.281737] x11: ffff000001ccf680 x10: ffff8000091ee000 x9 : 0000000000000000
[   34.281744] x8 : ffff800008b594e8 x7 : 000000000000002b x6 : ffffffffffffffff
[   34.281751] x5 : ffff000001ccfa60 x4 : ffff000001ccfa60 x3 : 0000000000100000
[   34.281760] x2 : ffff800009220000 x1 : ffff0000015d6c00 x0 : 0000000100000001
[   34.281769] Call trace:
[   34.281772]  _raw_spin_unlock_irq+0x18/0x60
[   34.281777]  irq_forced_thread_fn+0x84/0xb0
[   34.281782]  irq_thread+0x12c/0x1d0
[   34.281787]  kthread+0x120/0x12c
[   34.281795]  ret_from_fork+0x10/0x20
[   97.284578] rcu: INFO: rcu_preempt self-detected stall on CPU
[   97.284599] rcu: 	0-....: (2 GPs behind) idle=9104/1/0x4000000000000000 softirq=0/0 fqs=19031 rcuc=84110 jiffies(starved)
[   97.284610] 	(t=84003 jiffies g=6881 q=1434 ncpus=4)
[   97.284618] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[   97.284625] Hardware name: Texas Instruments AM625 SK (DT)
[   97.284634] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   97.284639] pc : dispc_read_and_clear_irqstatus+0x58/0x1f0
[   97.284656] lr : tidss_irq_handler+0x1c/0x110
[   97.284662] sp : ffff000000eb3d50
[   97.284664] x29: ffff000000eb3d50 x28: ffff800008089000 x27: ffff000001ccfb10
[   97.284674] x26: ffff000001ccfadc x25: ffff000000070000 x24: ffff000001daee00
[   97.284681] x23: ffff000001ccfa00 x22: ffff0000015d6c00 x21: 0000000000000001
[   97.284688] x20: ffff000001ccfa00 x19: 0000000000000000 x18: ffff8000091ee000
[   97.284696] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000003c
[   97.284706] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000000
[   97.284713] x11: ffff000001ccf680 x10: ffff8000091ee000 x9 : ffff8000091eead8
[   97.284720] x8 : 0000000000000000 x7 : ffff000001ccf680 x6 : ffffffffffffffff
[   97.284727] x5 : ffff00007f668808 x4 : 0000000000000000 x3 : ffff80007674e000
[   97.284734] x2 : ffff80000857fec0 x1 : ffff800009413000 x0 : 0000000000000000
[   97.284743] Call trace:
[   97.284745]  dispc_read_and_clear_irqstatus+0x58/0x1f0
[   97.284751]  tidss_irq_handler+0x1c/0x110
[   97.284756]  irq_forced_thread_fn+0x38/0xb0
[   97.284763]  irq_thread+0x12c/0x1d0
[   97.284767]  kthread+0x120/0x12c
[   97.284776]  ret_from_fork+0x10/0x20
[  160.287579] rcu: INFO: rcu_preempt self-detected stall on CPU
[  160.287609] rcu: 	0-....: (2 GPs behind) idle=9104/1/0x4000000000000000 softirq=0/0 fqs=32851 rcuc=147113 jiffies(starved)
[  160.287620] 	(t=147006 jiffies g=6881 q=1554 ncpus=4)
[  160.287632] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[  160.287639] Hardware name: Texas Instruments AM625 SK (DT)
[  160.287643] pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  160.287648] pc : _raw_spin_unlock_irq+0x10/0x60
[  160.287668] lr : irq_finalize_oneshot.part.0+0x64/0x100
[  160.287676] sp : ffff000000eb3d90
[  160.287678] x29: ffff000000eb3d90 x28: ffff800008089000 x27: ffff000001ccfb10
[  160.287688] x26: ffff000001ccfadc x25: ffff800008089ee0 x24: ffff000001daee00
[  160.287695] x23: ffff000001ccfa00 x22: ffff000001ccfa60 x21: ffff000001ccfadc
[  160.287703] x20: ffff000001daee00 x19: ffff000001ccfa00 x18: ffff8000091ee000
[  160.287710] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000003c
[  160.287717] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000000
[  160.287724] x11: ffff000001ccf680 x10: ffff8000091ee000 x9 : 0000000000000000
[  160.287731] x8 : ffff800008b594e8 x7 : 000000000000002b x6 : ffffffffffffffff
[  160.287742] x5 : ffff000001ccfa60 x4 : ffff000001ccfa60 x3 : 0000000000100000
[  160.287749] x2 : ffff800009220000 x1 : 0000000000000000 x0 : 00000000000000e0
[  160.287758] Call trace:
[  160.287761]  _raw_spin_unlock_irq+0x10/0x60
[  160.287766]  irq_forced_thread_fn+0x84/0xb0
[  160.287771]  irq_thread+0x12c/0x1d0
[  160.287776]  kthread+0x120/0x12c
[  160.287785]  ret_from_fork+0x10/0x20
[  223.290579] rcu: INFO: rcu_preempt self-detected stall on CPU
[  223.290602] rcu: 	0-....: (2 GPs behind) idle=9104/1/0x4000000000000000 softirq=0/0 fqs=46701 rcuc=210116 jiffies(starved)
[  223.290612] 	(t=210009 jiffies g=6881 q=1580 ncpus=4)
[  223.290622] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[  223.290628] Hardware name: Texas Instruments AM625 SK (DT)
[  223.290632] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  223.290639] pc : dispc_read_and_clear_irqstatus+0x58/0x1f0
[  223.290654] lr : tidss_irq_handler+0x1c/0x110
[  223.290664] sp : ffff000000eb3d50
[  223.290666] x29: ffff000000eb3d50 x28: ffff800008089000 x27: ffff000001ccfb10
[  223.290676] x26: ffff000001ccfadc x25: ffff000000070000 x24: ffff000001daee00
[  223.290684] x23: ffff000001ccfa00 x22: ffff0000015d6c00 x21: 0000000000000001
[  223.290691] x20: ffff000001ccfa00 x19: 0000000000000000 x18: ffff8000091ee000
[  223.290698] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000003c
[  223.290705] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000000
[  223.290712] x11: ffff000001ccf680 x10: ffff8000091ee000 x9 : ffff8000091eead8
[  223.290720] x8 : 0000000000000000 x7 : ffff000001ccf680 x6 : ffffffffffffffff
[  223.290727] x5 : ffff00007f668808 x4 : 0000000000000000 x3 : ffff80007674e000
[  223.290734] x2 : ffff80000857fec0 x1 : ffff800009413000 x0 : 0000000000000000
[  223.290745] Call trace:
[  223.290747]  dispc_read_and_clear_irqstatus+0x58/0x1f0
[  223.290754]  tidss_irq_handler+0x1c/0x110
[  223.290760]  irq_forced_thread_fn+0x38/0xb0
[  223.290766]  irq_thread+0x12c/0x1d0
[  223.290770]  kthread+0x120/0x12c
[  223.290778]  ret_from_fork+0x10/0x20
[  286.293578] rcu: INFO: rcu_preempt self-detected stall on CPU
[  286.293595] rcu: 	0-....: (2 GPs behind) idle=9104/1/0x4000000000000000 softirq=0/0 fqs=60579 rcuc=273119 jiffies(starved)
[  286.293606] 	(t=273012 jiffies g=6881 q=1589 ncpus=4)
[  286.293615] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[  286.293622] Hardware name: Texas Instruments AM625 SK (DT)
[  286.293627] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  286.293637] pc : dispc_k3_clear_irqstatus+0x188/0x1a0
[  286.293650] lr : dispc_read_and_clear_irqstatus+0xfc/0x1f0
[  286.293656] sp : ffff000000eb3d40
[  286.293658] x29: ffff000000eb3d40 x28: ffff800008089000 x27: ffff000001ccfb10
[  286.293667] x26: ffff000001ccfadc x25: ffff000000070000 x24: ffff000001daee00
[  286.293675] x23: ffff000001ccfa00 x22: ffff0000015d6c00 x21: 0000000000000001
[  286.293683] x20: ffff000001ccfa00 x19: 0000000000000000 x18: ffff8000091ee000
[  286.293690] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000003c
[  286.293697] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000000
[  286.293707] x11: ffff000001ccf680 x10: ffff8000091ee000 x9 : 0000000000000000
[  286.293715] x8 : ffff800008b594e8 x7 : 000000000000002b x6 : ffffffffffffffff
[  286.293722] x5 : 0000000000000015 x4 : 00000000003fffff x3 : 0000000000000002
[  286.293729] x2 : 000000000000002c x1 : 000000000000002c x0 : 0000000000000002
[  286.293737] Call trace:
[  286.293740]  dispc_k3_clear_irqstatus+0x188/0x1a0
[  286.293747]  dispc_read_and_clear_irqstatus+0xfc/0x1f0
[  286.293753]  tidss_irq_handler+0x1c/0x110
[  286.293758]  irq_forced_thread_fn+0x38/0xb0
[  286.293764]  irq_thread+0x12c/0x1d0
[  286.293769]  kthread+0x120/0x12c
[  286.293776]  ret_from_fork+0x10/0x20
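
For readers skimming the trace, the repeated reports can be condensed with a short script. The sketch below is a minimal parser, assuming the dmesg line layout shown above. The function name `parse_stalls` and the embedded `SAMPLE` excerpt (copied from the first and last reports in the log) are illustrative only and are not part of any TI or kernel tool.

```python
import re

# Header of one report, e.g.
# "[   34.281580] rcu: INFO: rcu_preempt self-detected stall on CPU"
STALL_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] rcu: INFO: rcu_preempt self-detected stall on CPU")
# "(t=21000 jiffies g=6881 q=1364 ncpus=4)"
DETAIL_RE = re.compile(r"\(t=(\d+) jiffies g=(-?\d+) q=(\d+) ncpus=(\d+)\)")
# "CPU: 0 PID: 137 Comm: irq/289-tidss ..."
TASK_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")
# "pc : _raw_spin_unlock_irq+0x18/0x60" (symbol only, offset dropped)
PC_RE = re.compile(r"\] pc : ([^\s+]+)")


def parse_stalls(text):
    """Return one dict per RCU stall report found in a dmesg dump."""
    stalls = []
    for line in text.splitlines():
        header = STALL_RE.search(line)
        if header:
            stalls.append({"time_s": float(header.group(1))})
            continue
        if not stalls:
            continue  # ignore anything before the first report
        current = stalls[-1]
        detail = DETAIL_RE.search(line)
        task = TASK_RE.search(line)
        pc = PC_RE.search(line)
        if detail:
            current["t_jiffies"] = int(detail.group(1))
            current["grace_period"] = int(detail.group(2))
        elif task:
            current["cpu"] = int(task.group(1))
            current["pid"] = int(task.group(2))
            current["comm"] = task.group(3)
        elif pc:
            current["pc"] = pc.group(1)
    return stalls


# Excerpt of the log above (first and last report, key lines only).
SAMPLE = """\
[   34.281580] rcu: INFO: rcu_preempt self-detected stall on CPU
[   34.281624] (t=21000 jiffies g=6881 q=1364 ncpus=4)
[   34.281637] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[   34.281654] pc : _raw_spin_unlock_irq+0x18/0x60
[  286.293578] rcu: INFO: rcu_preempt self-detected stall on CPU
[  286.293606] (t=273012 jiffies g=6881 q=1589 ncpus=4)
[  286.293615] CPU: 0 PID: 137 Comm: irq/289-tidss Tainted: G           O       6.1.46-rt13-BSP_12.4--g17da321871 #1
[  286.293637] pc : dispc_k3_clear_irqstatus+0x188/0x1a0
"""

for s in parse_stalls(SAMPLE):
    print(f'{s["time_s"]:>10.3f}s  t={s["t_jiffies"]:>6}  '
          f'g={s["grace_period"]}  {s["comm"]}  pc={s["pc"]}')
```

On this excerpt, both reports name the same thread (`irq/289-tidss`, PID 137) and the same grace-period number (g=6881), while `t` grows from 21000 to 273012 jiffies. That pattern suggests the tidss IRQ thread keeps CPU 0 busy continuously between reports rather than stalling intermittently.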

Full log: e2e.ti.com/.../dmesg_5F00_error.txt


Please help check whether this error is caused by tidss.


Thanks!
Allen


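  • We are currently planning to use both the clear-IRQ-status patch and the clear-IRQ-unconditionally patch, because that combination is ultra reliable.  Granted, clearing the IRQ unconditionally is enough on its own to eliminate the possibility of an infinite IRQ storm.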
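
To illustrate why an unconditional clear rules out an endless storm, here is a toy model in Python. It assumes a hypothetical two-level, level-triggered interrupt block in which a top-level status bit can be set while the matching sub-block status reads zero. The class and function names are invented for this sketch; it is not the tidss driver code and does not reflect the real DSS register map.

```python
class ToyIrqBlock:
    """Hypothetical two-level interrupt block (not the DSS register map)."""

    def __init__(self, top_status, sub_status):
        self.top_status = top_status        # one bit per sub-block
        self.sub_status = list(sub_status)  # event bits of each sub-block

    def line_asserted(self):
        # Level-triggered: the line stays high while any top-level bit is set.
        return self.top_status != 0


def handle_conditional(block):
    """Clear a top-level bit only if its sub-block reported an event."""
    for i, events in enumerate(block.sub_status):
        if events:
            block.sub_status[i] = 0
            block.top_status &= ~(1 << i)


def handle_unconditional(block):
    """Clear sub-block events, then clear every top-level bit that was read."""
    seen = block.top_status
    for i in range(len(block.sub_status)):
        block.sub_status[i] = 0
    block.top_status &= ~seen


def count_handler_runs(handler, block, limit=1000):
    """Re-run the handler while the line is asserted, up to a safety limit."""
    runs = 0
    while block.line_asserted() and runs < limit:
        handler(block)
        runs += 1
    return runs


# Assumed failure case: top-level bit 0 is set, but sub-block 0 reports nothing.
stuck = count_handler_runs(handle_conditional, ToyIrqBlock(0b1, [0]))
fixed = count_handler_runs(handle_unconditional, ToyIrqBlock(0b1, [0]))
print("conditional clear, handler runs:", stuck)
print("unconditional clear, handler runs:", fixed)
```

In the model, the conditional handler is re-entered until the safety limit because nothing ever clears the stuck top-level bit, while the unconditional handler returns the line to idle after a single run.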

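    One of the units has hit 2 rcu_preempt stalls in ~6200 test runs so far.  The other has hit 0 in ~5200.  However, the test may need to run longer before the stall shows up on both units.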
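
A rough way to judge how long the test has to run: the sketch below assumes every test run is independent and that both units share the same per-run stall probability, estimated from the 2 stalls in ~6200 runs quoted above. With only two events that estimate is very uncertain, so treat the results as order-of-magnitude guidance. The function names are illustrative.

```python
import math


def prob_no_stall(runs, stalls_seen, runs_seen):
    """Chance of a clean result over `runs` runs at the observed stall rate."""
    rate = stalls_seen / runs_seen
    return (1.0 - rate) ** runs


def runs_for_detection(stalls_seen, runs_seen, confidence=0.95):
    """Runs needed to see at least one stall with the given probability."""
    rate = stalls_seen / runs_seen
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


p_clean = prob_no_stall(5200, 2, 6200)
needed = runs_for_detection(2, 6200)
print(f"P(0 stalls in 5200 runs) = {p_clean:.3f}")
print(f"runs for a 95% chance of at least one stall = {needed}")
```

Under these assumptions, a clean 5200-run result on the second unit has a probability of roughly 19%, so it does not show that the unit is unaffected, and a single unit would need on the order of 9300 runs for a 95% chance of showing at least one stall.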

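  • Hi Jonathan,

    I just wanted to check whether there is any update on the results.

    Also, please note that we have posted the first two patches at https://lore.kernel.org/all/20241012150710.261767-1-devarsht@ti.com/ since they look correct to me and help improve the results. For the third patch (the clear-IRQ-unconditionally patch), I will check with the maintainers whether it can be posted as well.

    Regards,

    Devarsh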

  • OK, but the third patch is the only one that actually prevents this issue.  The other two at best only reduce the chance of it happening. So I am not satisfied with the patches that were submitted.

  • The thread owner is currently out of office; please expect a reply in the second half of this week.

  • Circling back on this.  There is an updated patch set submission, and it looks good.
    "[PATCH 0/7] drm/tidss: Interrupt fixes and cleanups" lore.kernel.org/.../