
TDA4VE-Q1: The custom vision application causes the system to freeze

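Note: this content was machine translated and may contain grammatical or other translation errors; it is provided for reference only. For the exact content, refer to the English original at the link below.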

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1503747/tda4ve-q1-the-custom-vision-application-causes-the-system-to-freeze

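Part Number: TDA4VE-Q1

Tool/software:

Dear TI experts,

We developed a custom vision application that performs object detection and surround view simultaneously on the TDA4VE in Linux + RTOS mode (using SDK r10.1).

However, the system occasionally freezes under heavy load.

The application runs normally for several hours before it freezes.

Graph: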

GRAPH: app_iavm_graph (#nodes =   8, #executions = 135698)
 NODE:       CAPTURE1:             capture_node: avg =  32845 usecs, min/max =  12580 /  62151 usecs, #executions =     135698
 NODE:      VPAC_LDC1:                 ldc_node: avg =  14352 usecs, min/max =  14261 /  14774 usecs, #executions =     135698
 NODE:          MPU-0:          OpenGL_SRV_Node: avg =  11439 usecs, min/max =  10431 /  43922 usecs, #executions =     135698
 NODE:       DISPLAY1:            Display_node1: avg =   8769 usecs, min/max =     63 /  17056 usecs, #executions =     135698
 NODE:      VPAC_MSC1:              scaler_node: avg =  21950 usecs, min/max =  21755 /  22701 usecs, #executions =     135698
 NODE:       DISPLAY2:            Display_node2: avg =  15096 usecs, min/max =     72 /  16745 usecs, #executions =     135698
 NODE:          DSP-1:              PreProcNode: avg =   7129 usecs, min/max =   6491 /   7770 usecs, #executions =     135698
 NODE:       DSP_C7-1:                tidl_node: avg =  23034 usecs, min/max =  20931 /  23742 usecs, #executions =     135698

GRAPH: app_iavm_graph_gpu_lut (#nodes =   2, #executions =      1)
 NODE:          DSP-1:                 node_202: avg =    351 usecs, min/max =    351 /    351 usecs, #executions =          1
 NODE:          DSP-1:                 node_203: avg =  10805 usecs, min/max =  10805 /  10805 usecs, #executions =          1

GRAPH: app_iavm_graph_disp_ovl (#nodes =   1, #executions = 135694)
 NODE:       DISPLAY2:            Display_node2: avg =  14212 usecs, min/max =     70 /  36725 usecs, #executions =     135694

 PERF:           FILEIO: avg =      0 usecs, min/max = 4294967295 /      0 usecs, #executions =          0
 PERF:            TOTAL: avg =  33330 usecs, min/max =  31842 /  34893 usecs, #executions =       7325

 PERF:            TOTAL:   30. 0 FPS
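
As a rough cross-check of the statistics above, each node's timing can be compared against the frame period implied by 30 FPS (about 33.3 ms). The following is a minimal sketch, not part of the original post: the function names `parse_nodes` and `over_budget` are hypothetical, and the regular expression assumes the `NODE:` line layout shown in the report. A maximum above one frame period does not by itself explain a freeze; for the capture node in particular, the time probably includes waiting for the next camera frame.

```python
import re

# Matches the NODE lines of the performance report shown above.
NODE_RE = re.compile(
    r"NODE:\s*(?P<target>\S+):\s*(?P<name>\S+):\s*avg =\s*(?P<avg>\d+) usecs, "
    r"min/max =\s*(?P<min>\d+)\s*/\s*(?P<max>\d+) usecs"
)


def parse_nodes(report):
    """Return one dict per NODE line, with avg/min/max in microseconds."""
    nodes = []
    for m in NODE_RE.finditer(report):
        nodes.append({
            "target": m.group("target"),
            "name": m.group("name"),
            "avg": int(m.group("avg")),
            "min": int(m.group("min")),
            "max": int(m.group("max")),
        })
    return nodes


def over_budget(nodes, fps=30.0):
    """Names of nodes whose worst-case time exceeds one frame period."""
    period_us = 1_000_000 / fps
    return [n["name"] for n in nodes if n["max"] > period_us]


# A few lines copied from the report above.
SAMPLE = """\
 NODE:       CAPTURE1:             capture_node: avg =  32845 usecs, min/max =  12580 /  62151 usecs, #executions =     135698
 NODE:      VPAC_LDC1:                 ldc_node: avg =  14352 usecs, min/max =  14261 /  14774 usecs, #executions =     135698
 NODE:          MPU-0:          OpenGL_SRV_Node: avg =  11439 usecs, min/max =  10431 /  43922 usecs, #executions =     135698
 NODE:       DSP_C7-1:                tidl_node: avg =  23034 usecs, min/max =  20931 /  23742 usecs, #executions =     135698
"""

if __name__ == "__main__":
    nodes = parse_nodes(SAMPLE)
    for n in nodes:
        print(f'{n["name"]:>16}: avg {n["avg"] / 1000:.1f} ms, max {n["max"] / 1000:.1f} ms')
    print("max above one frame period:", over_budget(nodes))
```

Running the same parser on reports taken at intervals during a long run would show whether any node's maximum drifts upward before a freeze.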

Resource load:

Summary of CPU load,
====================

CPU: mpu1_0: TOTAL LOAD =  34. 5 % ( HWI =   1. 5 %, SWI =   0.26 % )
CPU: mcu2_0: TOTAL LOAD =  14. 0 % ( HWI =   0. 0 %, SWI =   0. 0 % )
CPU: mcu2_1: TOTAL LOAD =   1. 0 % ( HWI =   0. 0 %, SWI =   0. 0 % )
CPU:  c7x_1: TOTAL LOAD =  70. 0 % ( HWI =   0. 0 %, SWI =   0. 0 % )
CPU:  c7x_2: TOTAL LOAD =  22. 0 % ( HWI =   0. 0 %, SWI =   0. 0 % )


HWA performance statistics,
===========================

HWA:   LDC : LOAD =  42.58 % ( 294 MP/s )
HWA:   MSC0: LOAD =  65.25 % ( 294 MP/s )
HWA:   GPU : LOAD =  33.27 % ( 62 MP/s )


DDR performance statistics,
===========================

DDR: READ  BW: AVG =   4646 MB/s, PEAK =  25865 MB/s
DDR: WRITE BW: AVG =   2640 MB/s, PEAK =  13105 MB/s
DDR: TOTAL BW: AVG =   7286 MB/s, PEAK =  38970 MB/s


Detailed CPU performance/memory statistics,
===========================================

  4565.788684 s: DDR_SHARED_MEM: Alloc's: 152 alloc's of 309991564 bytes
  4565.788695 s: DDR_SHARED_MEM: Free's : 1 free's  of 36 bytes
  4565.788702 s: DDR_SHARED_MEM: Open's : 151 allocs  of 309991528 bytes

CPU: mcu2_0: TASK:      FREERTOS_TA:   0. 0 %
CPU: mcu2_0: TASK:           IPC_RX:   0.43 %
CPU: mcu2_0: TASK:       REMOTE_SRV:   0. 0 %
CPU: mcu2_0: TASK:        LOAD_TEST:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CPU_0:   0. 0 %
CPU: mcu2_0: TASK:        TIVX_V1NF:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_V1LDC:   3.84 %
CPU: mcu2_0: TASK:      TIVX_V1MSC1:   4.99 %
CPU: mcu2_0: TASK:      TIVX_V1MSC2:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_V1VISS:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT1:   1.92 %
CPU: mcu2_0: TASK:       TIVX_CAPT2:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT3:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT4:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT5:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT6:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT7:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_CAPT8:   0. 0 %
CPU: mcu2_0: TASK:       TIVX_DISP1:   0.62 %
CPU: mcu2_0: TASK:       TIVX_DISP2:   2.13 %
CPU: mcu2_0: TASK:       TIVX_CSITX:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_CSITX2:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_DPM2M1:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_DPM2M2:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_DPM2M3:   0. 0 %
CPU: mcu2_0: TASK:      TIVX_DPM2M4:   0. 0 %
CPU: mcu2_0: TASK:      IPC_TEST_RX:   0. 0 %
CPU: mcu2_0: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_0: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_0: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_0: TASK:      IPC_TEST_TX:   0. 0 %

CPU: mcu2_0: HEAP:    DDR_LOCAL_MEM: size =   14680064 B, free =   14591744 B ( 99 % unused)
CPU: mcu2_0: HEAP:           L3_MEM: size =     524288 B, free =     524032 B ( 99 % unused)
CPU: mcu2_0: HEAP:  DDR_CACHE_WT_ME: size =    2097152 B, free =    2096896 B ( 99 % unused)

CPU: mcu2_1: TASK:      FREERTOS_TA:   0. 0 %
CPU: mcu2_1: TASK:           IPC_RX:   0. 0 %
CPU: mcu2_1: TASK:       REMOTE_SRV:   0. 0 %
CPU: mcu2_1: TASK:        LOAD_TEST:   0. 0 %
CPU: mcu2_1: TASK:       TIVX_CPU_1:   0. 0 %
CPU: mcu2_1: TASK:         TIVX_SDE:   0. 0 %
CPU: mcu2_1: TASK:         TIVX_DOF:   0. 0 %
CPU: mcu2_1: TASK:      IPC_TEST_RX:   0. 0 %
CPU: mcu2_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU: mcu2_1: TASK:      IPC_TEST_TX:   0. 0 %

CPU: mcu2_1: HEAP:    DDR_LOCAL_MEM: size =   16777216 B, free =   16773120 B ( 99 % unused)
CPU: mcu2_1: HEAP:           L3_MEM: size =     524288 B, free =     524288 B (100 % unused)

CPU:  c7x_1: TASK:      FREERTOS_TA:   0. 0 %
CPU:  c7x_1: TASK:           IPC_RX:   0. 5 %
CPU:  c7x_1: TASK:       REMOTE_SRV:   0. 0 %
CPU:  c7x_1: TASK:        LOAD_TEST:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P1:  69.68 %
CPU:  c7x_1: TASK:      TIVX_C71_P2:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P3:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P4:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P5:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P6:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P7:   0. 0 %
CPU:  c7x_1: TASK:      TIVX_C71_P8:   0. 0 %
CPU:  c7x_1: TASK:      IPC_TEST_RX:   0. 0 %
CPU:  c7x_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_1: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_1: TASK:      IPC_TEST_TX:   0. 0 %

CPU:  c7x_1: HEAP:    DDR_LOCAL_MEM: size =  268435456 B, free =  215982080 B ( 80 % unused)
CPU:  c7x_1: HEAP:           L3_MEM: size =    3964928 B, free =          0 B (  0 % unused)
CPU:  c7x_1: HEAP:           L2_MEM: size =     458752 B, free =          0 B (  0 % unused)
CPU:  c7x_1: HEAP:           L1_MEM: size =      16384 B, free =          0 B (  0 % unused)
CPU:  c7x_1: HEAP:  DDR_SCRATCH_MEM: size =  385875968 B, free =  383435737 B ( 99 % unused)

CPU:  c7x_2: TASK:      FREERTOS_TA:   0. 0 %
CPU:  c7x_2: TASK:           IPC_RX:   0. 5 %
CPU:  c7x_2: TASK:       REMOTE_SRV:   0. 0 %
CPU:  c7x_2: TASK:        LOAD_TEST:   0. 0 %
CPU:  c7x_2: TASK:         TIVX_CPU:  21.23 %
CPU:  c7x_2: TASK:      IPC_TEST_RX:   0. 0 %
CPU:  c7x_2: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_2: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_2: TASK:      IPC_TEST_TX:   0. 0 %
CPU:  c7x_2: TASK:      IPC_TEST_TX:   0. 0 %

CPU:  c7x_2: HEAP:    DDR_LOCAL_MEM: size =   16777216 B, free =   16767488 B ( 99 % unused)
CPU:  c7x_2: HEAP:           L2_MEM: size =     458752 B, free =     458752 B (100 % unused)
CPU:  c7x_2: HEAP:           L1_MEM: size =      16384 B, free =      16384 B (100 % unused)
CPU:  c7x_2: HEAP:  DDR_SCRATCH_MEM: size =   67108864 B, free =   67108864 B (100 % unused)
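
The heap lines above can be summarised mechanically to see which heaps report no free space. This is a minimal sketch under the assumption that the `HEAP:` lines keep the layout shown in the report; `heap_usage` and `exhausted` are hypothetical helper names. Whether a fully used on-chip heap is expected for this application (for example, because it is handed to the inference node as working memory) is not something the log alone can tell.

```python
import re

# Matches the HEAP lines of the detailed statistics shown above.
HEAP_RE = re.compile(
    r"CPU:\s*(?P<cpu>\S+):\s*HEAP:\s*(?P<heap>\S+):\s*size =\s*(?P<size>\d+) B, "
    r"free =\s*(?P<free>\d+) B"
)


def heap_usage(report):
    """Return (cpu, heap, size_bytes, free_bytes) for every HEAP line."""
    rows = []
    for m in HEAP_RE.finditer(report):
        rows.append((m.group("cpu"), m.group("heap"),
                     int(m.group("size")), int(m.group("free"))))
    return rows


def exhausted(rows):
    """Return (cpu, heap) pairs that report zero free bytes."""
    return [(cpu, heap) for cpu, heap, _size, free in rows if free == 0]


# A few lines copied from the report above.
SAMPLE = """\
CPU:  c7x_1: HEAP:    DDR_LOCAL_MEM: size =  268435456 B, free =  215982080 B ( 80 % unused)
CPU:  c7x_1: HEAP:           L3_MEM: size =    3964928 B, free =          0 B (  0 % unused)
CPU:  c7x_1: HEAP:           L2_MEM: size =     458752 B, free =          0 B (  0 % unused)
CPU:  c7x_2: HEAP:           L2_MEM: size =     458752 B, free =     458752 B (100 % unused)
"""

if __name__ == "__main__":
    rows = heap_usage(SAMPLE)
    for cpu, heap, size, free in rows:
        used_pct = 100.0 * (size - free) / size
        print(f"{cpu:>6} {heap:<16} used {used_pct:5.1f} %")
    print("heaps with no free space:", exhausted(rows))
```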

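Once the freeze occurs, the UART disconnects and becomes unresponsive, JTAG cannot connect, and the power LEDs for the MAIN and MCU domains (LD5 and LD6) turn off.

Before the system freezes, there are no messages on the console indicating a segmentation fault, kernel panic, or application error/warning, apart from the following: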

 p: Print performance statistics

 0-3: Camera switch

 5: Camera auto switch

 q: 2D View

 w: 3D View

 x: Exit

 Enter Choice:
 
[21542.579784] audit: type=1334 audit(1744934406.656:29): prog-id=24 op=LOAD
[21542.694036] audit: type=1334 audit(1744934406.768:30): prog-id=24 op=UNLOAD
[26852.389441] tps6594 0-0048: Error IRQ trap reach ilim, overcurrent for BUCK1
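
Because the console is lost once the freeze happens, it can help to capture the serial console to a file on the host and scan it afterwards for PMIC messages and their timing. The following is a minimal sketch, assuming the kernel prints the usual `[seconds.microseconds]` prefix seen above; `pmic_events` is a hypothetical helper name, and the sample text is copied from the log in this post.

```python
import re

# Kernel messages start with a "[seconds.microseconds]" uptime stamp.
KMSG_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")


def pmic_events(console_text, pattern="tps6594"):
    """Return (events, last_ts).

    events  -- list of (timestamp, message) for kernel lines containing `pattern`
    last_ts -- timestamp of the last kernel line in the capture, or None
    """
    events = []
    last_ts = None
    for line in console_text.splitlines():
        m = KMSG_RE.match(line.strip())
        if not m:
            continue  # not a timestamped kernel line (e.g. the application menu)
        ts = float(m.group("ts"))
        last_ts = ts
        if pattern in m.group("msg"):
            events.append((ts, m.group("msg")))
    return events, last_ts


SAMPLE = """\
 Enter Choice:
[21542.579784] audit: type=1334 audit(1744934406.656:29): prog-id=24 op=LOAD
[21542.694036] audit: type=1334 audit(1744934406.768:30): prog-id=24 op=UNLOAD
[26852.389441] tps6594 0-0048: Error IRQ trap reach ilim, overcurrent for BUCK1
"""

if __name__ == "__main__":
    events, last_ts = pmic_events(SAMPLE)
    for ts, msg in events:
        print(f"{ts:.6f} s ({ts / 3600:.2f} h of uptime): {msg}")
        print(f"  kernel output continued for {last_ts - ts:.3f} s after this event")
```

In the capture above, the overcurrent message is the last kernel line, so the gap is zero. A capture in which other kernel output follows the PMIC message would suggest the system kept running for some time after the event.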

Could you guide us on how to debug this issue further?

Thanks

Best regards,
Christopher

  • Hello,

    It is a long weekend here, so please expect a response early next week.

    Regards,
    Sudheer

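  • Hello,

    I don't see anything in the above log that explains why the system might stop working. I think the PMIC error "[26852.389441] tps6594 0-0048: Error IRQ trap reach ilim, overcurrent for BUCK1" is seen earlier, before the freeze, isn't it?

    Most likely some corruption or some invalid access is causing the entire system to stop, so we need more details.

    - How easy is it to reproduce this issue?

    - What is the custom vision application? Which components does it include?

    - What is the dataflow?

    - How long does it run fine?

    Regards,

    Brijesh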

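  • Hello Brijesh,

    Thank you for your feedback.

    Yes, the PMIC error "[26852.389441] tps6594 0-0048: Error IRQ trap reach ilim, overcurrent for BUCK1" appears before the system (and the UART) freezes.

    Please see the answers to your questions below:

    1. The issue is not easy to reproduce; the application can run normally for several hours to several days.

    2. The custom vision application runs object detection (TIDL: yolov5s 640x416) and surround view simultaneously, and the display uses three pipelines for different layers. Please see the graph structure: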

    GRAPH: app_iavm_graph (#nodes =   8, #executions = 135698)
     NODE:       CAPTURE1:             capture_node: avg =  32845 usecs, min/max =  12580 /  62151 usecs, #executions =     135698
     NODE:      VPAC_LDC1:                 ldc_node: avg =  14352 usecs, min/max =  14261 /  14774 usecs, #executions =     135698
     NODE:          MPU-0:          OpenGL_SRV_Node: avg =  11439 usecs, min/max =  10431 /  43922 usecs, #executions =     135698
     NODE:       DISPLAY1:            Display_node1: avg =   8769 usecs, min/max =     63 /  17056 usecs, #executions =     135698
     NODE:      VPAC_MSC1:              scaler_node: avg =  21950 usecs, min/max =  21755 /  22701 usecs, #executions =     135698
     NODE:       DISPLAY2:            Display_node2: avg =  15096 usecs, min/max =     72 /  16745 usecs, #executions =     135698
     NODE:          DSP-1:              PreProcNode: avg =   7129 usecs, min/max =   6491 /   7770 usecs, #executions =     135698
     NODE:       DSP_C7-1:                tidl_node: avg =  23034 usecs, min/max =  20931 /  23742 usecs, #executions =     135698
    
    GRAPH: app_iavm_graph_gpu_lut (#nodes =   2, #executions =      1)
     NODE:          DSP-1:                 node_202: avg =    351 usecs, min/max =    351 /    351 usecs, #executions =          1
     NODE:          DSP-1:                 node_203: avg =  10805 usecs, min/max =  10805 /  10805 usecs, #executions =          1
    
    GRAPH: app_iavm_graph_disp_ovl (#nodes =   1, #executions = 135694)
     NODE:       DISPLAY2:            Display_node2: avg =  14212 usecs, min/max =     70 /  36725 usecs, #executions =     135694
    
     PERF:           FILEIO: avg =      0 usecs, min/max = 4294967295 /      0 usecs, #executions =          0
     PERF:            TOTAL: avg =  33330 usecs, min/max =  31842 /  34893 usecs, #executions =       7325
    
     PERF:            TOTAL:   30. 0 FPS

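    Please also refer to the resource load information in the original post.

    3. The dataflow is as follows: (diagram not included in this copy)

    4. As mentioned in point 1, it sometimes runs fine for several days and sometimes freezes after only a few hours.

    Thanks.

    Best regards,

    Christopher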

  • Hello Christopher,

    Again, it is very difficult to find the problem with this information. I suggest removing nodes one at a time to see whether a particular node is affecting the overall graph.

    Regards,

    Brijesh

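  • Hello Brijesh,

    We also ran the following experiment:

    1. Add the SRV graph to the tidl_od_cam application; the result is called tidl_od_cam_srv.

    2. Run only the graphs, without any additional processing.

    3. Run tidl_od_cam_srv together with stress-ng.

    In this setup, the system froze after running for a few hours.

    We therefore suspect that the issue is caused by high system load. If we remove nodes to reduce the load, the issue may no longer reproduce.

    When the system freezes, neither UART nor JTAG responds, and the LD5/LD6 indicators turn off. Is there a way to make an initial determination of whether this is a software or a hardware issue? For example, could our EVM be damaged?

    Also, does the PMIC error message indicate that the system may reset later?

    Thanks,

    Best regards,

    Christopher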