The app_tidl_od demo on the J784S4XEVM has already been ported to the YOLOv5 model, as shown in the figure.
Now I want to apply pipeline optimization to this model. How should I go about it?
Sorry for the late reply; please see the latest response from the E2E engineer below.
Firstly, on the 8.6 SDK the TIDL node supports only single-core inference, so a single node cannot be split to run across multiple C7x core instances (it is a 1-to-1 dependency).
Here is a brief summary of the C7x core loads: the TIDL OpenVX node runs on the first C7x core, which accounts for the 31% load you see there; the post-processing OpenVX node runs on the second C7x core, which accounts for its 17% load.
Now, on our latest 9.0 release, TIDL layers support batch processing: if multiple batches are available for model inference, each batch can be scheduled to run on its own C7x core instance. The TDA4VH SoC has 4 C7x cores, so all 4 cores can be utilized simultaneously when 4 or more batches are available.
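To make the batch-to-core mapping concrete, here is a minimal conceptual sketch of the scheduling idea (this is an illustration only, not the actual TIDL or TIOVX API; the names `NUM_C7X_CORES` and `schedule_batches` are hypothetical):

```python
# Conceptual model of how TIDL 9.0 batch processing can spread batches
# across C7x cores: each batch is assigned to a core round-robin.
# Hypothetical names; not part of any TI SDK API.

NUM_C7X_CORES = 4  # TDA4VH has 4 C7x cores

def schedule_batches(num_batches, num_cores=NUM_C7X_CORES):
    """Map each batch index to a core index, round-robin."""
    return {b: b % num_cores for b in range(num_batches)}

# With 4 or more batches, every core gets work; with fewer,
# some cores stay idle (which is why batch count matters).
print(schedule_batches(4))  # {0: 0, 1: 1, 2: 2, 3: 3}
print(schedule_batches(2))  # {0: 0, 1: 1} -- cores 2 and 3 idle
```

This mirrors the statement above: 4 cores are fully utilized only when at least 4 batches are available per inference call.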
Furthermore, if you have a single-batch instance and still want to use all of the C7x cores, that support will be added in the 9.1 release, which will bring model-inference-level parallelism.