The app_tidl_od demo on the J784S4XEVM has already been ported to the YOLOv5 model, as shown in the figure.
Now I want to apply pipeline optimization to this model. How should I go about it?
Sorry for the late reply; please see the latest response from the E2E engineer below.
Firstly, on the 8.6 SDK the TIDL node supports only single-core inference, so a single node cannot be split to run across multiple C7x core instances (it is a 1-to-1 dependency).
Here is a brief summary of the C7x core loads: the TIDL OpenVX node runs on the first C7x core, which accounts for the 31% load you see there; the post-processing OpenVX node runs on the second C7x core, which accounts for its 17% load.
Now, on our latest 9.0 release, TIDL layers support batch processing: if multiple batches are available for model inference, each batch can be scheduled to run on its own C7x core instance. The TDA4VH SoC has 4 C7x cores, so all 4 cores can be utilized simultaneously when 4 or more batches are available.
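To make the batch-to-core mapping concrete, here is a minimal conceptual sketch of the scheduling idea (this is an illustration only, not the actual TIDL or TIOVX API; the names `NUM_C7X_CORES` and `schedule_batches` are hypothetical):

```python
# Conceptual model of how TIDL 9.0 batch processing can spread batches
# across C7x cores: each batch is assigned to a core round-robin.
# Hypothetical names; not part of any TI SDK API.

NUM_C7X_CORES = 4  # TDA4VH has 4 C7x cores

def schedule_batches(num_batches, num_cores=NUM_C7X_CORES):
    """Map each batch index to a core index, round-robin."""
    return {b: b % num_cores for b in range(num_batches)}

# With 4 or more batches, every core gets work; with fewer,
# some cores stay idle (which is why batch count matters).
print(schedule_batches(4))  # {0: 0, 1: 1, 2: 2, 3: 3}
print(schedule_batches(2))  # {0: 0, 1: 1} -- cores 2 and 3 idle
```

This mirrors the statement above: 4 cores are fully utilized only when at least 4 batches are available per inference call.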
Furthermore, if you have a single-batch instance and still want to use all of the C7x cores, that support will be added in the 9.1 release, which will bring model-inference-level parallelism.