How do I measure the throughput of the FFTC coprocessor on the DSP 6670, and how can the cyclic prefix be removed efficiently?



1) The FFTC user guide says that when cyclic prefix removal is enabled, one packet can carry only one block. That greatly reduces FFT processing speed. Is there a good way around this?

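2) According to the description of the FFTC throughput test for the DSP 6670, 10 or 20 packets are needed to get the FFTC fully working. How is that done, and is there a concrete example? On the evm6670, a single packet takes just over 7000 cycles in my test. I used the example provided by TI: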

C:\ti\pdk_C6670_1_1_2_6\packages\ti\drv\fftc\example\
FFTC_Multicore_exampleProject


[C66xx_2] **************************************************
[C66xx_2] ******** FFTC Multi Core Example Start **********
[C66xx_2] **************************************************
[C66xx_2] Core 2 : L1D cache size 7. L2 cache size 0.
[C66xx_2] [Core 2]: Waiting for Sys Init to be completed ...
[C66xx_3] **************************************************
[C66xx_3] ******** FFTC Multi Core Example Start **********
[C66xx_3] **************************************************
[C66xx_3] Core 3 : L1D cache size 7. L2 cache size 0.
[C66xx_3] [Core 3]: Waiting for Sys Init to be completed ...
[C66xx_0] **************************************************
[C66xx_1] **************************************************
[C66xx_0] ******** FFTC Multi Core Example Start **********
[C66xx_1] ******** FFTC Multi Core Example Start **********
[C66xx_0] **************************************************
[C66xx_1] **************************************************
[C66xx_0] Core 0 : L1D cache size 7. L2 cache size 0.
[C66xx_1] Core 1 : L1D cache size 7. L2 cache size 0.
[C66xx_0] [Core 0]: FFTC instance 0 successfully initialized
[C66xx_1] [Core 1]: Waiting for Sys Init to be completed ...
[C66xx_0] [Core 0]: FFTC successfully opened
[C66xx_1] [Core 1]: FFTC successfully opened
[C66xx_0] --------------------------------------------
[C66xx_1] --------------------------------------------
[C66xx_0] FFTC-CPPI Example START on Core 0
[C66xx_1] FFTC-CPPI Example START on Core 1
[C66xx_0] Sample Size: 16
[C66xx_1] Sample Size: 16
[C66xx_0] Number of Blocks: 5
[C66xx_1] Number of Blocks: 5
[C66xx_0] Tx Queue: 0
[C66xx_1] Tx Queue: 0
[C66xx_0] Descriptor Type: Host
[C66xx_1] Descriptor Type: Host
[C66xx_0] --------------------------------------------
[C66xx_1] --------------------------------------------
[C66xx_0] [Core 0]: Rx flow 0 opened successfully using Rx queue 708
[C66xx_1] [Core 1]: Rx flow 1 opened successfully using Rx queue 709
[C66xx_2] [Core 2]: FFTC successfully opened
[C66xx_3] [Core 3]: FFTC successfully opened
[C66xx_2] --------------------------------------------
[C66xx_3] --------------------------------------------
[C66xx_2] FFTC-CPPI Example START on Core 2
[C66xx_3] FFTC-CPPI Example START on Core 3
[C66xx_2] Sample Size: 16
[C66xx_3] Sample Size: 16
[C66xx_2] Number of Blocks: 5
[C66xx_3] Number of Blocks: 5
[C66xx_2] Tx Queue: 0
[C66xx_3] Tx Queue: 0
[C66xx_2] Descriptor Type: Host
[C66xx_3] Descriptor Type: Host
[C66xx_2] --------------------------------------------
[C66xx_3] --------------------------------------------
[C66xx_2] [Core 2]: Rx flow 2 opened successfully using Rx queue 710
[C66xx_3] [Core 3]: Rx flow 3 opened successfully using Rx queue 711
[C66xx_3]
[C66xx_3] [Core 3]: Submitting FFT Request ...
[C66xx_3] [Core 3]: Submitted request 0
[C66xx_3]
[C66xx_3] [Core 3]: Waiting for Result ...
[C66xx_0]
[C66xx_1]
[C66xx_2]
[C66xx_0] [Core 0]: Submitting FFT Request ...
[C66xx_1] [Core 1]: Submitting FFT Request ...
[C66xx_2] [Core 2]: Submitting FFT Request ...
[C66xx_0] [Core 0]: Submitted request 0
[C66xx_1] [Core 1]: Submitted request 0
[C66xx_2] [Core 2]: Submitted request 0
[C66xx_0]
[C66xx_1]
[C66xx_2]
[C66xx_0] [Core 0]: Waiting for Result ...
[C66xx_1] [Core 1]: Waiting for Result ...
[C66xx_2] [Core 2]: Waiting for Result ...
[C66xx_0] cycle=6636
[C66xx_1] cycle=7251
[C66xx_2] cycle=7250

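  • Removing the CP may well reduce the rate, and that cannot be avoided. However, if multiple packets are fed in back to back, the rate does not drop much, thanks to the pipelining inside the FFTC.

    Internally the FFTC pipelines across three buffers. A packet goes through three stages (reading the data, computing, and writing the data), so the latency of a single packet is naturally fairly large. If packets are fed in continuously, the three buffers work as a pipeline and the read, compute, and write stages run in parallel, which greatly shortens the average time per packet. The total FFTC time for many back-to-back packets is therefore much shorter than the total for the same packets submitted one at a time.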
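To make the pipelining argument concrete, here is a minimal sketch of an idealized three-stage pipeline model in plain C. It is not FFTC driver code and calls no TI API. The `StageCost` type, the function names, and any per-stage cycle counts fed into it are hypothetical; only the roughly 7000-cycle single-packet figure comes from the log above, and that figure probably also includes software overhead, so treat the output as an illustration of the trend rather than a prediction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-stage costs in cycles. These are illustrative
 * placeholders, not measured FFTC figures. */
typedef struct {
    uint32_t read_cycles;     /* stage 1: read input data  */
    uint32_t compute_cycles;  /* stage 2: FFT computation  */
    uint32_t write_cycles;    /* stage 3: write output data */
} StageCost;

/* Cost of the slowest of the three stages. */
static uint32_t slowest_stage(StageCost s)
{
    uint32_t m = s.read_cycles;
    if (s.compute_cycles > m) m = s.compute_cycles;
    if (s.write_cycles > m) m = s.write_cycles;
    return m;
}

/* Total cycles when each packet is submitted only after the previous
 * one has completely finished, so no stages overlap. */
uint64_t serial_total(StageCost s, uint32_t n_packets)
{
    uint64_t per_packet = (uint64_t)s.read_cycles + s.compute_cycles
                        + s.write_cycles;
    return per_packet * n_packets;
}

/* Total cycles for an ideal pipeline fed back to back: the first packet
 * pays the full latency, and every later packet adds only the cost of
 * the slowest stage. */
uint64_t pipelined_total(StageCost s, uint32_t n_packets)
{
    assert(n_packets > 0);
    uint64_t first = (uint64_t)s.read_cycles + s.compute_cycles
                   + s.write_cycles;
    return first + (uint64_t)(n_packets - 1) * slowest_stage(s);
}
```

With an assumed split of 2000/3000/2000 cycles (chosen only so that one packet costs 7000 cycles), 20 packets cost 140000 cycles when submitted one at a time and 64000 cycles in the ideal pipeline, an average of 3200 cycles per packet. A throughput measurement should therefore time a batch of queued packets and divide by the packet count, not time one isolated packet.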