Why does the C6678 perform about the same on 4 cores and 8 cores?



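   The chip is an 8-core C6678. I recently wanted to measure how well OpenMP scales on the 6678, so I took the MatrixE example from the OpenMP examples, modified it slightly, and ran it on 1 core, 4 cores, and 8 cores to see how much speedup 4 cores and 8 cores each give.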

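   The code and the .cfg file are in the attachment. For this matrix algorithm, I found that 4 cores are about 3 times faster than a single core, but 8 cores perform essentially the same as 4 cores.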

  I have also verified that in the 8-core case all 8 cores really do take part in the computation.
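For readers without the attachment, here is a minimal sketch of the kind of check described above. It is not the attached code. The function name `matvec_count_threads` and the row-major `double` layout are assumptions made for illustration, and it relies only on standard OpenMP (`num_threads`, `omp_get_thread_num`). It runs a matrix-vector product under `parallel for` and reports how many distinct threads executed at least one row.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper, not taken from the attached project.
 * Computes c = a * b for an n x n row-major matrix a, using up to
 * `threads` OpenMP threads, and returns the number of distinct
 * threads that processed at least one row. */
int matvec_count_threads(const double *a, const double *b, double *c,
                         int n, int threads)
{
    enum { MAX_THREADS = 64 };
    int used[MAX_THREADS] = {0};
    int i, t, count = 0;

    assert(n > 0 && threads > 0 && threads <= MAX_THREADS);

#ifdef _OPENMP
    #pragma omp parallel for num_threads(threads) schedule(static)
#endif
    for (i = 0; i < n; i++) {
        double sum = 0.0;
        int j;
#ifdef _OPENMP
        /* Each thread only ever writes its own slot. */
        used[omp_get_thread_num()] = 1;
#else
        used[0] = 1; /* built without OpenMP: a single thread does all rows */
#endif
        for (j = 0; j < n; j++)
            sum += a[(size_t)i * n + j] * b[j];
        c[i] = sum;
    }

    for (t = 0; t < MAX_THREADS; t++)
        count += used[t];
    return count;
}
```

A return value equal to the requested thread count confirms that every thread received work. It says nothing about whether the threads were limited by computation or by waiting on shared memory.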

The printed output for each case is as follows:

4 cores:

------one core process cost time, 39.714241 ms------
-----Core 0 process-----550292684800, SIZE, 1024
------four core process cost time, 13.928676 ms------

Matrix-vector total - sum of all c[] = 550292684800, size 1024, MaxCoreNum:, 4

8 cores:

------one core process cost time, 39.717369 ms------
-----Core 0 process-----550292684800, SIZE, 1024
------four core process cost time, 13.782308 ms------

Matrix-vector total - sum of all c[] = 550292684800, size 1024, MaxCoreNum:, 8

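  MaxCoreNum is the number of threads in the current run, which is also the number of cores. I also printed the intermediate progress of the 8-core run, and with 8 cores every core was doing work.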

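 As you can see, the 8-core time matches the 4-core time: both are a little under 14 ms (13.93 ms and 13.78 ms), while the single-core time is about 39.7 ms.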
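To put numbers on the comparison, the sketch below computes speedup (single-core time divided by parallel time) and parallel efficiency (speedup divided by core count). The helper names `speedup`, `efficiency`, and `report` are made up for illustration. The timings are the ones printed above.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup of a parallel run relative to the single-core run. */
double speedup(double t_single_ms, double t_parallel_ms)
{
    assert(t_parallel_ms > 0.0);
    return t_single_ms / t_parallel_ms;
}

/* Parallel efficiency: speedup divided by the number of cores used. */
double efficiency(double t_single_ms, double t_parallel_ms, int cores)
{
    assert(cores > 0);
    return speedup(t_single_ms, t_parallel_ms) / cores;
}

/* Print a one-line summary for a run. */
void report(const char *label, double t_single_ms,
            double t_parallel_ms, int cores)
{
    printf("%s: speedup %.2fx, efficiency %.0f%%\n", label,
           speedup(t_single_ms, t_parallel_ms),
           100.0 * efficiency(t_single_ms, t_parallel_ms, cores));
}
```

Calling `report("4 cores", 39.714241, 13.928676, 4)` and `report("8 cores", 39.717369, 13.782308, 8)` gives a speedup of roughly 2.85x at about 71% efficiency for 4 cores, and roughly 2.88x at about 36% efficiency for 8 cores. So doubling the core count adds almost no speedup and halves the efficiency.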

 Could the TI experts help explain this?

OpenMPTest.zip