6670: multicore program runs correctly without optimization but fails with O3 optimization



On the 6670, the same program runs on 4 cores. With optimization disabled the program runs correctly; with O3 optimization enabled it fails.
 
Environment:
 
    TMDSEVM6670LE, Rev 3A evaluation board
    CCS version: 5.2.1.00018, compiler version: TI v7.3.4
    Emulator: XDS560V2-USB Mezzanine
 
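Details:
In the calling function, core0 first performs some initialization and then sets a sync flag. The other cores poll that flag: once it is set they continue, otherwise they keep waiting: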
void func(void)
{
    ...
    if(DNUM == 0)
    {
        for(i=0; i<NUM_SYNC_STAGES; i++)
            for(j=0; j<NUM_CORE; j++)
                bSyncCores[i][j] = false;
        // first stage OK, only done by core0, to give the bSyncCores[][] a valid value
        bSyncCores[SYNC_INIT][0] = true;
        CACHE_wbL1d (&bSyncCores[0][0], NUM_SYNC_STAGES*NUM_CORE*sizeof(bool), CACHE_WAIT);
    }
    // for other cores, wait
    currentStage = SYNC_INIT;
    waitSyncFlag(currentStage, 0, 1);           // for core 1, 2, 3, stops here when using O3 optimization
    ...
}
----------------------------------------------------------------------------------------------------------------------------
I keep the sync signals in an array bSyncCores[ ][ ] in shared memory and use two functions to set and wait on a signal, roughly as follows:

void setSyncFlag(int stage, Uint32 core, bool value)
{
    CACHE_invL1d(&bSyncCores[0][0], sizeof(bool)*NUM_SYNC_STAGES*NUM_CORE, CACHE_WAIT);
    bSyncCores[stage][core] = value;
    CACHE_wbL1d (&bSyncCores[stage][core], sizeof(bool), CACHE_WAIT);
}

void waitSyncFlag(int stage, Uint32 core, bool value)
{
    do
    {
        CACHE_invL1d(&bSyncCores[0][0], sizeof(bool)*NUM_SYNC_STAGES*NUM_CORE, CACHE_WAIT);
    }while(bSyncCores[stage][core] != value);

    // I also tried the following variant
    //while(bSyncCores[stage][core] != value)
    //    CACHE_invL1d(&bSyncCores[0][0], sizeof(bool)*NUM_SYNC_STAGES*NUM_CORE, CACHE_WAIT);
}
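
The post does not show how bSyncCores is declared. One common reason a polling loop like this works unoptimized but hangs when optimized is that the flag is not volatile, so the compiler is allowed to read it once and reuse that value on every iteration. Below is a minimal sketch of the same set/wait pair with a volatile array, assuming that is the cause here (not confirmed). The names bSyncCoresV, setSyncFlagV, waitSyncFlagV, cache_inv_stub and cache_wb_stub are hypothetical; the two stubs only stand in for CACHE_invL1d / CACHE_wbL1d so the sketch builds off-target, and with the real calls a cast on the volatile pointer may be needed.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_SYNC_STAGES 6   /* same count as enum SYNC_STAGES in the post */
#define NUM_CORE        4

/* Hypothetical placeholders for CACHE_invL1d / CACHE_wbL1d (they do nothing here). */
static void cache_inv_stub(const volatile void *addr, size_t bytes) { (void)addr; (void)bytes; }
static void cache_wb_stub(const volatile void *addr, size_t bytes)  { (void)addr; (void)bytes; }

/* volatile: every access in the source becomes a real load or store,
 * so the optimizer cannot hoist the read out of the polling loop. */
volatile bool bSyncCoresV[NUM_SYNC_STAGES][NUM_CORE];

void setSyncFlagV(int stage, unsigned core, bool value)
{
    bSyncCoresV[stage][core] = value;
    cache_wb_stub(&bSyncCoresV[stage][core], sizeof(bool));
}

void waitSyncFlagV(int stage, unsigned core, bool value)
{
    do {
        /* Drop any stale cached copy, then re-read the flag from memory. */
        cache_inv_stub(&bSyncCoresV[0][0], sizeof(bSyncCoresV));
    } while (bSyncCoresV[stage][core] != value);
}
```

If the volatile version behaves the same with and without optimization, that points at the compiler caching the flag in a register rather than at the cache calls themselves.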

stage takes the following values:

enum SYNC_STAGES {
    SYNC_INIT = 0,
    SYNC_INTEGRAL,
    SYNC_DETECTION,
    SYNC_MERGE_MARKS,
    SYNC_MERGE_WINDOWS,
    SYNC_CLASSIFY,
    NUM_SYNC_STAGES
};

core ranges from 0 to 3, i.e. DNUM, and value is either true or false.

 

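This looks like it could be a cache-related problem: the variables in shared memory should have been set correctly, but the read may not be returning the correct value.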
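
On the cache side, one thing worth keeping in mind is that cache writeback and invalidate operations work on whole cache lines, so a writeback of sizeof(bool) actually covers the entire line that holds the flag, including any neighbouring flags. Whether that matters for this program depends on where the array lives and which cache levels hold it, which the post does not say. A minimal sketch of one way to rule it out is to give every flag its own line. The names PaddedSyncFlag, paddedSyncFlags and flagDistance are hypothetical, and CACHE_LINE_BYTES = 128 is an assumption that should be checked against the device's cache documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SYNC_STAGES  6    /* same count as enum SYNC_STAGES in the post */
#define NUM_CORE         4
#define CACHE_LINE_BYTES 128  /* assumption: largest cache line size involved */

/* One flag per cache line: the padding fills the rest of the line. */
typedef struct {
    volatile bool flag;
    unsigned char pad[CACHE_LINE_BYTES - sizeof(bool)];
} PaddedSyncFlag;

/* C11 _Alignas starts the array on a line boundary. On the TI compiler
 * the same effect is usually obtained with #pragma DATA_ALIGN. */
_Alignas(CACHE_LINE_BYTES) PaddedSyncFlag paddedSyncFlags[NUM_SYNC_STAGES][NUM_CORE];

/* Byte distance between two flags. A nonzero multiple of CACHE_LINE_BYTES
 * means the two flags can never share a cache line. */
size_t flagDistance(int s0, int c0, int s1, int c1)
{
    uintptr_t a = (uintptr_t)&paddedSyncFlags[s0][c0].flag;
    uintptr_t b = (uintptr_t)&paddedSyncFlags[s1][c1].flag;
    return (size_t)(a > b ? a - b : b - a);
}
```

With this layout a writeback or invalidate of one flag touches only that flag's line, at the cost of NUM_SYNC_STAGES * NUM_CORE * CACHE_LINE_BYTES bytes of shared memory.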