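Hello TI engineers,
I am using a master/slave core scheme to partition a matrix multiplication across multiple cores. Core 0 is the master and cores 1 to 7 are slaves, and each core writes its partition of the result to memory. The code is as follows: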
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
/* CHANNEL, trans_JR, inv_M, s_step and the TI headers are defined elsewhere */

bool stepCal(Uint32 COREID){
    int i = 0;
    int inputIndex2 = 0, outputIndex3 = 0;
    int procRowNumber = 0;

    /* Per-core local working copies of the shared buffers */
    double* s_step_loc = malloc(2 * (CHANNEL + 1) * sizeof(double));
    double* trans_JR_loc = malloc(2 * (CHANNEL + 1) * sizeof(double));
    double* inv_M_loc = malloc((CHANNEL + 1) * (CHANNEL + 1) * sizeof(double));
    memset(s_step_loc, 0, 2 * (CHANNEL + 1) * sizeof(double));
    memset(trans_JR_loc, 0, 2 * (CHANNEL + 1) * sizeof(double));
    memset(inv_M_loc, 0, (CHANNEL + 1) * (CHANNEL + 1) * sizeof(double));
    memcpy(trans_JR_loc, trans_JR, 2 * (CHANNEL + 1) * sizeof(double));
    memcpy(inv_M_loc, inv_M, (CHANNEL + 1) * (CHANNEL + 1) * sizeof(double));

    /* Assign the address range (workload) to process according to the core number */
    if(COREID == 7){
        inputIndex2 = (CHANNEL + 1 - 6 * (8 - COREID)) * (CHANNEL + 1);
        outputIndex3 = (CHANNEL + 1 - 6 * (8 - COREID)) * 2;
        procRowNumber = 6;
    }
    else if(COREID == 5 || COREID == 6) {
        inputIndex2 = (CHANNEL + 1 - 6 * (8 - COREID)) * (CHANNEL + 1);
        outputIndex3 = (CHANNEL + 1 - 6 * (8 - COREID)) * 2;
        procRowNumber = 6;
    }
    else {
        inputIndex2 = COREID * 4 * (CHANNEL + 1);
        outputIndex3 = COREID * 4 * 2;
        procRowNumber = 4;
    }

    /* After the matrix inversion finishes, the cores compute the step in parallel;
       because inv_M is symmetric, no transpose is needed */
    DSPF_dp_mat_mul_gemm((double*)trans_JR_loc, 1, 2, (CHANNEL + 1),
                         (double*)(inv_M_loc + inputIndex2), procRowNumber,
                         (double*)(s_step_loc + outputIndex3));

    /* Copy this core's slice of the result into the shared output buffer */
    for(i = outputIndex3 / 2; i < outputIndex3 / 2 + procRowNumber; i++){
        *(s_step + i) = *(s_step_loc + 2 * i);
    }

    /* Write back and invalidate all caches so the result reaches shared memory */
    Cache_wbInvAll();

    free(s_step_loc);
    free(trans_JR_loc);
    free(inv_M_loc);
    return true;
}
Here, trans_JR, inv_M, and s_step are global pointer variables.
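To make the per-core workload easier to check, the row assignment in stepCal can be tabulated on a host PC. The following is only a sketch: it assumes CHANNEL is 36 (inferred from the 37 values expected in s_step), and the names RowRange, rowsForCore, and printRowRanges are hypothetical helpers that do not appear in the original code.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_CORES 8
#define CHANNEL   36   /* assumption: CHANNEL + 1 == 37 output values */

/* Range of s_step elements written by one core (hypothetical helper type) */
typedef struct {
    int firstRow;   /* index of the first element written by the core */
    int rowCount;   /* number of elements written by the core */
} RowRange;

/* Mirrors the if / else-if / else partitioning used in stepCal */
RowRange rowsForCore(unsigned coreId)
{
    RowRange r;
    assert(coreId < NUM_CORES);
    if (coreId >= 5) {
        /* Cores 5..7 take 6 rows each, counted back from the end */
        r.firstRow = (CHANNEL + 1) - 6 * (NUM_CORES - (int)coreId);
        r.rowCount = 6;
    } else {
        /* Cores 0..4 take 4 rows each, counted from the start */
        r.firstRow = (int)coreId * 4;
        r.rowCount = 4;
    }
    return r;
}

/* Prints the element range handled by every core */
void printRowRanges(void)
{
    unsigned core;
    for (core = 0; core < NUM_CORES; core++) {
        RowRange r = rowsForCore(core);
        printf("core %u: s_step[%d..%d]\n",
               core, r.firstRow, r.firstRow + r.rowCount - 1);
    }
}
```

Calling printRowRanges() from a small main() lists one range per core. Under the CHANNEL = 36 assumption, core 0 covers s_step[0..3], and cores 4 and 5 both cover element 19.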
After the function runs, the Memory Browser shows the same result on every core, as follows:

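Here, s_step is a pointer to double and should contain 37 non-zero values. In the Memory Browser, however, every core sees the first 4 values as 0 (core 0 is responsible for the first 4 values). Why does this error occur?
Also, for cores 1 to 7 (the slave cores), with L1D Cache or L2 Cache unchecked, the Memory Browser shows the following: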

Which memory does the Memory Browser actually display in this case (L1D, L2, or MSMC)?
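One thing that may be worth checking, offered as a hypothesis rather than a confirmed diagnosis, is whether the slices written by different cores fall into the same cache line. A C66x L1D line is 64 bytes and an L2 line is 128 bytes, while each 4-element slice of s_step is only 32 bytes, so a whole-line write-back from one core could carry stale data over another core's slice. The sketch below computes which line each element falls in. It assumes s_step is aligned to the line size, and lineOfElement and sharesLine are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define L1D_LINE_BYTES 64u    /* C66x L1D cache line size */
#define L2_LINE_BYTES  128u   /* C66x L2 cache line size */

/* Index of the cache line that holds element elemIndex of a double array.
   Assumption: the array base address is aligned to lineBytes. */
size_t lineOfElement(size_t elemIndex, size_t lineBytes)
{
    assert(lineBytes != 0);
    return (elemIndex * sizeof(double)) / lineBytes;
}

/* Returns 1 when elements a and b fall into the same cache line, else 0 */
int sharesLine(size_t a, size_t b, size_t lineBytes)
{
    return lineOfElement(a, lineBytes) == lineOfElement(b, lineBytes);
}
```

For example, sharesLine(0, 4, L1D_LINE_BYTES) compares the first element written by core 0 with the first element written by core 1. If slices do share a line, padding each core's output to a multiple of the line size, or writing back only the core's own aligned range, would be options to evaluate.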