How do optimization options affect a program?

Wenguo Li1

Platform: CCS5.4, EVM6678 development board

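The following function computes the SNR and SFDR values of a signal after an FFT. Its core idea is to compute the magnitude of single-precision complex numbers.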
#pragma CODE_SECTION(Compute_Analyse, ".text");
void Compute_sfdr_snr(const float *restrict x,int nx,double *sfdr,double *snr)
{
    int i;
    double max0, max1, max2, max3,first_max,second_max;
    double x_01, x_23,x_45, x_67;
    double sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0,summation = 0;

    max0 = -DBL_MAX;
    max1 = -DBL_MAX;
    max2 = -DBL_MAX;
    max3 = -DBL_MAX;
    first_max = -DBL_MAX;
    second_max = -DBL_MAX;

    _nassert(nx % 8 == 0);
    _nassert(nx > 0);
    _nassert((int)x % 8 == 0);

    #pragma MUST_ITERATE(1,,)
    for (i = 0; i < nx; i+=8)
    {
        x_01 = _amemd8((void*)&x[i]);
        x_23 = _amemd8((void*)&x[i+2]);
        x_45 = _amemd8((void*)&x[i+4]);
        x_67 = _amemd8((void*)&x[i+6]);

        sum0 = sqrtdp_i((double)_lof(x_01)*_lof(x_01) + (double)_hif(x_01)* _hif(x_01)); // square root of the sum of squares
        sum1 = sqrtdp_i((double)_lof(x_23)*_lof(x_23) + (double)_hif(x_23)* _hif(x_23));
        sum2 = sqrtdp_i((double)_lof(x_45)*_lof(x_45) + (double)_hif(x_45)* _hif(x_45));
        sum3 = sqrtdp_i((double)_lof(x_67)*_lof(x_67) + (double)_hif(x_67)* _hif(x_67));

        max0 = MAX_FLAOT(sum0,max0); // track the maximum
        max1 = MAX_FLAOT(sum1,max1);
        max2 = MAX_FLAOT(sum2,max2);
        max3 = MAX_FLAOT(sum3,max3);

        summation += (sum0 + sum1 + sum2 + sum3);

    }
    max1 = MAX_FLAOT(max0,max1);
    max3 = MAX_FLAOT(max2,max3);

    if (max3 > max1)
    {
        first_max = max3;
        second_max = max1;
    }
    else
    {
        first_max = max1;
        second_max = max3;
    }

    _amemd8((void*)sfdr) = (first_max - second_max);
    _amemd8((void*)snr) = (summation - first_max);
}

With --opt_level=3 enabled, compilation becomes very slow and the cycle count rises sharply (cycles = 1581474).
But if I comment out

_amemd8((void*)sfdr) = (first_max - second_max);

_amemd8((void*)snr) = (summation - first_max);

the code compiles quickly and the cycle count drops considerably (cycles = 412228).

_amemd8((void*)sfdr) = (first_max - second_max);
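and _amemd8((void*)snr) = (summation - first_max);

These two statements simply write the computed results to fixed addresses through the pointer parameters. How is this behavior related to --opt_level=3?
Does passing results back through parameters become expensive once optimization is turned on?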

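I also tried returning one of the values with return, for example return (first_max - second_max);
and then calling the function as sfdr = Compute_sfdr_snr(x, nx); the behavior was the same as with the pointer parameters.

But when I changed the call to Compute_sfdr_snr(x, nx);
and ignored the return value, compilation was fast and the cycle count dropped considerably as well.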

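In the example above, how should the computed results actually be passed back?
Also, with identical code, why does the compile time differ so much between builds with and without optimization?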

Over 11 years ago

Allen35065, over 11 years ago

TI__Mastermind 27075 points

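Roughly speaking, if you do not store the results, those computations never run at all: the compiler treats them as useless statements (dead code) and optimizes them away!

That is why the cycle count drops when the data is not stored.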
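
The effect can be reproduced with any optimizing C compiler. Below is a minimal sketch in portable C (not C6000-specific). The names `power_sum`, `run_discarded`, `run_kept` and `g_sink` are hypothetical and introduced only for illustration, and the squared magnitude is used instead of the magnitude so the example needs no math-library calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum of squared magnitudes of n_complex samples
 * stored as interleaved (re, im) floats. */
static double power_sum(const float *x, size_t n_complex)
{
    double total = 0.0;
    size_t i;

    assert(x != NULL);
    for (i = 0; i < n_complex; ++i) {
        double re = x[2 * i];
        double im = x[2 * i + 1];
        total += re * re + im * im;
    }
    return total;
}

/* The result is discarded. An optimizer that can see the body of
 * power_sum is free to delete the whole loop, so timing this call
 * may measure almost nothing. */
static void run_discarded(const float *x, size_t n_complex)
{
    (void)power_sum(x, n_complex);
}

/* A volatile object: stores to it count as observable behavior,
 * so the compiler has to produce the value that is stored. */
static volatile double g_sink;

/* The result is stored, so the computation has to be carried out. */
static double run_kept(const float *x, size_t n_complex)
{
    g_sink = power_sum(x, n_complex);
    return g_sink;
}
```

When timing a kernel, the `run_kept` style is the one to measure, since the loop is then guaranteed to execute.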

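Long compile times with optimization enabled are normal: the compiler has to analyze a great deal of information before it can schedule the software pipeline.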
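
On the question of how to pass the results back: writing them through output pointers is a normal approach, and plain assignments through `double *` parameters should be sufficient. Below is a minimal portable sketch, assuming the magnitudes have already been computed into an array. `summarize_peaks` and its parameter names are hypothetical, and it tracks the true largest and second-largest values rather than the lane-wise maxima of the posted code.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post.
 * peak_gap receives (largest - second largest),
 * rest_sum receives (sum of all values - largest). */
static void summarize_peaks(const double *mag, size_t n,
                            double *peak_gap, double *rest_sum)
{
    double first_max = -DBL_MAX;
    double second_max = -DBL_MAX;
    double summation = 0.0;
    size_t i;

    assert(mag != NULL && peak_gap != NULL && rest_sum != NULL);
    assert(n >= 2);

    for (i = 0; i < n; ++i) {
        double m = mag[i];
        summation += m;
        if (m > first_max) {
            second_max = first_max;   /* old peak becomes runner-up */
            first_max = m;
        } else if (m > second_max) {
            second_max = m;
        }
    }

    /* Plain stores through the output pointers. These stores are what
     * make the loop's work visible to the caller, so the optimizer
     * cannot discard it. */
    *peak_gap = first_max - second_max;
    *rest_sum = summation - first_max;
}
```

A caller would declare two `double` variables and pass their addresses, for example `summarize_peaks(mag, n, &gap, &rest);`.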
