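Question: what exactly does load mean when defining a data section in a CMD file?
FFTs of 4K, 8K, 16K, and 64K points can now all be placed in DDR3, but one problem remains: depending on whether the load keyword is used when defining the DDR3 section, the cycle counts differ greatly. Could an expert please explain why?
Take the 64K-point complex floating-point FFT as an example:
With the section defined as .DDR3:load>>DDR3 (with load): CYCLES:3096024
With the section defined as .DDR3:>>DDR3 (without load): CYCLES:54106352
That is a difference of roughly 17 times. What causes it?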
Test results:
Data placed in DDR3, section defined as .DDR3:load>>DDR3 (with load)
FFT size    Run time    Cycle count
N = 8 radix = 2 TIME: 0.215000 us CYCLES:215
N = 16 radix = 4 TIME: 0.243000 us CYCLES:243
N = 32 radix = 2 TIME: 0.401000 us CYCLES:401
N = 64 radix = 4 TIME: 0.617000 us CYCLES:617
N = 128 radix = 2 TIME: 1.207000 us CYCLES:1207
N = 256 radix = 4 TIME: 2.168000 us CYCLES:2168
N = 512 radix = 2 TIME: 4.793000 us CYCLES:4793
N = 1024 radix = 4 TIME: 9.256000 us CYCLES:9256
N = 2048 radix = 2 TIME: 21.034000 us CYCLES:21034
N = 4096 radix = 4 TIME: 51.208000 us CYCLES:51208
N = 8192 radix = 2 TIME: 266.581000 us CYCLES:266581
N = 16384 radix = 4 TIME: 539.959000 us CYCLES:539959
N = 32768 radix = 2 TIME: 1512.223000 us CYCLES:1512223
N = 65536 radix = 4 TIME: 3096.024000 us CYCLES:3096024
Data placed in DDR3, section defined as .DDR3:>>DDR3 (without load)
FFT size    Run time    Cycle count
N = 8 radix = 2 TIME: 1.603000 us CYCLES:1603
N = 16 radix = 4 TIME: 3.136000 us CYCLES:3136
N = 32 radix = 2 TIME: 9.645000 us CYCLES:9645
N = 64 radix = 4 TIME: 19.355000 us CYCLES:19355
N = 128 radix = 2 TIME: 52.260000 us CYCLES:52260
N = 256 radix = 4 TIME: 104.660000 us CYCLES:104660
N = 512 radix = 2 TIME: 260.284000 us CYCLES:260284
N = 1024 radix = 4 TIME: 480.613000 us CYCLES:480613
N = 2048 radix = 2 TIME: 1040.438000 us CYCLES:1040438
N = 4096 radix = 4 TIME: 2064.545000 us CYCLES:2064545
N = 8192 radix = 2 TIME: 4781.424000 us CYCLES:4781424
N = 16384 radix = 4 TIME: 10776.016000 us CYCLES:10776016
N = 32768 radix = 2 TIME: 25084.586000 us CYCLES:25084586
N = 65536 radix = 4 TIME: 54106.352000 us CYCLES:54106352
The two cmd files differ only in whether load is used in the definition of the .DDR3 section.
CMD file with load in the .DDR3 section definition:
-heap 0x8000 -stack 0xC000
MEMORY {
L2SRAM (RWX) : org = 0x800000, len = 0x100000
MSMCSRAM (RWX) : org = 0xc000000, len = 0x200000
DDR3(RWX) : org = 0x80000000,len = 0x10000000
}
SECTIONS {
.kernel: {
*.obj (.text:optimized) { SIZE(_kernel_size) }
}
.text: load >> L2SRAM
.text:touch: load >> L2SRAM
GROUP (NEAR_DP)
{
.neardata .rodata .bss
} load > L2SRAM
.far: load >> L2SRAM
.fardata: load >> L2SRAM
.data: load >> L2SRAM
.switch: load >> L2SRAM
.stack: load > L2SRAM
.args: load > L2SRAM align = 0x4, fill = 0 {_argsize = 0x200; }
.sysmem: load > L2SRAM
.cinit: load > L2SRAM
.const: load > L2SRAM START(const_start) SIZE(const_size)
.pinit: load > L2SRAM
.cio: load >> L2SRAM
xdc.meta: load >> L2SRAM, type = COPY
//.MSMCSRAM: >> MSMCSRAM
.DDR3:load>> DDR3
}
CMD file without load in the .DDR3 section definition:
-heap 0x8000 -stack 0xC000
MEMORY {
L2SRAM (RWX) : org = 0x800000, len = 0x100000
MSMCSRAM (RWX) : org = 0xc000000, len = 0x200000
DDR3(RWX) : org = 0x80000000,len = 0x10000000
}
SECTIONS {
.kernel: {
*.obj (.text:optimized) { SIZE(_kernel_size) }
}
.text: load >> L2SRAM
.text:touch: load >> L2SRAM
GROUP (NEAR_DP)
{
.neardata .rodata .bss
} load > L2SRAM
.far: load >> L2SRAM
.fardata: load >> L2SRAM
.data: load >> L2SRAM
.switch: load >> L2SRAM
.stack: load > L2SRAM
.args: load > L2SRAM align = 0x4, fill = 0 {_argsize = 0x200; }
.sysmem: load > L2SRAM
.cinit: load > L2SRAM
.const: load > L2SRAM START(const_start) SIZE(const_size)
.pinit: load > L2SRAM
.cio: load >> L2SRAM
xdc.meta: load >> L2SRAM, type = COPY
//.MSMCSRAM: >> MSMCSRAM
.DDR3:>> DDR3
}