TDA4VM: C7x Streaming Engine: Extracting interleaved data


https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1240841/tda4vm-c7x-streaming-engine-extracting-interleaved-data

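Part Number: TDA4VM

Hi,

I would like to use the Streaming Engine to extract interleaved data from an incoming buffer.

The data is laid out as follows: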

                WIDTH
[a,b,c,d],[a,b,c,d],...[a,b,c,d],
[a,b,c,d],[a,b,c,d],...[a,b,c,d],     HEIGHT
...
[a,b,c,d],[a,b,c,d],...[a,b,c,d]

I would like to extract 16 elements at a time into three vectors A, B, and C (I don't care about the fourth element).
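To make the target result concrete, here is a minimal scalar sketch in plain C of the de-interleave described above. It is only a host-side reference for checking results, not Streaming Engine code; the names `deinterleave_abc`, `LANES`, and `GROUP` are hypothetical and do not come from any TI header.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16  /* elements per output vector, as in the question */
#define GROUP 4   /* interleave factor: a, b, c, d (hypothetical name) */

/* Scalar reference: copy the a, b and c fields of LANES consecutive
 * [a,b,c,d] groups into three planar arrays. The d field is ignored. */
static void deinterleave_abc(const float *src, float *a, float *b, float *c)
{
    assert(src && a && b && c);
    for (size_t i = 0; i < LANES; i++) {
        a[i] = src[GROUP * i + 0];
        b[i] = src[GROUP * i + 1];
        c[i] = src[GROUP * i + 2];
    }
}
```

Calling it once per block of 16 groups (64 floats) yields the A, B, and C vectors that each loop iteration is expected to see.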

Here is my descriptor setup:

static inline __SE_TEMPLATE_v1 InterleaveTemplate(const int WIDTH, const int HEIGHT)
{
    __SE_TEMPLATE_v1 cfg = __gen_SE_TEMPLATE_v1();
    cfg.VECLEN = __SE_VECLEN_16ELEMS;   // Stream 16 elements per advance
    cfg.ELETYPE = __SE_ELETYPE_32BIT;   // Each element is 32 bits (float)
    cfg.ICNT0 = 1;                      // Get one interleaved element (a, b, or c. Skip d.)
    cfg.DIM1 = 4;                       // Go to next interleaved element (a, b, or c. Skip d.)
    cfg.ICNT1 = 16;                     // 16 for each a, b, and c
    cfg.DIM2 = 1;                       // Go to next element (a, b, or c. Skip d.)
    cfg.ICNT2 = 3;                      // Do a, b, c
    cfg.DIM3 = 16 * 4;                  // Advance to the next 16 elements
    cfg.ICNT3 = WIDTH * HEIGHT / 16;    // Repeat for each element, 16 at a time
    cfg.DIMFMT = __SE_DIMFMT_4D;        // Streaming 4 dimensions
    return cfg;
}

The code looks like this:

__SE_TEMPLATE_v1 cfg = InterleaveTemplate(BLOCK_WIDTH, BLOCK_HEIGHT);
__SE0_OPEN(ptr, cfg);

const int N_ITERATIONS = BLOCK_WIDTH * BLOCK_HEIGHT / 16;
for (int i = 0; i < N_ITERATIONS; i++)
{
    float16 a = __SE0ADV(float16);
    float16 b = __SE0ADV(float16);
    float16 c = __SE0ADV(float16);
    
    /* rest of code */
}

__SE0_CLOSE();

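However, after testing, it looks like ICNT0 must be 16 in order to get 16 elements in the destination vector with a single __SE0ADV(). Is this the case? If so, what would be a better way to achieve my goal?

Thanks,

Fred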
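The observation can be illustrated with a simplified host-side model in plain C. This is a sketch under stated assumptions, not TI's implementation: it assumes that each advance fills a vector only from the innermost dimension (ICNT0) and zero-pads the remaining lanes, which is consistent with the test result described here but should be checked against the TI documentation. Strides are taken in elements, as the comments in the template suggest. The names `se_model_t`, `se_model_fetch`, and `MODEL_VECLEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_VECLEN 16  /* lanes per vector, mirroring __SE_VECLEN_16ELEMS */

/* Hypothetical stand-in for the loop counts and strides of the template. */
typedef struct {
    size_t icnt0;        /* innermost count, implicit stride of 1 element */
    size_t icnt1, dim1;  /* count and stride (in elements) of dimension 1 */
    size_t icnt2, dim2;
    size_t icnt3, dim3;
} se_model_t;

/* Model the vector returned by the n-th advance (n starts at 0).
 * ASSUMPTION: lanes come only from dimension 0; unused lanes are zero.
 * Returns the number of lanes that hold real data. */
static size_t se_model_fetch(const float *base, const se_model_t *t,
                             size_t advance, float out[MODEL_VECLEN])
{
    assert(base && t && out);
    assert(t->icnt0 > 0 && t->icnt1 > 0 && t->icnt2 > 0);

    /* Number of vectors needed to cover one innermost row. */
    size_t vecs_per_row = (t->icnt0 + MODEL_VECLEN - 1) / MODEL_VECLEN;

    /* Decompose the advance index into nested loop indices. */
    size_t v  = advance % vecs_per_row; advance /= vecs_per_row;
    size_t i1 = advance % t->icnt1;     advance /= t->icnt1;
    size_t i2 = advance % t->icnt2;     advance /= t->icnt2;
    size_t i3 = advance;
    assert(i3 < t->icnt3);

    size_t row = i1 * t->dim1 + i2 * t->dim2 + i3 * t->dim3;
    size_t valid = 0;
    for (size_t lane = 0; lane < MODEL_VECLEN; lane++) {
        size_t i0 = v * MODEL_VECLEN + lane;
        if (i0 < t->icnt0) {
            out[lane] = base[row + i0];
            valid++;
        } else {
            out[lane] = 0.0f;  /* padding lane under the stated assumption */
        }
    }
    return valid;
}
```

Under this model, the template from the question (ICNT0 = 1) produces one valid lane per advance, so 16 advances are needed to walk through the 16 `a` values. Setting ICNT0 = 16 fills all 16 lanes, but with 16 contiguous elements (a, b, c, d, a, ...), which is not the de-interleaved layout either.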
