[Reference translation] MSP430FR5994: Matrix multiplication on the MSP430 with and without LEA

admin

Other Parts Discussed in Thread: MSP430FR5994

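Note: this post was machine translated and may contain grammatical or other translation errors; it is provided for reference only. For the authoritative text, see the English original at the link below, or translate it yourself.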

https://e2e.ti.com/support/microcontrollers/msp-low-power-microcontrollers-group/msp430/f/msp-low-power-microcontroller-forum/1065973/msp430fr5994-matrix-multiplication-on-the-msp430-with-and-without-lea

Part Number: MSP430FR5994

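I am trying to implement matrix multiplication for multiple matrices on the MSP430FR5994. After going through a couple of older questions on the forum, I used the answers there to write my implementation. The aim is to replicate a neural network layer, so the computation multiplies an input matrix by a matrix containing the network weights and then adds a matrix containing the bias values. While implementing these operations I realized that the values need to be quantized, so I quantize the inputs, weights, and biases before filling the matrices. The problem I am facing now is that the results of the matrix computation are right-shifted by 15 bits before being stored in the result matrix. I understand that this behavior is consistent with how the "_q15" parameters are handled, and I have also looked at the code that performs the shift. A possible way to eliminate the shift was given in the following question: https://e2e.ti.com/support/microcontrollers/msp-low-power-microcontrollers-group/msp430/f/msp-low-power-microcontroller-forum/716353/msp430fr5992-msp-dsplib-msp_matrix_mpy_q15. However, that thread does not cover the case where the LEA is used. I tried changing the multiplication function so that it uses int16_t/uint16_t values instead of _q15 parameters. The modified matrix multiplication function, including the changes mentioned in the question above, is as follows: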

/* --COPYRIGHT--,BSD
 * Copyright (c) 2016, Texas Instruments Incorporated
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * *  Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * *  Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * *  Neither the name of Texas Instruments Incorporated nor the names of
 *    its contributors may be used to endorse or promote products derived
 *    from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 * --/COPYRIGHT--*/

#include "../../include/DSPLib.h"

#if defined(MSP_USE_LEA)

msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    msp_status status;
    MSP_LEA_MPYMATRIXROW_PARAMS *leaParams;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }

    /* Check that the data arrays are aligned and in a valid memory segment. */
    if (!(MSP_LEA_VALID_ADDRESS(srcA, 4) &
          MSP_LEA_VALID_ADDRESS(srcB, 4) &
          MSP_LEA_VALID_ADDRESS(dst, 4))) {
        return MSP_LEA_INVALID_ADDRESS;
    }

    /* Acquire lock for LEA module. */
    if (!msp_lea_acquireLock()) {
        return MSP_LEA_BUSY;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* Initialize LEA if it is not enabled. */
    if (!(LEAPMCTL & LEACMDEN)) {
        msp_lea_init();
    }

    /* Allocate MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    leaParams = (MSP_LEA_MPYMATRIXROW_PARAMS *)msp_lea_allocMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Set status flag. */
    status = MSP_SUCCESS;

    /* Iterate through each row of srcA */
    while (srcARows--) {
        /* Set MSP_LEA_MPYMATRIXROW_PARAMS structure. */
        leaParams->rowSize = srcBRows;
        leaParams->colSize = srcBCols;
        leaParams->colVector = MSP_LEA_CONVERT_ADDRESS(srcB);
        leaParams->output = MSP_LEA_CONVERT_ADDRESS(dst);

        /* Load source arguments to LEA. */
        LEAPMS0 = MSP_LEA_CONVERT_ADDRESS(srcA);
        LEAPMS1 = MSP_LEA_CONVERT_ADDRESS(leaParams);

        /* Invoke the LEACMD__MPYMATRIXROW command with interrupts enabled. */
        LEAPMCB = LEACMD__MPYMATRIXROW | LEAITFLG1;

        /* Clear DSPLib flags, restore interrupts and enter LPM0. */
        msp_lea_ifg = 0;
        msp_lea_enterLPM();

#ifndef MSP_DISABLE_DIAGNOSTICS
        /* Check LEA interrupt flags for any errors. */
        if (msp_lea_ifg & LEACOVLIFG) {
            status = MSP_LEA_COMMAND_OVERFLOW;
            break;
        }
        else if (msp_lea_ifg & LEAOORIFG) {
            status = MSP_LEA_OUT_OF_RANGE;
            break;
        }
        else if (msp_lea_ifg & LEASDIIFG) {
            status = MSP_LEA_SCALAR_INCONSISTENCY;
            break;
        }
#endif //MSP_DISABLE_DIAGNOSTICS

        /* Increment srcA and dst pointers. */
        srcA += srcACols;
        dst += srcBCols;
    }

    /* Free MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    msp_lea_freeMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Free lock for LEA module and return status. */
    msp_lea_freeLock();
    return status;
}

#else //MSP_USE_LEA

msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t cntr;
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    uint16_t dst_row;
    uint16_t dst_col;
    uint16_t row_offset;
    uint16_t col_offset;
    uint16_t dst_row_offset;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* In initialize loop counters. */
    cntr = 0;
    dst_row = 0;
    dst_col = 0;
    row_offset = 0;
    col_offset = 0;
    dst_row_offset = 0;

#if defined(__MSP430_HAS_MPY32__)
    /* If MPY32 is available save control context, set to fractional mode, set saturation mode. */
    uint16_t ui16MPYState = MPY32CTL0;
    MPY32CTL0 = MPYFRAC | MPYDLYWRTEN | MPYSAT;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Reset result accumulator. */
            MPY32CTL0 &= ~MPYC;
            RESLO = 0; RESHI = 0;
            
            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                MACS = srcA[row_offset + cntr];
                OP2 = srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }
            
            /* Store the result */
            dst[dst_row_offset + dst_col] = RESHI * 32768 + RESLO;

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }

    /* Restore MPY32 control context, previous saturation state. */
    MPY32CTL0 = ui16MPYState;

#else //__MSP430_HAS_MPY32__
    uint32_t result;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Initialize accumulator. */
            result = 0;
            
            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                result += (int32_t)srcA[row_offset + cntr] * (int32_t)srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }

            /* Saturate and store the result */
            dst[dst_row_offset + dst_col] = (int32_t)__saturate(result, INT32_MIN, INT32_MAX);

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }
#endif //__MSP430_HAS_MPY32__

    return MSP_SUCCESS;
}

#endif //MSP_USE_LEA

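Despite changing the matrix input types to 'uint16_t' and changing how the result is stored by removing the shift by 15, the code still does not compute the matrix values correctly in integer format. The complete code for the matrix multiplication is as follows: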

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <msp430.h>
#include "DSPLib.h"
#include "math.h"

#pragma DATA_SECTION(lea1, ".leaRAM")
#pragma DATA_SECTION(lea2, ".leaRAM")
#pragma DATA_SECTION(leadest, ".leaRAM")

DSPLIB_DATA(lea1, 4)
uint16_t lea1[2][2] = {{7, 2}, {1, 2}};
DSPLIB_DATA(lea2, 4)
uint16_t lea2[2][2] = {{4, 5}, {2,3}};
DSPLIB_DATA(leadest, 4)
uint16_t leadest[2][2];


volatile uint32_t cycleCount = 0;
int main()
{
        msp_status status;
        msp_matrix_mpy_q15_params mpyParams;

        WDTCTL = WDTPW + WDTHOLD;

        mpyParams.srcARows = 2;
        mpyParams.srcACols = 2;
        mpyParams.srcBRows = 2;
        mpyParams.srcBCols = 2;

        status = msp_matrix_mpy_q15(&mpyParams, *lea1, *lea2, *leadest);
        cycleCount = msp_benchmarkStop(MSP_BENCHMARK_BASE);
        msp_checkStatus(status);
        return 0;

}

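I don't know how to handle the right shifts: either by removing them, or by changing the function so that the matrix multiplication returns the raw integer values that a standard mathematical computation would give. It would be very helpful if someone could suggest possible solutions that I can try while observing the MSP430's behavior. Please let me know if any other information is needed to make the question clearer. Thanks.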
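As a reference point for what "raw integer values" means for the test matrices above, here is a minimal portable sketch of an integer matrix multiply with 32-bit accumulators. The function name `matrix_mpy_i16_ref` and its signature are hypothetical (it is not part of DSPLib and does not use the LEA or MPY32); it is only meant for checking expected results on a host or in plain C on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not a DSPLib function.
 * Computes dst = a * b for row-major matrices:
 *   a is rows_a x cols_a, b is cols_a x cols_b, dst is rows_a x cols_b.
 * Inputs are plain 16-bit integers (no fractional bits), so the products
 * are accumulated in 32 bits and stored without any shift. */
static void matrix_mpy_i16_ref(const int16_t *a, const int16_t *b, int32_t *dst,
                               uint16_t rows_a, uint16_t cols_a, uint16_t cols_b)
{
    for (uint16_t r = 0; r < rows_a; r++) {
        for (uint16_t c = 0; c < cols_b; c++) {
            int32_t acc = 0;
            for (uint16_t k = 0; k < cols_a; k++) {
                /* Widen before multiplying so the product is not truncated. */
                acc += (int32_t)a[(size_t)r * cols_a + k] *
                       (int32_t)b[(size_t)k * cols_b + c];
            }
            dst[(size_t)r * cols_b + c] = acc;
        }
    }
}
```

For the 2x2 inputs {{7, 2}, {1, 2}} and {{4, 5}, {2, 3}} used above, this yields {{32, 41}, {8, 11}}. The accumulator can still overflow for long rows of large values, which this sketch does not guard against.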

Over 2 years ago

admin, over 2 years ago

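There is nothing mysterious about fixed-point multiplication. The result has as many fractional bits as the two inputs have combined. So if you multiply two numbers that each have 15 fractional bits, the result has 30. You must therefore shift the result right by 15 bits to get back to 15 fractional bits.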
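The rule above can be sketched in plain C. The helper names `q15_mul` and `int_mul` are hypothetical and only model the arithmetic; they are not DSPLib functions. The sketch also shows why small integers passed through a Q15 multiply come out as zero: 7 * 4 = 28, and 28 shifted right by 15 bits is 0.

```c
#include <stdint.h>

/* Multiply two Q15 values (15 fractional bits each).
 * The raw 32-bit product has 15 + 15 = 30 fractional bits (Q30),
 * so it is shifted right by 15 to return to Q15.
 * Assumptions: >> on a negative value is an arithmetic shift (true for
 * common compilers, but implementation-defined in C), and the single
 * overflowing case (-32768 * -32768) is not saturated here. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product_q30 = (int32_t)a * (int32_t)b;
    return (int16_t)(product_q30 >> 15);
}

/* Plain integers have 0 fractional bits, so the product also has
 * 0 + 0 = 0 fractional bits and needs no shift, only a wider type. */
static int32_t int_mul(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
```

With Q15 inputs, `q15_mul(16384, 16384)` models 0.5 * 0.5 and returns 8192 (0.25 in Q15), while `q15_mul(7, 4)` returns 0 and `int_mul(7, 4)` returns 28.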

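After performing that shift, you need to check for overflow before converting the 32-bit integer to 16 bits. Think about what to do in that case, because it will almost certainly happen. It may well happen somewhere in the middle of the multiply-accumulate, where you won't notice it.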
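One possible way to handle that case is to accumulate in a wider type and clamp before narrowing. The following is a sketch under stated assumptions: the names `saturate_to_i16` and `q15_dot_saturated` are hypothetical, a 64-bit accumulator is used so the running sum itself cannot overflow, and clamping (rather than, say, reporting an error) is just one policy choice.

```c
#include <stdint.h>

/* Clamp a wide value into the int16_t range instead of letting it wrap. */
static int16_t saturate_to_i16(int64_t value)
{
    if (value > INT16_MAX) return INT16_MAX;
    if (value < INT16_MIN) return INT16_MIN;
    return (int16_t)value;
}

/* Hypothetical Q15 dot product (one output element of a matrix multiply).
 * Each product is Q30; the sum is kept in 64 bits so that intermediate
 * overflow cannot go unnoticed, then scaled back to Q15 and saturated. */
static int16_t q15_dot_saturated(const int16_t *a, const int16_t *b,
                                 uint16_t length)
{
    int64_t acc_q30 = 0;
    for (uint16_t i = 0; i < length; i++) {
        acc_q30 += (int32_t)a[i] * (int32_t)b[i];
    }
    /* Dividing by 2^15 rescales Q30 to Q15; unlike >>, division is fully
     * defined for negative values in C (it rounds toward zero). */
    return saturate_to_i16(acc_q30 / 32768);
}
```

A 64-bit accumulator is expensive on a 16-bit MCU; it is used here only to make the overflow handling easy to follow.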

For example, this part of your code:

            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                MACS = srcA[row_offset + cntr];
                OP2 = srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }
            
            /* Store the result */
            dst[dst_row_offset + dst_col] = RESHI * 32768 + RESLO;

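This tries (badly) to stuff a 32-bit value into a 16-bit hole. It doesn't even shift the binary point, which means almost all of the bits you are interested in get thrown away.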
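To make the "16-bit hole" concrete, here is a small host-side sketch. The parameters `reshi` and `reslo` are ordinary variables standing in for the upper and lower 16-bit result words; the function names are hypothetical, and nothing here touches the real MPY32 registers.

```c
#include <stdint.h>

/* Join two 16-bit result words into one 32-bit value.
 * The upper word is worth 2^16, so it is shifted left by 16 bits.
 * Assumption: converting an out-of-range uint32_t to int32_t wraps
 * (two's complement), as it does on common compilers. */
static int32_t join_result_words(uint16_t reshi, uint16_t reslo)
{
    uint32_t bits = ((uint32_t)reshi << 16) | (uint32_t)reslo;
    return (int32_t)bits;
}

/* Storing a 32-bit value in a 16-bit destination keeps only the low
 * 16 bits; everything in the upper word is silently discarded. */
static uint16_t store_in_16_bits(int32_t value)
{
    return (uint16_t)value;
}
```

For example, `join_result_words(1, 2)` is 65538, but `store_in_16_bits(65538)` is 2: the upper word is gone. Keeping the full result requires a 32-bit destination, or a deliberate shift and saturation before narrowing.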

admin, over 2 years ago

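Thank you for the clarification on fixed-point multiplication. However, I have been using integer values in my matrix multiplication, and I am not sure how to counteract this shift by modifying the "msp_matrix_mpy_q15.c" file that defines the matrix multiplication function. I am reluctant to rescale the result afterwards, because many different values give the same result once they are right-shifted by 15 bits, so shifting the integer results back may not recover the original values. For that reason I wanted to modify the multiplication function itself, so that the shift is avoided from the first call onwards. However, the link I mentioned only covers the case where the LEA is not used, and since my matrices are large I would like to use the LEA. Are there any ideas on how the multiplication could be changed in this case, given that I am only working with integer values and not with values that have fractional bits?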

admin, over 2 years ago

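I have never used the LEA, but after looking at the command reference (sla850), it only supports fixed-point types. The closest thing I can see to what you need is LEACMD_MAC, which takes Q15 inputs and produces a Q31 output.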

The natural result of multiplying two Q15 values is Q30. So you would have to shift the Q31 result right by one bit.
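The one-bit adjustment can be illustrated with a small model of the number formats. This is a sketch only: `q15_mul_to_q31_model` and `q31_to_integer_product` are hypothetical names, the code models a Q15 x Q15 to Q31 multiply in plain C, and it makes no claim about how the LEA command itself behaves beyond the formats named above.

```c
#include <stdint.h>

/* Model of a Q15 x Q15 -> Q31 multiply: the natural Q30 product moved
 * up by one bit. Saturating at INT32_MAX is an assumption of this model;
 * only -32768 * -32768 can reach that limit. */
static int32_t q15_mul_to_q31_model(int16_t a, int16_t b)
{
    int64_t q31 = (int64_t)a * (int64_t)b * 2;
    if (q31 > INT32_MAX) {
        q31 = INT32_MAX;
    }
    return (int32_t)q31;
}

/* Undo the extra bit: Q31 back to Q30. If the inputs were really plain
 * integers stored in 16-bit words, the Q30 bit pattern is exactly the
 * ordinary integer product. Division by 2 is exact here because an
 * unsaturated model result is always even. */
static int32_t q31_to_integer_product(int32_t q31)
{
    return q31 / 2;
}
```

For the integers 7 and 4, the modeled Q31 result is 56, and adjusting it by one bit gives 28, the ordinary product.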

admin, over 2 years ago

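OK, I will take a look at the command reference and check what I get with this command. If I have further questions, I will open a new thread for them. I will mark this question as resolved. Thank you for your replies.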
