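I am trying to implement matrix multiplication for several matrices on the MSP430FR5994. After going through a few older questions on the forum, I used the answers given there to write my implementation. The aim is to replicate a neural network layer, so the computation multiplies an input matrix by a matrix holding the network weights and then adds another matrix holding the bias values. While implementing these operations I realized that the values need to be quantized before the inputs, weights, or biases are loaded into the matrices. The problem I am currently facing is that the result of the matrix computation is right-shifted by 15 bits before it is stored in the result matrix. I understand that this behavior is consistent with how _q15 parameters are handled, and I have looked at the code that performs the shift. The following question offers a workable way to eliminate the shift: https://e2e.ti.com/support/microcontrollers/msp-low-power-microcontrollers-group/msp430/f/msp-low-power-microcontroller-forum/716353/msp430fr5992-msp-dsplib-msp_matrix_mpy_q15. However, it does not cover a solution that uses the MSP LEA. I tried a few ways of changing the multiplication function so that it takes int16_t/uint16_t values instead of _q15 parameters. The modified matrix multiplication function, including the changes mentioned in the question above, is shown below: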
/* --COPYRIGHT--,BSD
* Copyright (c) 2016, Texas Instruments Incorporated
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* * Neither the name of Texas Instruments Incorporated nor the names of
* its contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
* THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
* OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
* OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
* EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* --/COPYRIGHT--*/
#include "../../include/DSPLib.h"
#if defined(MSP_USE_LEA)
msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    msp_status status;
    MSP_LEA_MPYMATRIXROW_PARAMS *leaParams;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }

    /* Check that the data arrays are aligned and in a valid memory segment. */
    if (!(MSP_LEA_VALID_ADDRESS(srcA, 4) &
          MSP_LEA_VALID_ADDRESS(srcB, 4) &
          MSP_LEA_VALID_ADDRESS(dst, 4))) {
        return MSP_LEA_INVALID_ADDRESS;
    }

    /* Acquire lock for LEA module. */
    if (!msp_lea_acquireLock()) {
        return MSP_LEA_BUSY;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* Initialize LEA if it is not enabled. */
    if (!(LEAPMCTL & LEACMDEN)) {
        msp_lea_init();
    }

    /* Allocate MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    leaParams = (MSP_LEA_MPYMATRIXROW_PARAMS *)msp_lea_allocMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Set status flag. */
    status = MSP_SUCCESS;

    /* Iterate through each row of srcA */
    while (srcARows--) {
        /* Set MSP_LEA_MPYMATRIXROW_PARAMS structure. */
        leaParams->rowSize = srcBRows;
        leaParams->colSize = srcBCols;
        leaParams->colVector = MSP_LEA_CONVERT_ADDRESS(srcB);
        leaParams->output = MSP_LEA_CONVERT_ADDRESS(dst);

        /* Load source arguments to LEA. */
        LEAPMS0 = MSP_LEA_CONVERT_ADDRESS(srcA);
        LEAPMS1 = MSP_LEA_CONVERT_ADDRESS(leaParams);

        /* Invoke the LEACMD__MPYMATRIXROW command with interrupts enabled. */
        LEAPMCB = LEACMD__MPYMATRIXROW | LEAITFLG1;

        /* Clear DSPLib flags, restore interrupts and enter LPM0. */
        msp_lea_ifg = 0;
        msp_lea_enterLPM();

#ifndef MSP_DISABLE_DIAGNOSTICS
        /* Check LEA interrupt flags for any errors. */
        if (msp_lea_ifg & LEACOVLIFG) {
            status = MSP_LEA_COMMAND_OVERFLOW;
            break;
        }
        else if (msp_lea_ifg & LEAOORIFG) {
            status = MSP_LEA_OUT_OF_RANGE;
            break;
        }
        else if (msp_lea_ifg & LEASDIIFG) {
            status = MSP_LEA_SCALAR_INCONSISTENCY;
            break;
        }
#endif //MSP_DISABLE_DIAGNOSTICS

        /* Increment srcA and dst pointers. */
        srcA += srcACols;
        dst += srcBCols;
    }

    /* Free MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    msp_lea_freeMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Free lock for LEA module and return status. */
    msp_lea_freeLock();
    return status;
}
#else //MSP_USE_LEA
msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t cntr;
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    uint16_t dst_row;
    uint16_t dst_col;
    uint16_t row_offset;
    uint16_t col_offset;
    uint16_t dst_row_offset;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* Initialize loop counters. */
    cntr = 0;
    dst_row = 0;
    dst_col = 0;
    row_offset = 0;
    col_offset = 0;
    dst_row_offset = 0;

#if defined(__MSP430_HAS_MPY32__)
    /* If MPY32 is available save control context, set to fractional mode, set saturation mode. */
    uint16_t ui16MPYState = MPY32CTL0;
    MPY32CTL0 = MPYFRAC | MPYDLYWRTEN | MPYSAT;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Reset result accumulator. */
            MPY32CTL0 &= ~MPYC;
            RESLO = 0; RESHI = 0;

            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                MACS = srcA[row_offset + cntr];
                OP2 = srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }

            /* Store the result */
            dst[dst_row_offset + dst_col] = RESHI * 32768 + RESLO;

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }

    /* Restore MPY32 control context, previous saturation state. */
    MPY32CTL0 = ui16MPYState;
#else //__MSP430_HAS_MPY32__
    uint32_t result;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Initialize accumulator. */
            result = 0;

            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                result += (int32_t)srcA[row_offset + cntr] * (int32_t)srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }

            /* Saturate and store the result */
            dst[dst_row_offset + dst_col] = (int32_t)__saturate(result, INT32_MIN, INT32_MAX);

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }
#endif //__MSP430_HAS_MPY32__

    return MSP_SUCCESS;
}
#endif //MSP_USE_LEA
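To make the effect of the shift concrete, here is a minimal host-side sketch in plain C. It is not DSPLib or LEA code: dot_q15 and dot_int are hypothetical helper names used only for illustration. It contrasts a Q15 dot product, where the accumulated sum is shifted right by 15 bits, with a plain integer dot product that keeps the raw sum.

```c
#include <stdint.h>

/* Hypothetical helper, not part of DSPLib.
 * Q15 dot product: each product of two Q15 values carries 30 fractional
 * bits, so the accumulated sum is shifted right by 15 to return to Q15.
 * Note: >> on a negative value is implementation-defined in C; most
 * compilers perform an arithmetic shift. Saturation is omitted here. */
static int16_t dot_q15(const int16_t *a, const int16_t *b, uint16_t n)
{
    int32_t acc = 0;
    uint16_t i;

    for (i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return (int16_t)(acc >> 15);
}

/* Hypothetical helper, not part of DSPLib.
 * Plain integer dot product: same accumulation, but no shift, and the
 * result is kept in 32 bits so it cannot be truncated to 16 bits. */
static int32_t dot_int(const int16_t *a, const int16_t *b, uint16_t n)
{
    int32_t acc = 0;
    uint16_t i;

    for (i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

With the first row of lea1 and the first column of lea2 from my test program (7, 2 and 4, 2), dot_int gives 32 while dot_q15 gives 0, because 32 >> 15 is 0. Small integers are treated as tiny fractions under Q15 handling, which is why their products vanish.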
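Despite changing the matrix input type to uint16_t and modifying how the result is stored by removing the shift by 15, the code still does not compute the matrix values correctly in integer format. The complete code for the matrix multiplication is as follows: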
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <msp430.h>
#include "DSPLib.h"
#include "math.h"
#pragma DATA_SECTION(lea1, ".leaRAM")
#pragma DATA_SECTION(lea2, ".leaRAM")
#pragma DATA_SECTION(leadest, ".leaRAM")
DSPLIB_DATA(lea1, 4)
uint16_t lea1[2][2] = {{7, 2}, {1, 2}};
DSPLIB_DATA(lea2, 4)
uint16_t lea2[2][2] = {{4, 5}, {2,3}};
DSPLIB_DATA(leadest, 4)
uint16_t leadest[2][2];
volatile uint32_t cycleCount = 0;
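int main()
{
    msp_status status;
    msp_matrix_mpy_q15_params mpyParams;

    /* Stop the watchdog timer. */
    WDTCTL = WDTPW + WDTHOLD;

    /* Describe the 2x2 * 2x2 multiplication. */
    mpyParams.srcARows = 2;
    mpyParams.srcACols = 2;
    mpyParams.srcBRows = 2;
    mpyParams.srcBCols = 2;

    /* Start the cycle counter so that msp_benchmarkStop() returns a valid count. */
    msp_benchmarkStart(MSP_BENCHMARK_BASE, 16);
    status = msp_matrix_mpy_q15(&mpyParams, *lea1, *lea2, *leadest);
    cycleCount = msp_benchmarkStop(MSP_BENCHMARK_BASE);
    msp_checkStatus(status);

    return 0;
}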
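As a baseline to compare the device output against, the sketch below is a plain integer matrix multiply written for a host compiler. ref_matrix_mpy_int is a hypothetical name; the sketch assumes row-major storage and 32-bit accumulation, and it uses neither the LEA nor the MPY32.

```c
#include <stdint.h>

/* Hypothetical reference routine, not part of DSPLib.
 * Computes dst = a * b for row-major matrices:
 *   a is aRows x aCols, b is aCols x bCols, dst is aRows x bCols.
 * Products are accumulated in 32 bits and stored unshifted. */
static void ref_matrix_mpy_int(const int16_t *a, const int16_t *b, int32_t *dst,
                               uint16_t aRows, uint16_t aCols, uint16_t bCols)
{
    uint16_t r, c, k;

    for (r = 0; r < aRows; r++) {
        for (c = 0; c < bCols; c++) {
            int32_t acc = 0;

            /* Walk along row r of a and down column c of b. */
            for (k = 0; k < aCols; k++) {
                acc += (int32_t)a[r * aCols + k] * (int32_t)b[k * bCols + c];
            }
            dst[r * bCols + c] = acc;
        }
    }
}
```

For the inputs above, {{7, 2}, {1, 2}} times {{4, 5}, {2, 3}}, the expected integer result is {{32, 41}, {8, 11}}. These are the values I am hoping to see in leadest.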
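I am not sure how to handle the shifts correctly: whether to remove them, or to change the function so that the matrix multiplication returns the raw integer values that standard arithmetic would give. It would be very helpful if someone could suggest a few possible solutions that I can try while observing the MSP430's behavior. Please let me know if any additional information is needed to make this clearer. Thank you.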
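For completeness, the quantization step mentioned at the start could look roughly like the sketch below if the library's Q15 semantics are kept instead of removed. This is only an assumption about one possible approach: float_to_q15 and q15_to_float are hypothetical helper names, and the sketch assumes all inputs, weights, and biases have already been scaled into the range [-1, 1).

```c
#include <stdint.h>

/* Hypothetical helper, not part of DSPLib.
 * Convert a float in [-1, 1) to Q15 (scale by 2^15), rounding to nearest
 * and saturating values that fall outside the representable range. */
static int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;

    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Hypothetical helper, not part of DSPLib.
 * Convert a Q15 value back to float for inspection on the host. */
static float q15_to_float(int16_t q)
{
    return (float)q / 32768.0f;
}
```

With this convention 0.5 becomes 16384, and a Q15 product of 0.5 and 0.5 comes back as 8192, which is 0.25. The right shift by 15 is then the expected behavior.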