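I am trying to implement matrix multiplication for multiple matrices on the MSP430FR5994. After going through several older questions on the forum, I used the answers given there to write my implementation. The goal is to replicate a neural network layer, so the computation is a matrix multiplication of the input matrix with a matrix containing the network weights, followed by the addition of another matrix containing the network's bias values. While working on this, I realized that the values need to be quantized, and that this has to happen before the inputs, weights, or biases are filled into the matrices.

The problem I am currently facing is that the result of the matrix computation is right-shifted by 15 bits before it is stored in the result matrix. I understand that this behavior is consistent with how "_Q15" parameters are handled, and I have also looked at the code that performs this shift. A workable solution for eliminating the shift is given in the following question: https://e2e.ti.com/support/microcontrollers/msp-low-power-microcontrollers-group/msp430/f/msp-low-power-microcontroller-forum/716353/msp430fr5992-msp-dsplib-msp_matrix_mpy_q15. However, that question does not mention a solution that uses the MSP LEA.

I tried a few ways of changing the multiplication function so that it takes int16_t/uint16_t values instead of _Q15 parameters. The modified matrix multiplication function, including the changes mentioned in the question above, is shown below: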
/* --COPYRIGHT--,BSD
 * Copyright (c) 2016, Texas Instruments Incorporated
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * *  Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * *  Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * *  Neither the name of Texas Instruments Incorporated nor the names of
 *    its contributors may be used to endorse or promote products derived
 *    from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 * --/COPYRIGHT--*/

#include "../../include/DSPLib.h"

#if defined(MSP_USE_LEA)

msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    msp_status status;
    MSP_LEA_MPYMATRIXROW_PARAMS *leaParams;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }

    /* Check that the data arrays are aligned and in a valid memory segment. */
    if (!(MSP_LEA_VALID_ADDRESS(srcA, 4) &
          MSP_LEA_VALID_ADDRESS(srcB, 4) &
          MSP_LEA_VALID_ADDRESS(dst, 4))) {
        return MSP_LEA_INVALID_ADDRESS;
    }

    /* Acquire lock for LEA module. */
    if (!msp_lea_acquireLock()) {
        return MSP_LEA_BUSY;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* Initialize LEA if it is not enabled. */
    if (!(LEAPMCTL & LEACMDEN)) {
        msp_lea_init();
    }

    /* Allocate MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    leaParams = (MSP_LEA_MPYMATRIXROW_PARAMS *)msp_lea_allocMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Set status flag. */
    status = MSP_SUCCESS;

    /* Iterate through each row of srcA */
    while (srcARows--) {
        /* Set MSP_LEA_MPYMATRIXROW_PARAMS structure. */
        leaParams->rowSize = srcBRows;
        leaParams->colSize = srcBCols;
        leaParams->colVector = MSP_LEA_CONVERT_ADDRESS(srcB);
        leaParams->output = MSP_LEA_CONVERT_ADDRESS(dst);

        /* Load source arguments to LEA. */
        LEAPMS0 = MSP_LEA_CONVERT_ADDRESS(srcA);
        LEAPMS1 = MSP_LEA_CONVERT_ADDRESS(leaParams);

        /* Invoke the LEACMD__MPYMATRIXROW command with interrupts enabled. */
        LEAPMCB = LEACMD__MPYMATRIXROW | LEAITFLG1;

        /* Clear DSPLib flags, restore interrupts and enter LPM0. */
        msp_lea_ifg = 0;
        msp_lea_enterLPM();

#ifndef MSP_DISABLE_DIAGNOSTICS
        /* Check LEA interrupt flags for any errors. */
        if (msp_lea_ifg & LEACOVLIFG) {
            status = MSP_LEA_COMMAND_OVERFLOW;
            break;
        }
        else if (msp_lea_ifg & LEAOORIFG) {
            status = MSP_LEA_OUT_OF_RANGE;
            break;
        }
        else if (msp_lea_ifg & LEASDIIFG) {
            status = MSP_LEA_SCALAR_INCONSISTENCY;
            break;
        }
#endif //MSP_DISABLE_DIAGNOSTICS

        /* Increment srcA and dst pointers. */
        srcA += srcACols;
        dst += srcBCols;
    }

    /* Free MSP_LEA_MPYMATRIXROW_PARAMS structure. */
    msp_lea_freeMemory(sizeof(MSP_LEA_MPYMATRIXROW_PARAMS)/sizeof(uint32_t));

    /* Free lock for LEA module and return status. */
    msp_lea_freeLock();
    return status;
}

#else //MSP_USE_LEA

msp_status msp_matrix_mpy_q15(const msp_matrix_mpy_q15_params *params, const uint16_t *srcA, const uint16_t *srcB, uint16_t *dst)
{
    uint16_t cntr;
    uint16_t srcARows;
    uint16_t srcACols;
    uint16_t srcBRows;
    uint16_t srcBCols;
    uint16_t dst_row;
    uint16_t dst_col;
    uint16_t row_offset;
    uint16_t col_offset;
    uint16_t dst_row_offset;

    /* Initialize the row and column sizes. */
    srcARows = params->srcARows;
    srcACols = params->srcACols;
    srcBRows = params->srcBRows;
    srcBCols = params->srcBCols;

#ifndef MSP_DISABLE_DIAGNOSTICS
    /* Check that column of A equals rows of B */
    if (srcACols != srcBRows) {
        return MSP_SIZE_ERROR;
    }
#endif //MSP_DISABLE_DIAGNOSTICS

    /* In initialize loop counters. */
    cntr = 0;
    dst_row = 0;
    dst_col = 0;
    row_offset = 0;
    col_offset = 0;
    dst_row_offset = 0;

#if defined(__MSP430_HAS_MPY32__)
    /* If MPY32 is available save control context, set to fractional mode, set saturation mode. */
    uint16_t ui16MPYState = MPY32CTL0;
    MPY32CTL0 = MPYFRAC | MPYDLYWRTEN | MPYSAT;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Reset result accumulator. */
            MPY32CTL0 &= ~MPYC;
            RESLO = 0;
            RESHI = 0;

            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                MACS = srcA[row_offset + cntr];
                OP2 = srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }

            /* Store the result */
            dst[dst_row_offset + dst_col] = RESHI * 32768 + RESLO;

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }

    /* Restore MPY32 control context, previous saturation state. */
    MPY32CTL0 = ui16MPYState;

#else //__MSP430_HAS_MPY32__
    uint32_t result;

    /* Loop through all srcA rows. */
    while(srcARows--) {
        /* Loop through all srcB columns. */
        while (dst_col < srcBCols) {
            /* Initialize accumulator. */
            result = 0;

            /* Loop through all elements in srcA column and srcB row. */
            while(cntr < srcACols) {
                result += (int32_t)srcA[row_offset + cntr] * (int32_t)srcB[col_offset + dst_col];
                col_offset += srcBCols;
                cntr++;
            }

            /* Saturate and store the result */
            dst[dst_row_offset + dst_col] = (int32_t)__saturate(result, INT32_MIN, INT32_MAX);

            /* Update pointers. */
            dst_col++;
            cntr = 0;
            col_offset = 0;
        }

        /* Update pointers. */
        dst_row++;
        dst_col = 0;
        row_offset += srcACols;
        dst_row_offset += srcBCols;
    }
#endif //__MSP430_HAS_MPY32__

    return MSP_SUCCESS;
}

#endif //MSP_USE_LEA
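Despite changing the matrix input type to 'uint16_t' and modifying how the result is stored by removing the shift by 15, the code still does not compute the matrix values correctly in integer format. The complete code for the matrix multiplication is as follows: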
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <msp430.h>
#include "DSPLib.h"
#include "math.h"

#pragma DATA_SECTION(lea1, ".leaRAM")
#pragma DATA_SECTION(lea2, ".leaRAM")
#pragma DATA_SECTION(leadest, ".leaRAM")

DSPLIB_DATA(lea1, 4)
uint16_t lea1[2][2] = {{7, 2}, {1, 2}};

DSPLIB_DATA(lea2, 4)
uint16_t lea2[2][2] = {{4, 5}, {2, 3}};

DSPLIB_DATA(leadest, 4)
uint16_t leadest[2][2];

volatile uint32_t cycleCount = 0;

int main()
{
    msp_status status;
    msp_matrix_mpy_q15_params mpyParams;

    WDTCTL = WDTPW + WDTHOLD;

    mpyParams.srcARows = 2;
    mpyParams.srcACols = 2;
    mpyParams.srcBRows = 2;
    mpyParams.srcBCols = 2;

    status = msp_matrix_mpy_q15(&mpyParams, *lea1, *lea2, *leadest);
    cycleCount = msp_benchmarkStop(MSP_BENCHMARK_BASE);
    msp_checkStatus(status);

    return 0;
}
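To make the effect of the shift concrete, here is a minimal host-side sketch in plain C. It is not the DSPLib or LEA implementation, and the helper names dot_integer and dot_q15_style are hypothetical. It only models the arithmetic: the same multiply-accumulate sum, once kept as a raw integer and once shifted right by 15 bits as a Q15 result would be.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of DSPLib: raw integer dot product of a row
 * `a` (length n) with one column of a row-major matrix. `b` points at the
 * first element of that column and `stride` is the number of columns. */
int32_t dot_integer(const int16_t *a, const int16_t *b, int n, int stride)
{
    assert(n > 0 && stride > 0);
    int32_t acc = 0;
    for (int i = 0; i < n; i++) {
        /* Widen to 32 bits before multiplying so the product cannot overflow. */
        acc += (int32_t)a[i] * (int32_t)b[i * stride];
    }
    return acc;
}

/* Hypothetical helper: the same sum, followed by the 15-bit right shift that
 * a Q15 result implies. For small non-negative integers this discards the
 * whole value. (Right-shifting a negative value is implementation-defined
 * in C; this sketch is only exercised with non-negative inputs.) */
int16_t dot_q15_style(const int16_t *a, const int16_t *b, int n, int stride)
{
    return (int16_t)(dot_integer(a, b, n, stride) >> 15);
}
```

With the matrices from my test program, {{7, 2}, {1, 2}} and {{4, 5}, {2, 3}}, dot_integer gives 32, 41, 8 and 11 for the four output elements, while dot_q15_style gives 0 for all of them, because every sum is far below 32768.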
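I am not sure how to handle the bit shifts correctly: whether they can simply be removed, or whether the function can be changed so that the matrix multiplication produces the raw integer values that a standard mathematical calculation would give. It would be very helpful if someone could suggest some possible solutions that I can try while observing the behavior of the MSP430. Please let me know if any other information is needed to make this clearer. Thank you.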
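For reference, the result I am after corresponds to the following plain C sketch of one layer (matrix multiply followed by adding a bias matrix). It is only a software reference for comparing against the device output; it uses neither the LEA nor the MPY32 fractional mode. The function name dense_layer_int, the 32-bit output type and the row-major layout are my own assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference routine (plain C, no LEA, no fractional mode):
 *   dst = src * weights + bias, all matrices stored row-major.
 * src:     rows  x inner
 * weights: inner x cols
 * bias:    rows  x cols
 * dst:     rows  x cols, kept as int32_t so that no bits are shifted away. */
void dense_layer_int(const int16_t *src, const int16_t *weights,
                     const int32_t *bias, int32_t *dst,
                     int rows, int inner, int cols)
{
    assert(rows > 0 && inner > 0 && cols > 0);
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            /* Start the accumulator from the bias for this output element. */
            int32_t acc = bias[r * cols + c];
            for (int k = 0; k < inner; k++) {
                /* Widen both operands so the product is a full 32-bit value. */
                acc += (int32_t)src[r * inner + k] *
                       (int32_t)weights[k * cols + c];
            }
            dst[r * cols + c] = acc;
        }
    }
}
```

For the two 2x2 matrices in my test program and an all-zero bias, this produces {{32, 41}, {8, 11}}, which is the output I would like to get from the LEA-based function as well.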