SW-TM4C: Simplified method for determining why a program ends up in FaultISR

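Note: this text was machine translated and may contain grammatical or other translation errors; it is provided for reference only. For the authoritative content, see the English original at the link below.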

https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1099079/sw-tm4c-simplified-method-for-determining-why-a-program-ends-up-in-faultisr

Part Number: SW-TM4C

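One of the most frustrating things for me when I was starting out with TivaWare (StellarisWare at the time) was ending up in FaultISR() because I had not enabled a peripheral before trying to use it, or because of some similar problem, and then having to single-step through my source code to find where it went wrong. The standard advice (1) seems to be to decipher the values in the NVIC_FAULTSTAT and NVIC_FAULTADDR registers, and then manually decode the interrupt stack to determine the address of the instruction after the one that caused the fault (2). I have done this, but 99% of the time I can find the problem with much less work by using a modified FaultISR() like this: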

static void
FaultISR(void)
{
    volatile int i = 1;
    while(i)
    {
    }
}

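With this implementation, when you pause the debugger and find that you are in FaultISR(), you can usually find the cause as follows:

1. Use the debugger's variable view to change the value of i to 0 so that execution will exit the loop.
2. Click the C step-into button (usually twice) until the call stack display changes so that the second item from the bottom shows main().
3. Click the second item from the top in the call stack. It should show your source code.
4. Look at the instruction before the one indicated. It is likely what caused the fault.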
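
For readers who want to see what the debugger is reading in the last step above, here is a minimal sketch of the basic frame a Cortex-M core pushes on exception entry. It assumes the standard 8-word frame with no floating-point context. The enum and function names are hypothetical and are not part of TivaWare. In a real handler the stack pointer has to be captured (usually in assembly) before the C prologue moves it, which this sketch does not show.

```c
#include <stdint.h>

/* Word offsets within the basic exception frame pushed by the hardware
 * (lowest address first).  Names are hypothetical. */
enum
{
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS
};

/* Return the stacked PC: the address at which the interrupted code
 * would resume if the handler returned. */
uint32_t
StackedPC(const uint32_t *frame)
{
    return frame[FRAME_PC];
}

/* Return the stacked LR: the return address of the function that was
 * running when the exception was taken. */
uint32_t
StackedLR(const uint32_t *frame)
{
    return frame[FRAME_LR];
}
```

Pasting the value returned by StackedPC() into the disassembly window should land near the code that the call stack item shows.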

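I suggest that TI change the default implementation of FaultISR() to something similar. The version I use also has a lot of comments in it; I include it below (3).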

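Even better, perhaps the debugger could be changed so that it can decode a stack trace from within an interrupt service routine. I realize that the stack frame pushed on ISR entry is different from a function's stack frame, so there would need to be some way to work out which decoding method to use. If that can't be done automatically in the general case, I can imagine a way for the user to provide a hint (perhaps with a checkbox) that a particular stack frame belongs to an ISR. Perhaps the debugger could maintain a list of the symbol names that have been marked this way, so that it knows to treat them as ISRs the next time too. The list could contain known ISRs such as FaultISR(), NmiISR(), and ResetISR() by default, or it could be initialized from the entries in g_pfnVectors[]. That way new users could see right away where their error is; my guess is that this would eliminate many of the questions about FaultISR() in this forum. It should be possible to decode the stack even with nested ISRs (a higher-priority interrupt occurring while a lower-priority one is being handled) and with ISRs that call functions.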
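
One hint a tool could use to tell an ISR frame from a function frame is the EXC_RETURN value that the core places in LR on exception entry. The following is only a sketch of that idea, based on the ARMv7-M bit assignments as I understand them; the struct and function names are hypothetical, and this is not how CCS or any particular debugger is known to work.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of inspecting a saved LR value.  Names are hypothetical. */
typedef struct
{
    bool isExcReturn;       /* top nibble is 0xF, so LR marks an exception entry   */
    bool returnsToThread;   /* false means the interrupted code was another ISR    */
    bool usesProcessStack;  /* false means the frame was pushed on the main stack  */
    bool hasFpContext;      /* true means an extended (floating-point) frame       */
} ExcReturnInfo;

ExcReturnInfo
DecodeExcReturn(uint32_t lr)
{
    ExcReturnInfo info;

    info.isExcReturn      = (lr & 0xF0000000u) == 0xF0000000u;
    info.returnsToThread  = (lr & (1u << 3)) != 0u;  /* bit 3: 1 = Thread mode   */
    info.usesProcessStack = (lr & (1u << 2)) != 0u;  /* bit 2: 1 = PSP, 0 = MSP  */
    info.hasFpContext     = (lr & (1u << 4)) == 0u;  /* bit 4: 0 = extended frame */
    return info;
}
```

For example, 0xFFFFFFF9 decodes as a return to Thread mode using the main stack, while 0xFFFFFFF1 decodes as a return to Handler mode, which is the nested-ISR case mentioned above.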

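I should note that if something overwrites the stack (stack too small, buffer overflow, etc.), the call stack will be corrupted and none of these methods will work. The source code comments below (again in 3) include some notes on detecting stack overflow.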
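
As one possible way to check for a stack that is too small, here is a sketch of "stack painting". This is a different technique from the watchpoint approach mentioned in the comments in (3): the unused stack is filled with a known pattern at startup and inspected later to see how much of it was never touched. The pattern value and function names are hypothetical. On a real target, only the region below the current stack pointer may be painted, and the bounds would come from linker symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu   /* hypothetical marker value */

/* Fill 'words' words starting at the lowest stack address with the pattern. */
void
StackPaint(uint32_t *stackLow, size_t words)
{
    for (size_t i = 0; i < words; ++i)
    {
        stackLow[i] = STACK_FILL_PATTERN;
    }
}

/* Count how many words at the low end still hold the pattern.  The stack
 * grows downward on Cortex-M, so untouched words sit at the lowest
 * addresses.  A result of 0 suggests the stack has been exhausted. */
size_t
StackUnusedWords(const uint32_t *stackLow, size_t words)
{
    size_t unused = 0;

    while (unused < words && stackLow[unused] == STACK_FILL_PATTERN)
    {
        ++unused;
    }
    return unused;
}
```

Checking StackUnusedWords() from the main loop, or from FaultISR() itself, gives a rough high-water mark. It can miss a local buffer overflow that hammers a stack frame without reaching the bottom of the stack.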

Steve

(1) - https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1020822/faq-sw-tm4c-how-to-debug-a-program-going-into-faultisr?tisearch=e2e-sitesearch&keymatch=faultisr#

(2) - SPMA043 describes how to manually decode the stack trace. I will try to include an example in a follow-up post. /cfs/file/__key/communityserver-discussions-components-files/908/0842.spma043_2D00_Diagnosing-Software-Faults-in-Stellaris_AE00_-Microcontrollers.pdf

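(3) - The full version of my FaultISR(). Most of the differences from what I showed above are comments, some of which will not apply to other people. A comment could be added saying that an uninitialized peripheral is a common cause of ending up in FaultISR().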

//*****************************************************************************
//
// This is the code that gets called when the processor receives a fault
// interrupt.  It prints any queued debug messages (including tracepoints)
// then enters an infinite loop, preserving the system state for examination
// by a debugger.
//
// If have trouble figuring out why we get here, check to see if there is a
// way to see which vector was used.  Could also make multiple ISRs and make
// each suspect vector point to a different one.  Update: it looks like FaultISR(),
// unlike IntDefaultHandler(), is pointed to by only one vector.  Might still
// be able to look at register values and determine what triggered the "hard fault".
// - Also consider setting a global to different values at various places in the
//   code so can check its value here and see which of those places it was last set.
//   Perhaps use a macro that sets a "checkpoint" (pointer to the filename or
//   maybe function name and an int to the line number).  Search for
//   "ktowyawesctlcic".  Todo.
//
//*****************************************************************************
static void
FaultISR(void)
{
    // Print messages before going into infinite loop.  If the COP watchdog is
    // enabled, this printing probably would have happened in a bit anyway when
    // watchdogTimeoutISR() got called.
    extern volatile uint32_t sysTickMillisecondCount;    // defined in sysTick2.c
    blockWhilePrintAllDebugMessages( sysTickMillisecondCount, "FaultISR" );

    //
    // Enter an infinite loop.
    //
    // There are a couple of ways to (sometimes) determine which code was running
    // when the fault occurred:
    // - Exit this loop and single-step out of this ISR to the calling code.
    //   To trace back out of this ISR, use debugger to change the value
    //   of i to 0, then click the "Assembly Step Into" button several times
    //   (usually 4 to 6) or try the C step-into button.
    // - See "Debugging - tracing how got into ISR.docx" for how to inspect
    //   the stack by hand and modify the PC to make the debugger show the calling
    //   code.
    //
    // If unable to trace back out, it may be because the stack got hammered.
    // - It could be that the stack is too small and is overflowing.
    //   - There is a process for checking the available stack space documented
    //     in the firmware release procedure.
    //   - You can set a watchpoint on __stack as described in
    //     processors.wiki.ti.com/.../Watchpoints_for_Stellaris_in_CCS
    //     to get the debugger to stop at the point the stack overflows.
    // - It could be that the stack frame is being overwritten even if the stack
    //   itself is not overflowing.
    //   - Can hammer the stack frame without overflowing the stack.  For
    //     example, could have a local array on the stack and overflow it.
    // - The stack pointer itself could get changed to an invalid location.
    //
    // Other possible ways to get clues about the cause:
    // - Look at contents of stack for clues (like strings).
    // - Enable IF_DEBUG_LOCKUPS_USING_TRACEPOINTS_MAIN_LOOP and similar code
    //   to help track down where the buffer overflow is occurring.  Search for
    //   "ktowyawesctlcic".
    // - Acquire a "reverse debugger" (perhaps using debug hardware with trace)
    //   so can look back to the point it all went wrong.
    //
    volatile int i = 1;
    while(i)
    {
    }
}
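
The "checkpoint" idea in the header comment above is marked as a to-do there, so the following is only one guess at how such a macro might look. The global and macro names are hypothetical. Each call records the file and line it was made from, and the values can be inspected in the debugger once execution is stuck in FaultISR().

```c
/* Last checkpoint reached.  Both globals are volatile so the compiler
 * does not optimize the stores away.  Names are hypothetical. */
const char * volatile g_checkpointFile = "";
volatile int g_checkpointLine = 0;

/* Record the current source location.  The do/while(0) wrapper makes the
 * macro behave like a single statement. */
#define CHECKPOINT()                     \
    do                                   \
    {                                    \
        g_checkpointFile = __FILE__;     \
        g_checkpointLine = __LINE__;     \
    } while (0)
```

Sprinkling CHECKPOINT(); before suspect peripheral accesses narrows the search even when the stack is too damaged to unwind.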


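If you try to manually decode the stack trace as described in SPMA043, the following link may help. It shows how I did it, with screenshots. I took advantage of the debugger by plugging function addresses into the disassembly window. I also found it helpful to set a breakpoint on a line in the disassembly window so that I could find the corresponding C source file and line number. I haven't done this for a long time, though, because using the modified FaultISR() is much easier and usually gives me the same information.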

https://docs.google.com/document/d/1XxnSUmKLSfFTGPlnjlHSKORdhp54uZrN/edit?usp=sharing&ouid=107025470996951424497&rtpof=true&sd=true

I will try to make the document editable by everyone in case someone wants to improve it. If it fills up with spam, I will revert it and lock it.

Steve


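Hi Steve,

Thanks for the tips on how to debug TM4C MCUs. Most of the time, faults are caused by accessing a peripheral that has not been enabled or by running out of stack space. As you pointed out, when the stack overflows your method may not work because memory has been corrupted. Nonetheless, I will bookmark this post so that I can refer others to it when they stumble into FaultISR. Thanks again.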
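
To tell those two common causes apart without decoding the stack, the value of NVIC_FAULTSTAT can be classified roughly as sketched below. The bit positions follow the ARMv7-M Configurable Fault Status Register as I understand it. The macro names and the summary strings are hypothetical, and the mapping from bits to likely causes is a rule of thumb rather than anything TI documents in this form.

```c
#include <stdint.h>

/* Selected FAULTSTAT bits (names are hypothetical). */
#define FS_MSTKE    (1u << 4)    /* memory management fault while stacking  */
#define FS_PRECISE  (1u << 9)    /* precise data bus error                  */
#define FS_IMPRE    (1u << 10)   /* imprecise data bus error                */
#define FS_BSTKE    (1u << 12)   /* bus fault while stacking                */

/* Return a one-line summary of the most telling bits in 'faultStat'. */
const char *
FaultStatSummary(uint32_t faultStat)
{
    if (faultStat & (FS_PRECISE | FS_IMPRE))
    {
        return "data bus fault";      /* e.g. peripheral not enabled */
    }
    if (faultStat & (FS_BSTKE | FS_MSTKE))
    {
        return "stacking fault";      /* e.g. stack overflow */
    }
    if (faultStat & 0xFFFF0000u)
    {
        return "usage fault";
    }
    if (faultStat & 0x0000FFFFu)
    {
        return "other memory or bus fault";
    }
    return "no fault bits set";
}
```

On the target the argument would be the value read from the NVIC_FAULTSTAT register; when a precise data bus fault is reported with a valid fault address, NVIC_FAULTADDR often points into the address range of the peripheral that was not enabled.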


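"Even better, perhaps the debugger could be changed so that it can decode a stack trace from within an interrupt service routine."

The thread "CCS/TM4C1294KCPDT: How to get a stack unwind in an exception handler?" contains an example that uses a GEL script to make the debugger unwind the stack and show the call stack at the point where a hard fault occurred.

It was developed using CCS 9.1, but I don't remember whether I have used it with later versions of CCS.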


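Thanks for the link. The fact that a GEL script can unwind the stack from inside an exception handler is useful in itself, and it shows that this could be built into the debugger and work out of the box.

I noticed that Peter Jaquiery's custom FaultISR(), like mine, has a lot of comments about the possible causes of ending up in FaultISR().

Peter uses "__asm(" BKPT #2");" before the infinite loop in his FaultISR(). I tried using BKPT instead of my loop, like this: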

volatile int i = 1;

++i;        // If execution stops here, increment the value in the PC register by 2 (remember
            // it is in hexadecimal, so 0x1238 becomes 0x123A).  Then click the C single step
            // button and the call stack that led to getting here should show up (if the stack
            // isn't hammered).

// When execution gets here, the debugger shows it being on the line above.
// If the debugger is not connected, it hangs.  If the watchdog is enabled, it
// causes an automatic reset.
__asm("    BKPT #2");

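I am not sure whether this is better. It conveniently stops the debugger, so you don't have to notice that the program has hung and then click the debugger's pause button. On the other hand, adding two to the PC may be harder than changing i to 0 (at least if you are not comfortable with hexadecimal). Either way, getting to a usable call stack is much easier than with the default implementation of FaultISR().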
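
One way to get the convenience of BKPT without the hang when no debugger is connected might be to execute it only while a debugger has enabled halting debug. The sketch below assumes the Cortex-M Debug Halting Control and Status Register (DHCSR, at address 0xE000EDF0) with C_DEBUGEN in bit 0. The macro and function names are hypothetical, and I have not tried this inside FaultISR() on a TM4C.

```c
#include <stdbool.h>
#include <stdint.h>

#define DHCSR_C_DEBUGEN (1u << 0)   /* set by the debugger when it attaches */

/* Decide from a DHCSR value whether halting debug is enabled.  On the
 * target the argument would be read with something like
 *     *(volatile uint32_t *)0xE000EDF0
 * and BKPT would be executed only when this returns true; otherwise the
 * handler would fall through to the i-loop. */
bool
DebuggerAttached(uint32_t dhcsr)
{
    return (dhcsr & DHCSR_C_DEBUGEN) != 0u;
}
```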

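Problems like this have been making things hard for newcomers for years. Some of us have made it past the steep learning curve (although some problems still cost us time), but I recommend that beginners start with Arduino or Raspberry Pi because of things like this. TI, please make it easier to recommend your products! Bookmarking a forum post is no substitute for making things just work (stack unwinding without having to find and learn to use a GEL script), building in tools for finding problems (code in FaultISR modified the way Peter and I have done), or having relevant comments show up where the debugger stops (common causes of ending up in FaultISR).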

Steve
