Debugging Embedded Devices at Scale: Effective Techniques for Diagnosis and Resolution
In this presentation, I’ll walk through effective approaches for detecting, diagnosing, debugging, and resolving issues in embedded firmware and devices deployed at a large scale, such as in populations of hundreds of thousands or millions. While much has been written on monitoring smaller fleets, typical strategies like onsite debugging, debugger-based diagnosis, and manual log analysis can fail when dealing with massive populations of devices.
The presentation focuses on low-level debugging techniques like fault handling and exception parsing on the device, as well as a novel approach to capturing core dumps on Cortex-M MCUs. The presentation provides guidance on collecting core dumps and diagnosing crashes and faults without a debugger. I’ll also explore the changes in developer behavior that can be implemented when core dump functionality is integrated into firmware, and dumps are received centrally.
This presentation includes content that is often overlooked by online resources, which assume that firmware works flawlessly and bugs are not introduced by developers. We know this is not the case and have developed these strategies to offer real-world solutions for new and experienced firmware engineers who are eager to tackle the challenges of debugging at scale.