Debugging & Testing Embedded Software Through Instrumentation
Instrumentation-based debugging is a highly effective methodology, though it seems underutilized within the embedded systems community. In this presentation, I will introduce Scrutiny, an open-source software suite developed as a personal initiative, designed to streamline debugging, telemetry, and hardware-in-the-loop (HIL) testing for embedded devices.
What this presentation is about and why it matters
How do you debug embedded software when a probe is costly, intrusive, slow, or not available at all? Pier-Yves Lessard approaches that tension with a practical tour of debugging by instrumentation, using Scrutiny Debugger as the anchor. The session mixes architecture, a live STM32F411 Discovery board demo, and concrete integration details from a C firmware. It also shows how the same setup can support scripts, a GUI, and embedded graph capture while being independent from the silicon vendor environment. This is useful for embedded engineers who need better observability on real hardware, especially when classic stepping and halt-based debugging are a poor fit.
Who will benefit the most from this presentation
- Embedded firmware engineers who need visibility into a running system without relying on a debug probe
- Developers working on real-time control software, especially when pauses or stepping would disturb behavior
- Test engineers who want to automate observations and capture data from live devices
- Teams using bare-metal MCUs who need a lightweight way to expose internal state
- Engineers evaluating instrumentation-based debugging for production-like hardware setups
What you need to know
You will get more from the session if you are comfortable with:
- Basic embedded firmware concepts, including main loops, tasks, and peripheral I/O
- Debugging workflows on microcontrollers, especially probe-based inspection
- Reading simple C firmware and build-system changes
- Common embedded transports such as serial, USB virtual serial, CAN, or Ethernet
Glossary (terms used in this talk)
- Handshake: A coordination mechanism used by digital interfaces to transfer data when both producer and consumer are ready. In streaming hardware, handshakes often use valid and ready signals to manage flow control.
- Ring buffer: A circular buffer used to pass streaming data efficiently between producer and consumer stages.
- Instrumentation-based debugging: A debugging approach that exposes internal state through firmware-controlled data exchange over an existing device interface. It trades direct hardware inspection for runtime observability that depends on code running on the target.
- Firmware description file (SFD): A metadata file that describes a firmware image, including memory layout and symbol information needed to locate variables on a connected device. Tooling can use it to map runtime data back to meaningful names and addresses.
- Alias: A virtual variable name that points to a real variable and can optionally apply transformations such as scaling or offset. Aliases provide a stable, human-friendly path for external tools and scripts.
- Half-duplex: A communication mode where data flows in only one direction at a time on a shared channel. The direction switches as needed, but simultaneous two-way transmission is not possible.
- Embedded graph: A device-side capture mechanism where the firmware samples selected variables into an in-memory ring buffer and, upon a software trigger, uploads the buffered acquisition to the server for analysis.
- Firmware ID: A unique identifier embedded in the firmware and associated SFD that allows the server to match a connected device to its firmware description and metadata.
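To make the embedded graph entry above more concrete, here is a minimal Python sketch of the idea: samples are pushed into a fixed-size ring buffer, and a software trigger freezes the capture after a chosen number of post-trigger samples. The names `RingCapture` and `post_trigger` are hypothetical and are not part of Scrutiny; real firmware would implement this in C on the target, so treat this only as a model of the behavior.

```python
from collections import deque


class RingCapture:
    """Toy model of a device-side capture buffer (hypothetical, not Scrutiny's API)."""

    def __init__(self, capacity, post_trigger):
        if not 0 <= post_trigger < capacity:
            raise ValueError("post_trigger must be in [0, capacity)")
        self.buffer = deque(maxlen=capacity)  # oldest samples are overwritten
        self.post_trigger = post_trigger      # samples to keep after the trigger
        self.remaining = None                 # None until the trigger fires
        self.done = False

    def sample(self, value, trigger=False):
        """Record one sample; return True once the acquisition is complete."""
        if self.done:
            return True
        self.buffer.append(value)
        if self.remaining is None:
            if trigger:
                self.remaining = self.post_trigger
        else:
            self.remaining -= 1
        if self.remaining == 0:
            self.done = True
        return self.done

    def acquisition(self):
        """Return the frozen window, oldest sample first."""
        return list(self.buffer)


cap = RingCapture(capacity=4, post_trigger=2)
for v in range(10):
    cap.sample(v, trigger=(v == 5))
print(cap.acquisition())
```

Once `done` is set, the buffer holds the samples around the trigger, which is the window a device would upload for analysis.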
Toolbox (mentioned in this talk)
- GitHub: A web-based platform for hosting Git repositories and collaborating on software development.
- CMake: A cross-platform build system generator that helps define and drive builds across different compilers, IDEs, and platforms.
- STM32: A family of microcontrollers commonly used for embedded control and hardware interfacing. They are often used to drive peripherals such as LEDs, motors, and sensors.
- Scrutiny Debugger: An open source instrumentation and test framework for embedded systems. It provides a server, client tools, and firmware-side support for reading variables and capturing runtime data.
- STM32F411 Discovery: An STM32 development board used for embedded prototyping and demonstration. It provides a convenient platform for testing firmware, peripherals, and instrumentation workflows on real hardware.
- Arduino Mega: A microcontroller development board often used for prototyping and small embedded experiments. It is commonly paired with sensors and simple serial-based tooling.
Final thoughts
Practical and demo-driven, this session gives you a working mental model for turning firmware into something you can observe, script against, and test more confidently. The value is less about abstract theory and more about seeing how an embedded tooling stack fits together from target to server to client. It will help firmware developers, test engineers, and anyone wrestling with hard-to-reproduce behavior on live hardware. The tone stays grounded throughout, with an engineer’s bias toward things that can actually be built and used.
This overview is AI-generated from the session transcript.
Hi Nathan, following up on the display feature. I just released the first version of it (called Human Machine Interface) in v0.13.0
Cheers!
Hi Nathan,
Thank you for the feedback!
It does indeed look a lot like uC/Probe. There seem to be a lot of proprietary or homemade equivalents, but none that are fully open source with this architecture, which is why I started this project.
As for the sampling rate, that's not quite right.
- The embedded datalogger has no restriction on the sampling rate. It goes as fast as your scheduler task.
- The rate of continuously streamed readings depends greatly on your communication channel, the buffer size you allocate to transmission, and the priority you give to Scrutiny in your firmware. Since the communication uses a ping/pong protocol, the biggest bottleneck is often latency in the host driver. The worst I have seen was an Infineon virtual serial port: a 32 ms delay between reception and software notification.
Real-time readings normally fall somewhere in the 30 Hz to 500 Hz range. The number of variables depends on the buffer sizes (i.e. how many variables you can fit in a single memory dump command). This is normally fine, as the goal is to update a UI in real time.
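As a rough illustration of why host latency bounds the refresh rate on a ping/pong link: if each request must wait for its reply, there can be at most one exchange per round trip. The sketch below is only an estimate. The function name is hypothetical, and treating the 32 ms notification delay as the entire round trip is an optimistic assumption.

```python
def max_poll_rate_hz(round_trip_s):
    """Upper bound on exchanges per second for a strict request/response link."""
    if round_trip_s <= 0:
        raise ValueError("round_trip_s must be positive")
    return 1.0 / round_trip_s


# 32 ms is the worst-case host notification delay quoted above; assuming it is
# the whole round trip is optimistic, so the real rate would be lower.
rate = max_poll_rate_hz(0.032)
print(f"at most about {rate:.0f} updates per second")
```

With that delay, the bound lands near the low end of the 30 Hz to 500 Hz range mentioned above.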
We can do some math. A dump response has a protocol overhead of 9 bytes. Each dump contains an address and the data. Say you are on a 32-bit MCU, you try to dump non-contiguous float values, and you allocate a 256-byte buffer to transmission. You can then fit floor((256-9) / (4+4)) = 30 values per ping/pong message. If the values are contiguous, the count gets higher.
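The same arithmetic as a small Python helper, in case you want to try other buffer sizes. The function and parameter names are hypothetical; the 9-byte overhead and the 4-byte address plus 4-byte float come from the numbers above, and the formula only covers the non-contiguous case.

```python
def values_per_message(tx_buffer_bytes, overhead_bytes=9, address_bytes=4, value_bytes=4):
    """Number of non-contiguous values that fit in one dump response."""
    usable = tx_buffer_bytes - overhead_bytes
    if usable < 0:
        return 0  # buffer too small to hold even the protocol overhead
    # Each non-contiguous value carries its own address plus its data.
    return usable // (address_bytes + value_bytes)


print(values_per_message(256))  # 30
```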
As for the custom UI, the answer is yes, and it's almost finished! The first version will be part of the next release (0.13), aimed for early June 2026. See the GitHub issue, where I'll post some screenshots for you: https://github.com/scrutinydebugger/scrutiny-main/issues/323
This was a great presentation of an excellent tool! It looks very fully featured, and the level of integration you’ve created across multiple domains (firmware, build tools, server, and GUI monitoring and interface) is truly impressive! Thank you for the presentation!
Impressive presentation and demo, many thanks! What I'd like to know is how much overhead the library on the target device will produce (CPU usage, flash, RAM)? Also, would it run on a board that has an Embedded Linux (Debian in our case) on it?
Hello, for the flash usage, on a Cortex-R52 core (arm32), the embedded library takes about 30 kB with all features enabled, the data logging feature being the heaviest. CPU usage is very low, as the library mostly does memcpy. Don't forget that you control the priority of execution: the idea is to put the main invocation in your idle task. RAM usage is also very low and depends mostly on the size of the buffers you allocate for communication and datalogging. You'll need 2x32 bytes to enable communication; that's the only requirement. I do not have a stack usage metric at hand, but I expect it to stay below 100 bytes.
Yes, you can instrument an application meant to run on Linux; that's what I do to test the embedded library (you will find something called "testapp" and "ctestapp" if you dig in the GitHub repo). Build with -fno-pic so that your global and static variables have an absolute address.
To test your app quickly, run "scrutiny elf2varmap yourbin.elf" and check the output. If you see variables of interest in the output, you will have access to them once the integration is complete.
Thank you for sharing and making this project open source! This was a very well put together demo. I am excited to try out this SW on some legacy HW that does not have a JTAG debugger port.
Thank you for the feedback! Glad it can be useful to you.
Great presentation and very cool project! Have you considered using ScrutinyDebugger for automated testing? With your instrumentation, you could inject stimuli and capture actions under the control of a testing script.
Absolutely!
The presentation is called "Debugging & Testing Embedded Software Through Instrumentation". The Python SDK has been designed with HIL testing in mind. It's one of the main purposes of this project.
There are even use case examples in the SDK doc: https://scrutiny-python-sdk.readthedocs.io/en/latest/use_cases.html
Great talk, thank you. The GUI was very impressive. It looks like a useful tool that could be used for testing effectiveness of different switch debugging techniques.
Thank you!

This sounds like such a cool project! It reminds me of Micrium's old program, uC/Probe, and I've been really wanting a replacement since they were bought by SiLabs and the project kind of went defunct, I think. Is there an eventual goal to let users create displays from gauges, meters, buttons, sliders, and such as in uC/Probe? Also, it seems like the maximum sampling rate is 10 kHz, is that right? How many streaming variables are supported at that rate?