Applying Behavior Driven Development To Embedded Systems
This presentation will show how to combine Dan North’s Behavior Driven Development (BDD) with James Grenning’s Test-Driven Development (TDD) for embedded systems. Whether you work in a highly-specified regulated industry or off the back of a napkin, BDD helps build software verified to implement the desired behaviors.
The presentation will first introduce the techniques. Then it will introduce strategies for applying them in the presence of embedded system complexities, such as:
- Concurrency and synchronization between execution contexts such as interrupt and main context
- Multiple threads and multiple cores
- Event-driven finite state machines
- RTOS interactions
- Hardware interactions
The goal is to produce lean, effective, maintainable embedded system software that meets the aspirations of Edsger Dijkstra’s 1972 ACM Turing Lecture:
“Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with, and as a result the programming process will become cheaper. If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.”
What this presentation is about and why it matters
How do you bring behavior-driven development into embedded work without turning it into a heavy testing ritual? Steve Branam tackles that tension with a practical, embedded-focused walkthrough built from long experience in bare-metal, RTOS, and embedded Linux systems. He frames BDD as a development practice, then shows how it maps onto real constraints like platform dependencies, off-target testing, test doubles, and behavior specifications written in Given-When-Then form. This is most useful if you work close to hardware and want a clearer way to structure tests, design, and feedback while building maintainable code.
Who will benefit the most from this presentation
- Embedded software engineers who are already using tests, but want a more behavior-focused workflow.
- Firmware developers working with hardware, RTOS calls, interrupts, or memory-mapped I/O.
- Tech leads or architects trying to make design more testable without over-coupling tests to internals.
- Engineers on mixed platform-independent and platform-dependent codebases who need faster feedback loops.
- Developers maintaining long-lived embedded products and looking for a better regression safety net.
What you need to know
A little familiarity with embedded software development will help, especially if your work includes any of these patterns:
- Basic unit testing or test-driven development concepts
- Common embedded concerns such as hardware dependencies, interrupts, and RTOS interactions
- Reading API-style tests and simple test fixtures
- Comfort with the idea of running some tests off target on a host machine
Glossary (terms used in this talk)
- TDD (Test-Driven Development): A development technique where tests are written before implementation and code is iterated until the tests pass.
- RTOS (Real-Time Operating System): An operating system designed to provide predictable timing behavior for real-time applications.
- Behavior-driven development (BDD): A development approach that expresses desired behavior in examples and tests. It helps teams align on expected outcomes and build confidence in changes.
- Developer test: A test written by the developer for the developer, usually to exercise one behavioral scenario at a time. It is used as a fast feedback mechanism while shaping code and design.
- Platform-independent code: Code that does not depend on hardware, RTOS services, or other target-specific behavior. This kind of code can often be exercised on a host machine with high feedback speed.
- Platform-dependent code: Code that relies on target-specific hardware, operating system services, or external libraries. It typically requires test doubles, host simulation, or on-target validation.
- Test double: A stand-in used during testing instead of a real collaborator. Test doubles include fakes, mocks, stubs, and similar substitutes, each trading realism, setup cost, and assertion style in different ways.
- Given-When-Then (GWT): An organizing pattern for specifying behavioral test scenarios: set preconditions (Given), perform the behavior (When), and assert postconditions (Then).
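To make the Given-When-Then pattern concrete, here is a minimal sketch of two developer tests for a small piece of platform-independent code. The button debouncer, its threshold parameter, and all function names are hypothetical, invented only for illustration; they are not taken from the talk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical debouncer: a raw input must differ from the stable value
 * for `threshold` consecutive samples before the stable value changes. */
typedef struct {
    bool stable;
    int count;
    int threshold;
} debouncer_t;

void debouncer_init(debouncer_t *d, int threshold)
{
    d->stable = false;
    d->count = 0;
    d->threshold = threshold;
}

bool debouncer_sample(debouncer_t *d, bool raw)
{
    if (raw == d->stable) {
        d->count = 0;                 /* nothing pending, reset the run */
    } else if (++d->count >= d->threshold) {
        d->stable = raw;              /* run was long enough, accept it */
        d->count = 0;
    }
    return d->stable;
}

void test_press_is_accepted_after_threshold_samples(void)
{
    /* Given: a released button and a threshold of 3 samples */
    debouncer_t d;
    debouncer_init(&d, 3);

    /* When: three consecutive pressed samples arrive */
    debouncer_sample(&d, true);
    debouncer_sample(&d, true);
    bool pressed = debouncer_sample(&d, true);

    /* Then: the press is reported */
    assert(pressed);
}

void test_glitch_restarts_the_count(void)
{
    /* Given: a released button and a threshold of 3 samples */
    debouncer_t d;
    debouncer_init(&d, 3);

    /* When: a pressed run is interrupted by one released sample */
    debouncer_sample(&d, true);
    debouncer_sample(&d, true);
    debouncer_sample(&d, false);
    bool pressed = debouncer_sample(&d, true);

    /* Then: the button is still reported as released */
    assert(!pressed);
}
```

Each test covers one behavioral scenario and only touches the debouncer through its public functions, so the tests stay valid if the internal counting strategy changes.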
Toolbox (mentioned in this talk)
- Git: A distributed version control system for tracking changes in source code and coordinating collaborative development.
- MATLAB: A high-level numerical computing environment widely used for signal processing, data analysis, and algorithm development.
- Cucumber: A behavior-driven development toolchain used to express examples in executable, human-readable form. It is commonly associated with Gherkin-style scenarios.
- SpecFlow: A behavior-driven development framework for .NET that binds structured scenarios to executable tests. It is often used for Gherkin-style specifications.
- JAMA: A requirements and test management platform used to organize specifications, traceability, and validation artifacts. It is commonly used in more formal development processes.
- DOORS: A requirements management tool used to capture, trace, and maintain formal system requirements. It is often used where documentation and traceability are important.
Final thoughts
Practical and opinionated, this talk gives embedded developers a concrete way to think about tests as part of design, not just verification. The value is a clearer mental model for separating behavior from implementation, plus a vocabulary for structuring host-side tests around real dependencies. It will especially help engineers who work near hardware, maintain large firmware codebases, or need to explain testability to a team. The spirit is simple: make the code easier to trust by making the behavior easier to see.
This overview is AI-generated from the session transcript.
The key to making the "given" part manageable is to use what Adam calls the "natural flow" in the way he shows it: by creating a set of helpers that advance to each state, layering the helpers so that each one depends on the previous. That chains through the FSM state by state.
Then create a test fixture for each state. The setup function for the fixture calls the helper function that gets to that state. Short and simple. It doesn't matter how many states the helper has to chain through or how deep those nested calls go. It all happens at machine speed, kicked off by a one-line function call in the setup function. Then if there's any remaining "given" setup for a particular test, it's operating in the context of already being in that state.
That's the big value of fixtures: they group a set of related tests together around common setup that's repeated for each of the test cases. So fixtures are themselves just another type of helper.
Helpers really are a significant part of BDD to keep the test suite simple and maintainable with minimum repetition. They make the test code super-DRY.
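As a minimal sketch of that layering in C: the lock FSM, its states and events, and every function name below are hypothetical, invented only to illustrate the pattern. Each `given_*` helper builds on the previous one, and each fixture setup is a one-line call to the helper for its state.

```c
#include <assert.h>

/* Hypothetical FSM under test (illustration only). */
typedef enum { ST_IDLE, ST_ARMED, ST_UNLOCKED } lock_state_t;
typedef enum { EV_ARM, EV_CODE_OK, EV_CODE_BAD, EV_TIMEOUT } lock_event_t;

typedef struct {
    lock_state_t state;
    int bad_codes;
} lock_t;

void lock_init(lock_t *lock)
{
    lock->state = ST_IDLE;
    lock->bad_codes = 0;
}

void lock_dispatch(lock_t *lock, lock_event_t ev)
{
    switch (lock->state) {
    case ST_IDLE:
        if (ev == EV_ARM) lock->state = ST_ARMED;
        break;
    case ST_ARMED:
        if (ev == EV_CODE_OK) lock->state = ST_UNLOCKED;
        else if (ev == EV_CODE_BAD) lock->bad_codes++;
        break;
    case ST_UNLOCKED:
        if (ev == EV_TIMEOUT) lock->state = ST_ARMED;
        break;
    }
}

/* Public accessors, so tests never reach into private data. */
lock_state_t lock_state(const lock_t *lock) { return lock->state; }
int lock_bad_codes(const lock_t *lock) { return lock->bad_codes; }

/* Layered "given" helpers: natural flow, one state at a time. */
void given_idle(lock_t *lock)     { lock_init(lock); }
void given_armed(lock_t *lock)    { given_idle(lock);  lock_dispatch(lock, EV_ARM); }
void given_unlocked(lock_t *lock) { given_armed(lock); lock_dispatch(lock, EV_CODE_OK); }

/* One fixture per state; setup is a single helper call. */
static lock_t fixture_lock;
void armed_fixture_setup(void)    { given_armed(&fixture_lock); }
void unlocked_fixture_setup(void) { given_unlocked(&fixture_lock); }

void test_armed_counts_a_bad_code(void)
{
    armed_fixture_setup();                          /* Given: armed       */
    lock_dispatch(&fixture_lock, EV_CODE_BAD);      /* When: bad code     */
    assert(lock_state(&fixture_lock) == ST_ARMED);  /* Then: stays armed  */
    assert(lock_bad_codes(&fixture_lock) == 1);     /*   and counts it    */
}

void test_unlocked_rearms_on_timeout(void)
{
    unlocked_fixture_setup();                       /* Given: unlocked    */
    lock_dispatch(&fixture_lock, EV_TIMEOUT);       /* When: timeout      */
    assert(lock_state(&fixture_lock) == ST_ARMED);  /* Then: armed again  */
}
```

A test framework such as CppUTest would call each fixture's setup automatically before every test in the group; here the setup is called by hand only to keep the sketch standalone. Adding a fourth state would mean adding one more helper that calls `given_unlocked` and dispatches one more event.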
I enjoyed your presentation. I have been developing embedded systems in a variety of industries since the very early 1980s. I now realize that I have been taking a BDD approach all along; I just didn't know there would one day be a name for it, and I didn't write formalized unit tests until I met James Grenning and was introduced to TDD about a decade ago. Before that I did a ton of testing at the outer layers/interfaces during development, even if it meant having to develop bespoke test software and equipment. As you pointed out, during development you will find more and more features to add as you proceed, and you will have very high confidence in the features already developed and tested. When a problem comes up it is usually easy to spot because it will likely be in something you just added.
I’m developing my first Zephyr app. It uses TF-M. The SPE is simple, exposing only a few API calls. The NSPE is more complex and consists of 12 tasks.
I would like to test each of these tasks, as I write them, with BDD. Would I use QEMU to simulate Zephyr? Are there other alternatives?
Unfortunately I don't have direct experience with Zephyr on QEMU, but my understanding is that it should work. I also don't have enough experience with Zephyr to say if there are other alternatives.
Remember that the premise here is that by segregating Platform Dependent (PD) code (the things that depend on Zephyr and the specific target HW) from Platform Independent (PI) code, using lightweight abstraction layers to do that, you can develop the PI code with BDD. That leaves you with reliable building blocks. So here the PI code is the code that runs inside your Zephyr tasks once a given task takes control.
Another side to that premise is that only a small portion of the code is actually PD, for instance the specific Zephyr calls that you make. That includes any task management and concurrency control.
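A minimal sketch of what such a lightweight abstraction layer could look like in C follows. The `platform_t` interface, the blinker logic, and all names are hypothetical, chosen only for illustration. The PI code sees only two function pointers. On target, those would be filled in with thin PD wrappers around the real RTOS and hardware calls (not shown here); on the host, they are filled in with fakes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical abstraction layer: all the PI code knows about the platform. */
typedef struct {
    uint32_t (*now_ms)(void);
    void (*set_led)(bool on);
} platform_t;

/* PI code: toggles an LED once per period. No RTOS or HW calls in here. */
typedef struct {
    const platform_t *platform;
    uint32_t period_ms;
    uint32_t last_toggle_ms;
    bool led_on;
} blinker_t;

void blinker_init(blinker_t *b, const platform_t *p, uint32_t period_ms)
{
    b->platform = p;
    b->period_ms = period_ms;
    b->last_toggle_ms = p->now_ms();
    b->led_on = false;
    p->set_led(false);
}

void blinker_poll(blinker_t *b)
{
    uint32_t now = b->platform->now_ms();
    /* Unsigned subtraction stays correct across tick counter wraparound. */
    if ((uint32_t)(now - b->last_toggle_ms) >= b->period_ms) {
        b->led_on = !b->led_on;
        b->platform->set_led(b->led_on);
        b->last_toggle_ms = now;
    }
}

/* Host-side fakes standing in for the PD layer. */
static uint32_t fake_time_ms;
static bool fake_led_on;
static int fake_led_writes;

static uint32_t fake_now_ms(void) { return fake_time_ms; }
static void fake_set_led(bool on) { fake_led_on = on; fake_led_writes++; }

static const platform_t fake_platform = { fake_now_ms, fake_set_led };

void test_led_turns_on_after_one_period(void)
{
    blinker_t b;
    fake_time_ms = 0;
    fake_led_writes = 0;
    blinker_init(&b, &fake_platform, 500);  /* Given: LED off, 500 ms period */

    fake_time_ms = 500;                     /* When: one period elapses      */
    blinker_poll(&b);

    assert(fake_led_on);                    /* Then: LED is switched on      */
}

void test_led_unchanged_before_period_elapses(void)
{
    blinker_t b;
    fake_time_ms = 0;
    fake_led_writes = 0;
    blinker_init(&b, &fake_platform, 500);  /* Given: LED off, 500 ms period */

    fake_time_ms = 499;                     /* When: just under one period   */
    blinker_poll(&b);

    assert(!fake_led_on);                   /* Then: LED is still off        */
    assert(fake_led_writes == 1);           /*   only the init write occurred */
}
```

Because the tests control time through the fake, they run instantly on the host and never need a real timer, a task scheduler, or an emulator. The PD wrappers that remain are small enough to check on target or in simulation separately.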
When working on state machines, for example, the 'given' may require reaching a specific state.
In Test Your State Machine Monstrosities, Adam Fraser-Kruck showed two ways to reach that state in a test: either "hand of god" (manually setting the state variable) or "natural flow", where you go through each step to reach the initial state of your test. I get that the first one is contrary to BDD principles, as it requires access to private data. For the second, though, I'm wondering how a BDD test would work: I'm afraid the 'given' part of the test can get very long, so you may end up with a test that is too big.
How would you handle such scenarios?