
Live Q&A - Self-testing in Embedded Systems

Colin Walls - EOC 2021 - Duration: 19:50

Live Q&A with Colin Walls for the talk titled Self-testing in Embedded Systems

Score: 0 | 3 years ago | 1 reply

Great talk Colin, would you recommend a book to go even deeper on this topic?

Score: 0 | 3 years ago | no reply

An interesting idea. The topic does have potential for more exploration.

Score: 0 | 3 years ago | 1 reply

Hi Colin, brilliant presentation. In the unfortunate event of having the guard words on a stack overwritten due to under/overflow, presumably a reset would be the only sensible way out at that point, as the task relying on that stack would be operating with an undefined context?

Score: 0 | 3 years ago | no reply

This depends on the application. Once the guard word has been corrupted, we know the stack has overflowed, but there is still a chance that nothing else has been corrupted. Depending on what the system is, there might be the possibility of going into a "safe state" instead of a full reset, if continuing operation would be highly desirable.
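To make the reset versus "safe state" choice concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the `guarded_stack_t` layout, the `GUARD_WORD` pattern, and the function names are hypothetical, and a real RTOS would place the guard words in its own stack area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_WORD  0xDEADBEEFu  /* arbitrary pattern, assumed for illustration */
#define STACK_WORDS 64           /* hypothetical stack size */

/* Hypothetical task stack with a guard word at each end. */
typedef struct {
    uint32_t low_guard;
    uint32_t stack[STACK_WORDS];
    uint32_t high_guard;
} guarded_stack_t;

typedef enum { STACK_OK, STACK_SAFE_STATE, STACK_RESET } stack_action_t;

/* Write the guard pattern before the task starts using the stack. */
void guarded_stack_init(guarded_stack_t *s)
{
    assert(s != NULL);
    s->low_guard = GUARD_WORD;
    s->high_guard = GUARD_WORD;
}

/* One check of both guard words. The policy on corruption is
   application-specific: fall back to a safe state if the system
   has one, otherwise request a reset. */
stack_action_t guarded_stack_check(const guarded_stack_t *s,
                                   bool safe_state_available)
{
    assert(s != NULL);
    if (s->low_guard == GUARD_WORD && s->high_guard == GUARD_WORD) {
        return STACK_OK;
    }
    return safe_state_available ? STACK_SAFE_STATE : STACK_RESET;
}
```

The caller decides what `STACK_SAFE_STATE` and `STACK_RESET` actually do; the sketch only separates detection from policy.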

Score: 0 | 3 years ago | 1 reply

Hi Colin, great talk.
I have a question related to self-testing in a coupled system comprising 2 separate interfaces (PCBs, MCUs, SBCs). Both are equally vital for the overall system to function.
What kind of mechanism would you set up on each side to ensure the overall safe operation of the system?
For example, would you consider using a heartbeat signal on a GPIO that, if not received, will trigger a power reset of the other interface?
Would it make sense for the 2 interfaces to share their current state with each other, so that if a reset is needed, they can quickly fetch the latest state?

Score: 0 | 3 years ago | no reply

I would say that, if you have 2 equally vital processors, each one monitoring the other would be a very good approach. You could use either kind of watchdog mechanism: a heartbeat that, if it stops, triggers a reset, or a ping that needs a response within a specific timeframe. Which one depends on your software architecture.
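As a rough sketch of the heartbeat variant, each processor could run something like the following against a millisecond tick. The names (`heartbeat_monitor_t`, `heartbeat_expired`, and so on) and the tick source are assumptions, not part of any particular RTOS or HAL; the reset action itself is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitor for the peer processor's heartbeat. */
typedef struct {
    uint32_t last_beat_ms;  /* tick value when the last beat was seen */
    uint32_t timeout_ms;    /* longest silence tolerated */
} heartbeat_monitor_t;

void heartbeat_init(heartbeat_monitor_t *m, uint32_t now_ms, uint32_t timeout_ms)
{
    assert(m != NULL);
    m->last_beat_ms = now_ms;
    m->timeout_ms = timeout_ms;
}

/* Call from the GPIO edge handler (or message handler) on each beat. */
void heartbeat_received(heartbeat_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);
    m->last_beat_ms = now_ms;
}

/* Call periodically; true means the peer has been silent too long.
   Unsigned subtraction keeps the comparison valid across tick wraparound. */
bool heartbeat_expired(const heartbeat_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);
    return (uint32_t)(now_ms - m->last_beat_ms) > m->timeout_ms;
}
```

A ping-and-response scheme could reuse the same timeout check, with `heartbeat_received` called when the reply arrives rather than on a free-running signal.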

Score: 0 | 3 years ago | 1 reply

Great talk!! I really enjoyed your presentation. Thank you

Score: 0 | 3 years ago | no reply

Thanks for the feedback.

Score: 1 | 3 years ago | 2 replies

When testing memory, it's a good idea to put an access to another memory location before reading back the value in the location under test. I've worked with systems that had enough capacitance on the data bus that you would read back whatever you wrote last, which means that write/read/check memory tests would pass even if there was no memory present at that address.
Also, during a pattern test at powerup, adding reads at powers of two can be used to determine memory size, and sometimes whether there's a problem with the address lines.
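A sketch of both ideas, under some loud assumptions: memory is reached through a hypothetical `mem_bus_t` pair of access hooks, and the `sim_*` functions model a floating bus where a read of an absent address returns whatever value was last on the bus. On real hardware the hooks would be plain volatile accesses, and caches would need to be disabled or bypassed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access hooks: real hardware or a simulation can sit behind them. */
typedef struct {
    void     (*write)(void *ctx, size_t addr, uint32_t value);
    uint32_t (*read)(void *ctx, size_t addr);
    void      *ctx;
} mem_bus_t;

/* Write/read check with an access to a different location in between,
   so a value lingering on the bus cannot masquerade as stored data.
   `other` must differ from `addr`. */
bool mem_cell_ok(const mem_bus_t *bus, size_t addr, size_t other, uint32_t pattern)
{
    assert(bus != NULL && addr != other);
    bus->write(bus->ctx, addr, pattern);
    bus->write(bus->ctx, other, ~pattern);  /* leaves ~pattern on the bus */
    return bus->read(bus->ctx, addr) == pattern;
}

/* Probe word offsets 1, 2, 4, ... and return the first offset that is
   absent or aliases onto offset 0, or max_words if none is found. */
size_t mem_probe_words(const mem_bus_t *bus, size_t max_words)
{
    const uint32_t base_mark = 0xA5A5A5A5u;
    assert(bus != NULL);
    bus->write(bus->ctx, 0, base_mark);
    for (size_t off = 1; off < max_words; off <<= 1) {
        uint32_t mark = ~(uint32_t)off;
        bus->write(bus->ctx, off, mark);
        if (bus->read(bus->ctx, 0) != base_mark) return off;  /* wrapped onto 0 */
        if (bus->read(bus->ctx, off) != mark) return off;     /* nothing there */
    }
    return max_words;
}

/* Simulation only: `present` words exist (at most 16); absent reads
   return the last value seen on the bus. */
typedef struct {
    uint32_t words[16];
    size_t   present;
    uint32_t bus_latch;
} sim_mem_t;

static void sim_write(void *ctx, size_t addr, uint32_t value)
{
    sim_mem_t *m = ctx;
    if (addr < m->present) m->words[addr] = value;
    m->bus_latch = value;
}

static uint32_t sim_read(void *ctx, size_t addr)
{
    sim_mem_t *m = ctx;
    if (addr < m->present) m->bus_latch = m->words[addr];
    return m->bus_latch;
}
```

Both routines overwrite the locations they touch, so they suit a power-up test rather than a background check on live data.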

Score: 1 | 3 years ago | 1 reply

I assume this is complicated even further by higher end micros now having data caching too...

Score: 0 | 3 years ago | no reply

Thanks Steve. Useful refinements.

Score: 0 | 3 years ago | 1 reply

I'm not sure I understand how you would use a guard word for protecting against array bounds violations. The stack example seemed easy enough. However, I don't think spinning up a task for every array to monitor the guard word is practical.

Score: 2 | 3 years ago | no reply

You would not have a task for each array. A single task could take on a number of self-testing activities.
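One way this might look, as a sketch only: each guarded array registers the address of its guard word in a small table, and a single self-test task walks the table on every pass. The table size, the `ARRAY_GUARD` pattern, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_GUARD 0x5AFEC0DEu  /* arbitrary pattern, assumed for illustration */
#define MAX_GUARDS  8            /* hypothetical table size */

static volatile uint32_t *guard_table[MAX_GUARDS];
static size_t guard_count;

/* Register a guard word (typically placed just after an array)
   and initialise it to the guard pattern. */
bool guard_register(volatile uint32_t *guard)
{
    assert(guard != NULL);
    if (guard_count >= MAX_GUARDS) {
        return false;
    }
    *guard = ARRAY_GUARD;
    guard_table[guard_count++] = guard;
    return true;
}

/* One pass of the self-test task: returns the index of the first
   corrupted guard word, or -1 if all are intact. */
int guard_check_all(void)
{
    for (size_t i = 0; i < guard_count; i++) {
        if (*guard_table[i] != ARRAY_GUARD) {
            return (int)i;
        }
    }
    return -1;
}
```

The same task could call this alongside its other checks, such as the stack guard words, on whatever period suits the application.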

Score: 0 | 3 years ago | 1 reply

Colin, in this talk you say "Don't think a pointer is an address. There's more to it than that." I think I have a pretty good understanding of pointers but I'm wondering if you could briefly elaborate on this statement.

Thanks for a great talk!

Score: 3 | 3 years ago | 1 reply

Thanks Gary.
The main difference between a pointer and an address is illustrated when you do arithmetic on a pointer. You can increment a pointer in various ways:
int *p;
p += 1;
p = p + 1;
These are all entirely equivalent. In each case, p is incremented to point to the next object of type int. As an int is typically 4 bytes, 4 is added to the value of p.
Hopefully this clarifies what I was going on about.
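For anyone who wants to see this on their own toolchain, a small sketch follows. The function name `int_pointer_step` is made up for illustration; it measures how many bytes the address moves when an `int` pointer is incremented by one.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of bytes an int pointer advances when 1 is added. */
size_t int_pointer_step(void)
{
    int values[3] = {10, 20, 30};
    int *a = values;
    int *b = values;

    a += 1;     /* one form of increment */
    b = b + 1;  /* the equivalent long form */

    /* Both now point at the second element, not at "address + 1". */
    assert(a == b && *a == 20);

    /* Viewed as byte addresses, the pointer moved by sizeof(int). */
    return (size_t)((char *)a - (char *)values);
}
```

The returned value equals `sizeof(int)`, which is 4 on many 32-bit targets but is not guaranteed by the language.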

Score: 0 | 3 years ago | no reply

Definitely. I do understand this. I just wasn't quite sure what you meant in the discussion. Thanks for (both of) your reply(s).

09:42:52	 From  Rocco Brandi : Can you give an example of trap handler?
09:43:01	 From  Erwin : Really nice overview! You mentioned static analysis tools for stack usage prediction. Have you some examples?
09:43:38	 From  Cezar Burlacu : Hi, do you recommend any tool for static stack analysis?
09:44:58	 From  Michael Kirkhart : +1 for Morse code for error messages
09:45:14	 From  Jeremy Erdmann : You talk about overriding [] operator for array bound violations.  What are your thoughts on things like C++'s std::array which requires a defined size?  Or perhaps a plug for the Embedded Template Library's etl::array.  Similarly, the ETL has the concept of a span for when passing arrays around your application.
09:45:31	 From  jvillasante : I liked the idea of checks on boot up (like the RAM check). Can you expand on that? What other checks would you think would be useful on boot up to have more confidence on a clean and well behaved system before launching applications?
09:46:38	 From  Steve Wheeler : Back in the late 1970s, my local computer club had a blind member who used an Apple II that had a redirect on character output to produce Morse code on the speaker.
09:48:07	 From  Dan Rittersdorf : Can you talk about background testing of RAM in a homogenous or heterogeneous multicore system where interrupt level won't be enough to guarantee another core won't read while the built-in test is writing RAM?    Is there a way to do that safely?
09:48:23	 From  Scott K. : MCAS anybody?
09:51:22	 From  David Kanceruk : Speaking of car stories, here's one you might enjoy: https://www.eetimes.com/ok-your-dad-primes-his-fuel-injected-engine-with-gas-and-you-fixed-it-with-a-ttl-logic-probe-what-kind-of-family-is-this-anyway/#
09:53:26	 From  busa2191 : I have had the devils job with the reset with peripherals not working till 2.5 volts. And NXP embedded uC's working to from 1.8 volts beside a voltage supervisor any issues on voltage bumping on the way.
09:57:16	 From  EricS : Texas Instruments can compute worst case stack usage with its compiler toolchain.  (Caveats: You need to manually inform it about function pointer usage. They have a way to annotate assembler routines so that their tool picks up the stack usage of assembler code.  You need to know the entry function for each one of your threads.)
09:57:26	 From  Pradeepa : Do you think self testing, given the engineering cost, needs to be implemented in devices that work in non critical applications, ex: consumer electronics?
09:58:27	 From  JackW : Could you elaborate on loop back testing of peripherals? Is this as simple as a verify of peripheral registers
09:58:32	 From  patelk5 : Can static code analysis be sufficient for a lot of the software errors you mentioned (such as array bounds checking)?
09:59:44	 From  Keith J : Thank you Colin!
09:59:55	 From  patelk5 : Thanks Colin!
10:00:11	 From  Pradeepa : Thanks Collin!
10:00:11	 From  Tom.Davies : Thank you
10:00:13	 From  Dan Rittersdorf : Great talk.   Thank you!
10:00:14	 From  Jorge Conti : Thanks Colin!
10:00:22	 From  Agnes B : Thank you
10:00:24	 From  Gary : Thanks again Colin!
10:00:30	 From  Erwin : Thank You!
10:00:30	 From  JackW : Thank you
10:00:34	 From  Lee Thalblum : Thanks Colin!
10:00:50	 From  Scott H : Thanks Colin