Logging and diagnostics are an important aspect of any embedded project.
In many cases, using an interactive debugger is not possible or practical. Hardware state can change within a few milliseconds, so after a program stops on a breakpoint, a developer does not have enough time to analyze that state before it changes. For high-performance, multithreaded, time-sensitive embedded systems, collecting detailed log data and using tools to analyze and visualize it is a better approach.
Because resources are limited in most cases, developers often have to make tradeoffs. On the one hand, they need to collect as much data as possible to identify the root cause of a failure: whether it lies in the software or the hardware, the status of the hardware components at the time of the failure, and the accurate timing of the hardware and software events handled by the system. On the other hand, the space available for the log is limited, and every write to the log affects overall performance.
A common solution is to buffer log data locally on the device and send it to a remote system for detailed analysis.
This approach works well during the development of embedded software. However, diagnosing deployed systems requires more sophisticated techniques.
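One way to picture local buffering is a fixed-size ring buffer that keeps the most recent records and hands them to a transport in batches. The sketch below is only an illustration under assumptions: `LogRecord`, `LogBuffer`, and the `send` callback are hypothetical names, and the actual transport (UART, network, and so on) is left to the caller.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical compact log record: a timestamp plus a numeric event code.
struct LogRecord {
    std::uint32_t timestamp_ms;
    std::uint16_t event_id;
    std::uint16_t value;
};

// Fixed-capacity ring buffer. Writing never allocates memory; when the
// buffer is full, the oldest record is overwritten and counted as dropped.
template <std::size_t Capacity>
class LogBuffer {
    static_assert(Capacity > 0, "Capacity must be positive");

public:
    void push(const LogRecord& rec) {
        records_[head_] = rec;
        head_ = (head_ + 1) % Capacity;
        if (count_ < Capacity) {
            ++count_;
        } else {
            ++dropped_;  // an older record was just overwritten
        }
    }

    // Passes buffered records to `send`, oldest first, then empties the
    // buffer. `send` stands in for whatever transport the device uses.
    template <typename Sink>
    void flush(Sink&& send) {
        const std::size_t start = (head_ + Capacity - count_) % Capacity;
        for (std::size_t i = 0; i < count_; ++i) {
            send(records_[(start + i) % Capacity]);
        }
        count_ = 0;
    }

    std::size_t size() const { return count_; }
    std::size_t dropped() const { return dropped_; }

private:
    std::array<LogRecord, Capacity> records_{};
    std::size_t head_ = 0;     // index of the next slot to write
    std::size_t count_ = 0;    // number of valid records
    std::size_t dropped_ = 0;  // number of overwritten records
};
```

Fixed-size binary records keep each write cheap and the memory footprint predictable, at the cost of losing the oldest data when the device cannot flush often enough. This sketch is not thread-safe; a real system would need to guard `push` and `flush` if they are called from different threads or interrupt handlers.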
Many embedded systems work offline and do not provide convenient access to internal logs. Developers need to carefully design and implement other means of diagnostics and reporting. If a system does not have a display, LED indicators or beeps are often used to encode various error conditions. They are sufficient for indicating the failure category but in most cases cannot provide the details needed to narrow it down to the root cause.
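As an illustration of how an LED can encode a failure category, the following sketch turns an error code into a sequence of on/off steps: N short blinks followed by a long pause. The categories, timings, and the names `ErrorCategory`, `BlinkStep`, and `make_blink_pattern` are all assumptions made for this example; driving the actual GPIO pin is left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical failure categories; the numeric value is the blink count.
enum class ErrorCategory : std::uint8_t {
    PowerFault = 1,
    SensorFault = 2,
    StorageFault = 3,
};

// One step of the pattern: LED state and how long to hold it.
struct BlinkStep {
    bool led_on;
    std::uint16_t duration_ms;
};

// Builds one cycle of the pattern: N short blinks, where N is the category
// value, ending with a long pause that separates repetitions.
std::vector<BlinkStep> make_blink_pattern(ErrorCategory category) {
    constexpr std::uint16_t kOnMs = 200;      // assumed blink length
    constexpr std::uint16_t kOffMs = 200;     // assumed gap between blinks
    constexpr std::uint16_t kPauseMs = 1500;  // assumed gap between cycles

    std::vector<BlinkStep> steps;
    const auto blinks = static_cast<std::uint8_t>(category);
    for (std::uint8_t i = 0; i < blinks; ++i) {
        steps.push_back({true, kOnMs});
        steps.push_back({false, i + 1 < blinks ? kOffMs : kPauseMs});
    }
    return steps;
}
```

A caller would replay the returned steps in a loop, setting the LED and waiting for each duration. The pattern carries only the category, which matches the limitation described above: it tells a field engineer where to look, not what exactly went wrong.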
Many embedded devices have dedicated diagnostic modes that are used to test the hardware components. After powering up, virtually any device or appliance performs a Power-On Self-Test (POST), which runs quick tests of the hardware. These tests are meant to be fast and do not cover all testing scenarios. That is why many devices have hidden service modes that can be activated by developers or field engineers to perform more thorough tests.
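The split between a quick POST and a more thorough service mode can be sketched as a table of tests, each flagged as quick or not. Everything here is hypothetical: `SelfTest`, `DiagMode`, `run_self_tests`, and the two sample checks are placeholders, since real checks depend entirely on the hardware.

```cpp
#include <cstddef>
#include <cstdint>

// Which set of tests to run: the fast power-on subset or everything.
enum class DiagMode { PowerOn, Service };

// One entry in the test table. `quick` marks tests cheap enough for POST.
struct SelfTest {
    const char* name;
    bool (*run)();  // returns true when the check passes
    bool quick;
};

// Placeholder check: writes two bit patterns to a memory cell and reads
// them back. A real RAM test would cover a memory region, not one word.
inline bool check_ram_pattern() {
    volatile std::uint32_t cell = 0;
    cell = 0xA5A5A5A5u;
    if (cell != 0xA5A5A5A5u) return false;
    cell = 0x5A5A5A5Au;
    return cell == 0x5A5A5A5Au;
}

// Placeholder for a slower peripheral check; always passes in this sketch.
inline bool check_sensor_present() { return true; }

// Runs the tests selected by `mode` and returns a bitmask of failures:
// bit i is set when tests[i] ran and failed.
template <std::size_t N>
std::uint32_t run_self_tests(const SelfTest (&tests)[N], DiagMode mode) {
    static_assert(N <= 32, "the failure mask holds at most 32 tests");
    std::uint32_t failed = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (mode == DiagMode::PowerOn && !tests[i].quick) {
            continue;  // skip slow tests during power-on
        }
        if (!tests[i].run()) {
            failed |= (std::uint32_t{1} << i);
        }
    }
    return failed;
}
```

With a table such as `{{"ram", check_ram_pattern, true}, {"sensor", check_sensor_present, false}}`, calling `run_self_tests(tests, DiagMode::PowerOn)` would run only the RAM check, while `DiagMode::Service` would run both. The returned mask could then feed whatever reporting channel the device has.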