Common Issues Embedded Software Engineers in Avionics Experience
While the use of DO-178B has certainly eliminated many common issues related to real-time development, there are issues that are common throughout the real-time embedded software industry that still plague the avionics world. I have picked several mistakes made by embedded software engineers in avionics that I believe are prevalent in the DO-178B/C industry and provide some tips for alternate ways around those issues.
#10 Large if-then-else and case statements
It’s not uncommon to see large if-then-else or case statements in embedded code. These are problematic from three perspectives:
- Such statements are extremely difficult to debug, because code ends up having so many different paths. If statements are nested it becomes even more complicated.
- The difference between best-case and worst-case execution time becomes significant. This leads to either under-utilizing the CPU, or the possibility of timing errors when the longest path is taken.
- The difficulty of structured code and decision coverage testing grows exponentially with the number of branches, so branches should be minimized.
Computational methods can often provide an equivalent answer. Performing Boolean algebra, implementing a finite state machine as a jump table, or using lookup tables are alternatives that can reduce a 100-line if-else statement to less than 10 lines of code.
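As a minimal sketch of the lookup-table alternative, the following replaces a nested if-else chain over three Boolean inputs with a single table index. The input names, the Command values, and the table contents are hypothetical and chosen only for illustration.

```c
/* Hypothetical commands; a real system would define its own. */
typedef enum { CMD_NONE, CMD_WARN, CMD_INHIBIT, CMD_SHUTDOWN } Command;

/* Index bits: bit 2 = over_temp, bit 1 = low_pressure, bit 0 = on_ground.
 * Each entry holds the result the equivalent if-else chain would produce. */
static const Command command_table[8] = {
    CMD_NONE,     /* 000 */
    CMD_NONE,     /* 001 */
    CMD_WARN,     /* 010 */
    CMD_INHIBIT,  /* 011 */
    CMD_WARN,     /* 100 */
    CMD_INHIBIT,  /* 101 */
    CMD_SHUTDOWN, /* 110 */
    CMD_SHUTDOWN  /* 111 */
};

Command select_command(int over_temp, int low_pressure, int on_ground)
{
    /* Normalize each input to 0 or 1, then pack the three bits into an index. */
    unsigned index = ((unsigned)(over_temp != 0) << 2)
                   | ((unsigned)(low_pressure != 0) << 1)
                   |  (unsigned)(on_ground != 0);
    return command_table[index];
}
```

Every call takes the same path through the code, so best-case and worst-case execution times are the same, and the decision logic can be reviewed as data rather than as branches.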
#9 Delays implemented as empty loops
Real-time software often uses delays to ensure that data sent or received over an I/O port has time to propagate. These delays are frequently implemented by inserting a few no-ops or empty loops (assuming volatile is used if the compiler performs optimizations). If this code is used on a different processor, or even the same processor running at a different rate (for example, a 25MHz vs. 33MHz CPU), the code may stop working on the faster processor. This is especially important to avoid, since it results in the kind of timing problem that is extremely difficult to track down and solve, because the symptoms of the problem are sporadic. Instead, use a mechanism based on a timer. Some DO-178B certified RTOSs provide these functions; if not, one can still easily be built.
Following are two possibilities for building a custom delay(int usec) function. Most count-down timers allow the software to read a register to obtain the current count-down value. A system variable can store the rate of the timer, in units such as microseconds per tick. Suppose the value is 2µs per tick, and a delay of 10µs is required: the delay function busy-waits for five timer ticks. If a processor of a different speed is used, the timer ticks are still the same. If the timer frequency changes, then the system variable changes and so does the number of ticks to busy-wait, but the delay time remains the same.
If the timer doesn’t support reading intermediate count-down values, an alternative is to profile the speed of the processor during initialization. Execute an empty loop continuously and count how many iterations complete between two timer interrupts. Since the frequency of the timer interrupt is known, a value for the number of microseconds per iteration can be computed. This value is then used to dynamically determine how many iterations of the loop to perform for a specified delay time.
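The timer-based approach might look like the sketch below. It assumes a free-running 16-bit count-down timer that wraps from 0 to 0xFFFF. The function read_timer_count is a hypothetical stand-in that simulates the register read so the code can run on a host machine, and delay_us and ticks_for_delay are illustrative names.

```c
#include <stdint.h>

/* Stand-in for the hardware register read. On a real target this would
 * return the timer's current count-down value; here it counts down by one
 * tick per call so the sketch can run on a host machine. */
static uint16_t sim_counter = 3u;
static uint16_t read_timer_count(void)
{
    return sim_counter--;   /* wraps from 0 to 0xFFFF like a free-running timer */
}

/* System variable: timer rate in microseconds per tick (2 us in the text). */
static uint32_t usec_per_tick = 2u;

/* Ticks needed for a delay, rounded up so the delay is never too short. */
uint32_t ticks_for_delay(uint32_t usec)
{
    return (usec + usec_per_tick - 1u) / usec_per_tick;
}

/* Busy-wait for at least usec microseconds; returns the ticks observed. */
uint32_t delay_us(uint32_t usec)
{
    const uint32_t needed = ticks_for_delay(usec);
    uint32_t elapsed = 0u;
    uint16_t last = read_timer_count();

    while (elapsed < needed) {
        uint16_t now = read_timer_count();
        /* Count-down timer: elapsed ticks = last - now. The 16-bit cast
         * keeps the result correct across the wrap from 0 to 0xFFFF. */
        elapsed += (uint16_t)(last - now);
        last = now;
    }
    return elapsed;
}
```

If the timer rate changes, only usec_per_tick needs to change. A timer that reloads at a value other than 0xFFFF would need a different wrap calculation.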
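The calibration arithmetic can be sketched as follows, assuming the start-up code has already counted how many empty-loop iterations ran between two timer interrupts. All function and variable names are hypothetical. For example, 5000 iterations in a 1000µs interrupt period gives 5 iterations per microsecond, so a 10µs delay needs 50 iterations.

```c
#include <stdint.h>

/* Calibration result, stored as loop iterations per millisecond to keep
 * integer precision on slow processors. */
static uint32_t loops_per_ms;

/* loops_counted: empty-loop iterations observed between two timer
 * interrupts; interrupt_period_us: the known interrupt period. */
void delay_calibrate(uint32_t loops_counted, uint32_t interrupt_period_us)
{
    if (interrupt_period_us == 0u) {
        return;   /* keep the previous calibration rather than divide by zero */
    }
    loops_per_ms = (uint32_t)(((uint64_t)loops_counted * 1000u)
                              / interrupt_period_us);
}

/* Iterations needed for a delay, rounded up so the delay is never short. */
uint32_t loops_for_delay(uint32_t usec)
{
    return (uint32_t)(((uint64_t)usec * loops_per_ms + 999u) / 1000u);
}

/* The loop body here must match the loop that was timed during
 * calibration, or the computed count will not correspond to real time. */
void delay_loop_us(uint32_t usec)
{
    volatile uint32_t remaining = loops_for_delay(usec);
    while (remaining > 0u) {
        remaining--;
    }
}
```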
#8 No analysis of hardware peculiarities before starting software design
How long does it take to add two eight-bit numbers? What about two 16-bit or 32-bit numbers? What about two floats? What if an eight-bit number is added to a float? A software designer who cannot answer these questions off the top of his or her head for the target processor isn’t adequately prepared to design and code real-time software. Here are sample measurements answering the above questions for a 6MHz Z180 (in microseconds): 7, 12, 28, 137, and 308. Note that float plus byte takes more than twice as long as float plus float (308 versus 137 microseconds), due to the long conversion time from byte to float. Such anomalies are often the source of code that overloads the processor.
In another example, a special purpose floating-point accelerator did floating-point addition/multiplication 10 times faster than a 33MHz 68882, but sin() and cos() took the same amount of time. This is because the 68882 has the trigonometric functions built into its hardware, while the floating-point accelerator did those particular functions in software.
When code is implemented for a real-time system, being aware of the timing implications of every single line of code is important. Understand the capabilities and limitations of the target processor(s), and redesign an application that makes excessive use of slow instructions. For example, for the Z180, doing everything in float is better than having only some variables float and lots of mixed-type arithmetic.
#7 One big Superloop
When real-time software is designed as a single big loop, we have no flexibility to modify the execution time of various parts of the code independently. Few real-time systems need to operate everything at the same rate. If the CPU is overloaded, one of the methods to reduce utilization is to selectively slow down only the less critical parts of the code. This approach works, however, only if the multitasking features of an RTOS are used, or the code was developed based on a real-time executive.
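To illustrate running parts of the code at independent rates, here is a minimal sketch of a table-driven, non-preemptive executive. It is not a substitute for an RTOS. The task names, the periods, and the executive_tick function are hypothetical, and the function is assumed to be called once per base timer tick.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*TaskFn)(void);

typedef struct {
    TaskFn   run;
    uint32_t period_ticks;   /* run once every N base ticks */
} Task;

/* Placeholder task bodies that only count how often they run. */
static unsigned fast_runs;
static unsigned slow_runs;
static void control_loop(void)   { fast_runs++; }
static void status_display(void) { slow_runs++; }

/* Changing one period here slows one task without touching the others. */
static const Task task_table[] = {
    { control_loop,    1u },
    { status_display, 10u },
};

/* Called once per base tick, for example from a periodic timer. */
void executive_tick(uint32_t tick)
{
    for (size_t i = 0; i < sizeof task_table / sizeof task_table[0]; i++) {
        if (tick % task_table[i].period_ticks == 0u) {
            task_table[i].run();
        }
    }
}
```

If the CPU is overloaded, raising the period of status_display reduces utilization without affecting control_loop, which a single loop running everything at one rate cannot do.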
#6 Too many inter-module dependencies
The dependencies between modules in a good software design can be drawn as a tree. A dependency diagram consists of nodes and arrows, such that each node represents a module (such as one source code file), and the arrows show dependencies between that node and other modules. Modules on the bottom-most row are not dependent on any other software module. To maximize software reusability, arrows should always point downward, and not upward or bi-directionally.
The dependency graph diagram is a valuable software engineering aid. Given such a diagram, it’s easy to identify what parts of the software can be reused, create a strategy for incremental testing of modules, and develop a method to limit error propagation through the entire system. Each circular dependency (a cycle in the graph) reduces the ability to reuse the software module. Testing can only occur for the combined set of dependent modules, and errors will be difficult to isolate to a single module. If the graph has too many cycles, or a major cycle exists where a module at the bottom-most level of the graph is dependent on the topmost module, then not a single module is reusable.
To best use dependency graphs to analyze the reusability and maintainability of software, write code that makes it easy to generate the graph. That is, all extern declarations for variables and functions exported by a module xxx should be placed in file xxx.h. In module yyy, simply looking at which files are #include’d allows determination of that module’s dependencies. If this convention is not followed, and an extern declaration is embedded in yyy.c instead of #including the appropriate file, then the dependency graph will be erroneous, and an attempt to reuse code that appears to be independent of the other module will be difficult.
#5 Using message passing as primary inter-process communication
When software is developed as functional blocks, the first thought is to implement inputs and outputs as messages. This works well in non-real-time environments such as distributed networking, and between major subsystems with different levels of real-time requirements, where its use is limited and non-blocking. It is problematic, however, in a real-time system when it is used between threads in a process or as the primary means of communicating between processes. Three major problems arise when using message passing like this in a real-time system:
- Message passing requires synchronization, a primary source of unpredictability to real-time scheduling. Functional blocks end up executing synchronously, and thus analysis of the system’s timing is difficult, if not impossible.
- In systems with bi-directional blocking communication between processes or any kind of feedback loop, deadlock is a possibility.
- Message passing incurs significantly more overhead as compared to shared memory.
While messages may be required for communication across networks and serial lines, it’s often inefficient when random-access to the data is possible, as is the case for inter-process communication on a single processor. State-based communication is preferred in embedded systems to provide higher assurance. A state-based system uses structured shared memory, such that communication has less overhead. The most recent data is always available to a process when the process needs it.
Steenstrup and Arbib (Port Automata and Algebra of Concurrent Processes, 1983) developed the port automaton theory to formally prove that a stable and reliable control system can be created by only reading the most recent data. Costly blocking is eliminated by creating local copies of shared data, to ensure that every process has mutually exclusive access to the information it needs. Using states instead of messages also provides robustness when messages may be lost or when code does not all execute at the same rate, and implementing them with shared memory generates less operating system overhead.
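A minimal sketch of state-based communication with local copies follows. The NavState structure, its fields, and the lock_shared and unlock_shared hooks are hypothetical. On a real target the hooks would be a short critical section, such as briefly disabling interrupts.

```c
#include <stdint.h>

/* Hypothetical state record; the fields are only for illustration. */
typedef struct {
    int32_t  altitude_ft;
    int32_t  airspeed_kt;
    uint32_t update_count;
} NavState;

/* Structured shared memory: holds only the most recent state. */
static NavState shared_nav;

/* Placeholder critical-section hooks, assumed to be very short. */
static void lock_shared(void)   { /* target-specific */ }
static void unlock_shared(void) { /* target-specific */ }

/* Writer: copy its local state out at the end of its cycle. */
void nav_publish(const NavState *local)
{
    lock_shared();
    shared_nav = *local;
    unlock_shared();
}

/* Reader: take a local snapshot at the start of its cycle, then work
 * on the copy without holding any lock. */
void nav_snapshot(NavState *local)
{
    lock_shared();
    *local = shared_nav;
    unlock_shared();
}
```

There is no queue. A reader that runs slower than the writer simply sees the latest value, and a reader that runs faster sees the same value twice, so neither side blocks waiting for the other.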
#4 Error detection and handling are an afterthought and implemented through trial and error
Error detection and handling are rarely incorporated in any meaningful fashion in the software design. Rather, the software design focuses primarily on normal operation, and any exception and error handling is added after the fact by the programmer. The programmer either puts in error detection everywhere, many times where it’s unnecessary but its presence affects performance and timing; or does not put in any error handling code except on an as-needed basis as workarounds for problems that arise during debugging and testing.
Either way, the error handling isn’t designed and its maintenance is a nightmare. Instead, error detection should be incorporated into the design of the system, just as any other state. Thus, if an application is built as a finite state machine, an exception can be viewed as an input that causes action and a transition to a new state.
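One way to treat an exception as just another input is a transition table in which the error event has its own column. The states, events, and transitions below are hypothetical and shown only as a sketch of the idea.

```c
/* Hypothetical states and events for illustration. */
typedef enum { ST_INIT, ST_RUN, ST_FAULT, ST_COUNT } State;
typedef enum { EV_READY, EV_SENSOR_TIMEOUT, EV_RECOVERED, EV_COUNT } Event;

/* transition[current][event]: the error event is designed in as a
 * column of the table, not added later as a special case. */
static const State transition[ST_COUNT][EV_COUNT] = {
    /*            EV_READY   EV_SENSOR_TIMEOUT  EV_RECOVERED */
    /* INIT  */ { ST_RUN,    ST_FAULT,          ST_INIT },
    /* RUN   */ { ST_RUN,    ST_FAULT,          ST_RUN  },
    /* FAULT */ { ST_FAULT,  ST_FAULT,          ST_INIT },
};

State fsm_step(State current, Event ev)
{
    /* Out-of-range values are themselves treated as a fault. */
    if ((unsigned)current >= ST_COUNT || (unsigned)ev >= EV_COUNT) {
        return ST_FAULT;
    }
    return transition[current][ev];
}
```

Because every state has an entry for the error event, a reviewer can see at a glance that no state is missing its error handling.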
Adding a lightweight or binary-encoded error logging system with timestamping, potentially with offline processing, to the infrastructure can provide this kind of error detection and handling with minimal overhead during development, and it can be preprocessed out or optimized for production builds.
#3 Configuration information in #define statements
Embedded programmers continually use #define statements in their code to specify register addresses, limits for arrays, and configuration constants. Although this practice is common, it is undesirable because it increases the difficulty of reusing the software for other similar applications.
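A binary-encoded log of this kind might be sketched as a small ring buffer of fixed-size records. The record layout, the capacity, and the LOG_DISABLED build switch are assumptions made for illustration. The timestamp is assumed to come from a hardware timer and is passed in by the caller here.

```c
#include <stdint.h>

enum { LOG_CAPACITY = 8 };   /* small ring; oldest entries are overwritten */

/* Fixed-size binary record, decoded offline rather than on the target. */
typedef struct {
    uint32_t timestamp;   /* for example, timer ticks since reset */
    uint16_t code;        /* numeric event or error identifier */
    uint16_t arg;         /* one event-specific value */
} LogEntry;

static LogEntry log_buf[LOG_CAPACITY];
static uint32_t log_count;   /* total events logged so far */

#ifndef LOG_DISABLED
/* Store one record: a few stores and an increment, with no formatting. */
void log_event(uint32_t timestamp, uint16_t code, uint16_t arg)
{
    LogEntry *entry = &log_buf[log_count % LOG_CAPACITY];
    entry->timestamp = timestamp;
    entry->code = code;
    entry->arg = arg;
    log_count++;
}
#else
/* Production builds can compile the calls away entirely. */
#define log_event(timestamp, code, arg) ((void)0)
#endif
```

As written, log_event is not safe to call from both an interrupt handler and a task without protecting log_count.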
The problem arises because a #define is expanded everywhere in the source code. The value might therefore show up at 20 different places in the code. If that value must change in the object code, pinpointing a single location to make the change isn’t easy. As an example of software reusability, suppose that code for an I/O device is implemented with every address of each register #defined. That same code can’t be reused if a second identical device is installed in the system. Instead, the code must be replicated, with only the port addresses changed. Alternately, a data structure that maps the I/O device registers can be used.
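The data-structure alternative could look like the sketch below. The register layout, the UART_TX_READY bit, and the base addresses mentioned in the comments are hypothetical. A real device's layout comes from its data sheet and may need padding or packing control.

```c
#include <stdint.h>

/* Hypothetical register map of a simple serial device. */
typedef struct {
    volatile uint8_t data;
    volatile uint8_t status;
    volatile uint8_t control;
} UartRegs;

enum { UART_TX_READY = 0x01 };   /* assumed status bit */

/* The driver takes the device as a parameter, so the same code serves
 * any number of identical devices. On a target, each instance would be a
 * pointer to that device's base address, for example:
 *     UartRegs *uart0 = (UartRegs *)0x4000;   (hypothetical address)
 *     UartRegs *uart1 = (UartRegs *)0x4010;   (hypothetical address)
 */
int uart_try_send(UartRegs *dev, uint8_t byte)
{
    if ((dev->status & UART_TX_READY) == 0) {
        return 0;   /* transmitter busy; caller may retry later */
    }
    dev->data = byte;
    return 1;
}
```

Adding a second identical device then means adding one more pointer, not a second copy of the driver.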
Another issue is that #define statements are frequently used instead of real constants with type declarations. #define statements should be limited to macros and to cases where scope requires a #define instead of a constant.
#2 Inappropriate use of or lack of use of interrupts
Interrupts are perhaps the biggest cause of priority inversion in real-time systems, causing the system to not meet all of its timing requirements. The reason for this delay is that interrupts preempt everything else and aren’t scheduled. If they preempt a regularly scheduled event, undesired behavior may occur. An ideal real-time system has no interrupts. Many programmers will put 80% to 90% of the application’s code into interrupt handlers. Complete processing of I/O requests and the body of periodic loops are the most common items placed in the handlers. Programmers claim that an interrupt handler has less operating system overhead, so the system runs better. While it’s true that a handler has less overhead than a context switch, the system doesn’t necessarily run better for several reasons:
- Handlers always have high priority and can thus cause priority inversion
- Handlers reduce the schedulable bound of the real-time scheduling algorithm, thus counteracting any savings in overhead as compared to a context switch
- Handlers execute within the wrong context and force the use of global variables to pass data to the processes
- Handlers are difficult to debug and analyze because few debuggers allow the setting of breakpoints within an interrupt handler.
Instead, minimize the use of interrupts when possible. For example, program interrupts so their only function is to signal an aperiodic process or server. Or convert handlers from periodically interrupting devices to periodic processes. If you must use interrupts, use only real-time analysis methods that account for the interrupt handling overhead. Never assume that overhead from interrupts and their handlers is negligible. Ensure mutually exclusive access to data buffers and registers in the interrupts. Do the absolute minimum amount of work possible in the handlers and exit.
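A handler that only signals a server might be sketched as follows. The names are hypothetical, and the received byte is passed in as a parameter so the sketch can run on a host. On a target the handler would read it from a device register.

```c
#include <stdint.h>

static volatile uint8_t rx_latch;     /* last byte captured by the handler */
static volatile uint8_t rx_pending;   /* signal from handler to server */

static uint32_t checksum;             /* stands in for the real processing */
static uint32_t bytes_processed;

/* Interrupt handler: capture the data, raise the signal, and exit. */
void uart_rx_isr(uint8_t hw_byte)
{
    rx_latch = hw_byte;
    rx_pending = 1u;
}

/* Aperiodic server, run by the scheduler at its own priority. Returns 1
 * if a byte was processed. On a target, the read-and-clear of the latch
 * and flag must be a critical section so a new interrupt cannot slip in
 * between the two statements. */
int rx_server_poll(void)
{
    if (!rx_pending) {
        return 0;
    }
    uint8_t byte = rx_latch;
    rx_pending = 0u;

    checksum += byte;        /* the time-consuming work happens here */
    bytes_processed++;
    return 1;
}
```

The work done at interrupt priority is a pair of stores, so its contribution to the timing analysis is small and easy to bound.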
#1 Using global variables
Global variables are often frowned upon by software engineers because they violate encapsulation criteria of object-based design and make it more difficult to maintain the software. While those reasons also apply to real-time software development, avoiding the use of global variables in real-time systems is even more crucial.
If using an RTOS, processes are implemented as threads or lightweight processes. Processes share the same address space to minimize the overhead for performing system calls and context switching. The side effect, however, is that a global variable is automatically shared among all processes. Thus, two processes that use the same module with a global variable defined in it will share the same value. Such conflicts will break the functionality; thus, the issue goes beyond just software maintenance.
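The conflict can be seen in a small sketch: a hypothetical smoothing module written once with a module-level global and once with caller-owned state. All names are illustrative.

```c
#include <stdint.h>

/* Problematic: one accumulator shared by every caller of the module. */
static int32_t g_accumulator;

int32_t smooth_global(int32_t sample)
{
    g_accumulator = (g_accumulator + sample) / 2;
    return g_accumulator;
}

/* Preferred: each caller owns its own state and passes it in. */
typedef struct {
    int32_t accumulator;
} Smoother;

int32_t smooth_update(Smoother *state, int32_t sample)
{
    state->accumulator = (state->accumulator + sample) / 2;
    return state->accumulator;
}
```

If one process feeds altitude samples and another feeds temperature samples through smooth_global, each corrupts the other's result. With smooth_update, each process keeps its own Smoother and the two never interact.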
Many real-time programmers use this to their advantage, as a way of obtaining shared memory. In such a case, however, care must be taken and any access to shared memory must be guarded as a critical section to prevent undesirable problems due to race conditions. Unfortunately, most mechanisms to avoid race conditions, such as semaphores, are not real-time friendly, and they can create undesired blocking and priority inversion. The alternatives, such as the priority ceiling protocol, use significant overhead.
The above mistakes, if avoided or rectified, can save weeks or months of manpower, especially as delivery nears. Avoiding them also increases the quality and robustness of the application and allows future reuse of the software for similar programs. If you need support on your project from a team who knows how to avoid these mistakes, contact Performance today to discuss how we can meet your needs.