Multiprocessor debugging challenges

Recognizing how significantly the task of debugging complex multiprocessing systems affects time to market, vendors are introducing solutions that capitalize on new processor devices with multiple cores and heterogeneous systems based on FPGAs and multiple processors.

Two key hurdles facing developers of large DSP systems are the management and debugging of these complex configurations. A look at a typical configuration reveals many processing nodes interconnected by a switched fabric such as Serial RapidIO. Developers have approached the challenge by wielding source-level debugging tools originally designed for single-processor applications, only to find that these don't scale well when applied to a complex DSP system with real-time dataflow. One of the most important reasons is that each instance of a debugger requires its own connection.

These large multiprocessor DSP systems can’t quit their day (or night) jobs. Performing such tasks as searching large parts of the electromagnetic spectrum for emissions, plus monitoring and processing many hundreds of active channels, demands that all the processors in a system be operating and interacting optimally with each other. In defense and aerospace applications such as synthetic aperture multi-mode radar or signals intelligence, if errors occur or performance is degraded, it is very difficult to identify where in the system this is taking place, and more difficult still to name the individual processor that may be the root cause of the problem.

Resource management

With systems comprising tens or even hundreds of processors, the typical edit/compile/load/debug cycle becomes tedious and error-prone, even if scripted. The system may contain processing nodes of different revision states. In addition, each processor may have its own image, or groups of processors may need to be loaded with the same image. Whenever a hardware or configuration change is made, revision states, as well as the required images, may change. Tools are becoming available to identify processors, their resources, groups of processors, and overall system configuration in order to automate the downloading and verification of current or latest versions of images before and during debugging sessions. This practice ensures that each processing node is of a known configuration and contains the appropriate image before debugging starts. These tools not only reduce the number of initialization errors, but also save considerable time. For example, a typical time saving of five to ten minutes per reset can add up to over an hour of active debugging time saved each day.
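The verification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any vendor tool works: the `Node` record, the group labels, and the byte strings standing in for images are all hypothetical, and a real tool would read the image or its revision back from the hardware.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Node:
    name: str            # hypothetical node identifier
    group: str           # hypothetical group label; a group shares one image
    loaded_image: bytes  # stand-in for the image currently on the node


def digest(image: bytes) -> str:
    """Fingerprint an image so that two copies can be compared cheaply."""
    return hashlib.sha256(image).hexdigest()


def stale_nodes(nodes, expected_by_group):
    """Return the names of nodes whose image differs from their group's expected image."""
    stale = []
    for node in nodes:
        expected = expected_by_group[node.group]
        if digest(node.loaded_image) != digest(expected):
            stale.append(node.name)
    return stale


# Illustrative data only: two groups, one node still holding an older image.
fft_rev2 = b"fft-image-rev2"
beam_rev1 = b"beamformer-image-rev1"
expected = {"fft": fft_rev2, "beamformer": beam_rev1}
nodes = [
    Node("node0", "fft", fft_rev2),
    Node("node1", "fft", b"fft-image-rev1"),  # out of date
    Node("node2", "beamformer", beam_rev1),
]

to_reload = stale_nodes(nodes, expected)
print(to_reload)
```

Only the nodes reported as stale need a fresh download before the debugging session starts, which is where the time saving per reset comes from.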

Grouping for an answer?

Traditional source-level debugging tools were created for single-processor environments. However, a typical complex DSP has many processors, often organized as groups sharing data or executing similar code and communicating within the group via a switched fabric or network.

Performance degradation or erroneous results may be caused by many potential issues such as:

- Race conditions
- Memory leaks
- Buffer starvation
- Buffer overflow
- Loss of synchronization

All these issues are difficult to pinpoint to a particular time period or processor. Developers have come up with various methods to monitor and debug groups of processors. One method uses instrumentation to capture and time-stamp the state of each processor at predetermined events within the user's code, or at the operating system call level, using a tool such as Wind River Systems' System Viewer. The instrumentation approach generally allows the system's performance to be monitored visually at a coarse level online, but so much data can be captured that offline analysis is often the only way to obtain the necessary granularity. Instrumentation is most useful for gaining insight into performance degradation before source-level debugging of an individual processor.
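As a rough illustration of the offline analysis that instrumentation makes possible, the Python sketch below merges time-stamped event logs from two processors into one timeline and reports the largest gap between consecutive events. It assumes all processors stamp events against a common timebase, which a real system would have to provide; the `Event` fields, processor names, and event labels are hypothetical and are not the format of any particular tool.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    timestamp_us: int  # assumed common timebase across processors
    processor: str     # hypothetical processor name
    label: str         # hypothetical event label


def merge_logs(*logs):
    """Merge per-processor logs, each already sorted by time, into one timeline."""
    return list(heapq.merge(*logs, key=lambda e: e.timestamp_us))


def longest_gap(timeline):
    """Return the pair of consecutive events separated by the largest time gap."""
    pairs = zip(timeline, timeline[1:])
    return max(pairs, key=lambda pair: pair[1].timestamp_us - pair[0].timestamp_us)


# Illustrative logs only.
log_dsp0 = [
    Event(100, "dsp0", "frame_start"),
    Event(250, "dsp0", "buffer_sent"),
    Event(900, "dsp0", "frame_start"),
]
log_dsp1 = [
    Event(120, "dsp1", "frame_start"),
    Event(700, "dsp1", "buffer_received"),
]

timeline = merge_logs(log_dsp0, log_dsp1)
before, after = longest_gap(timeline)
print(before.processor, before.label, "->", after.processor, after.label)
```

A large gap between a send on one processor and the matching receive on another is the kind of coarse hint that tells a developer which processor to attach a source-level debugger to first.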

Another method is to include in-line logging within the code, sending results to a console port. However, this is not really a practical approach for more than a small number of nodes and, being intrusive, it affects the performance and determinism of each processor. Similarly, using multiple instances of a single-processor debugger is clearly unwieldy beyond a small number of nodes. Nor does it address the issue of time-coherence, as breakpoints on each processor may occur at different times or be triggered by different events.

Breakpoint breakthrough

A better approach, one that relies on a single breakpoint on a single processor, addresses many of the downsides of the instrumentation and in-line logging methods.

Setting a single breakpoint on one processor and using that breakpoint to halt selected other processors at the same instant enables single stepping through the problem, using one or many processors, until the error point is located. Curtiss-Wright Controls Embedded Computing (CWCEC) has developed this unique capability to breakpoint multiple processors as part of its Continuum Insights tools (Figure 1) for DSP development, which also offers extensive complementary tools for resource and Flash management for use with its dual and quad multicomputing engines.
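The idea of one breakpoint halting a selected group can be modeled with a toy lockstep simulation in Python. This is a conceptual sketch only and makes no claim about how Continuum Insights or any hardware implements group halting; the `Processor` class, the program counter values, and the `run_until_breakpoint` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    pc: int = 0           # simulated program counter
    halted: bool = False

    def tick(self):
        """Free-running execution: advance one instruction unless halted."""
        if not self.halted:
            self.pc += 1

    def single_step(self):
        """Debugger step: advance one instruction while staying halted."""
        self.pc += 1


def run_until_breakpoint(processors, trigger, trigger_pc, group, max_ticks=1000):
    """Tick every processor in lockstep; when the trigger processor reaches
    trigger_pc, halt the whole group on that same tick and return the tick."""
    for tick in range(1, max_ticks + 1):
        for p in processors:
            p.tick()
        if trigger.pc == trigger_pc:
            for p in group:
                p.halted = True
            return tick
    return None


# Illustrative system: three processors, two of which form the debug group.
dsp = [Processor("dsp0"), Processor("dsp1", pc=10), Processor("dsp2", pc=20)]
group = dsp[:2]

hit_tick = run_until_breakpoint(dsp, trigger=dsp[0], trigger_pc=5, group=group)

for p in dsp:
    p.tick()          # only dsp2, outside the group, keeps running
for p in group:
    p.single_step()   # the halted group is stepped together

print(hit_tick, [(p.name, p.pc, p.halted) for p in dsp])
```

Because the group stops on the same tick, the state inspected on each processor belongs to the same moment, which is the time-coherence that separate debugger instances cannot offer.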

Figure 1: Continuum Insights System Monitoring Board View - CHAMP-AV6

Until recently, many of the management and debugging tools for multiprocessing environments were limited in their ability to resolve the complex, interactive, and time-critical problems of such systems. The additional time burden and effort faced by system engineers who must debug these complex multiprocessing systems can detrimentally affect time to market for systems critically needed by today’s warfighter. Vendors have now recognized these needs, and solutions to meet multiprocessor debugging challenges are set for further rapid expansion that takes advantage of new processor devices with multiple cores and heterogeneous systems based on FPGAs and multiple processors. The ability to efficiently debug multiprocessor systems will become increasingly important as these systems become the norm for the next generations of multi-spectral, reconfigurable sensors.

Robert Hoyecki is Director of Advanced Multi-Computing at Curtiss-Wright Controls Embedded Computing. Rob has 15 years of experience in embedded computing with a focus on signal processing products. He has held numerous leadership positions, including application engineering manager and product marketing manager. Rob earned a Bachelor of Science degree in Electrical Engineering Technology from Rochester Institute of Technology.

Rob can be reached at