Leveraging FPGA coprocessors to optimize high-performance digital video surveillance systems

Digital video surveillance systems now offer additional capabilities that make them an effective alternative to traditional analog systems. In addition to offering advanced video compression techniques, such as MPEG-4 and H.264, these systems can now be augmented with algorithms, such as stabilization, panorama, and video motion detection.

Typical requirements for commercial video surveillance systems include the following:

  • Support for one to 16 cameras
  • Advanced video compression such as MPEG-4, JPEG2000, and H.264
  • Low latency encoding (one to three frames)
  • Simultaneous view and record at different frame rates
  • Encoding resolutions ranging from Common Intermediate Format (CIF, approximately VCR resolution) up to D1 (approximately DVD resolution)
  • Video rates ranging from two frames per second (home security) up to 30 frames per second (casinos and other premium systems)

Enhancing video surveillance quality

Given a fixed bandwidth, several methods can improve video quality: advanced video compression, region-of-interest definition, image stabilization, and panorama. The most common video compression technique used today is MPEG-4. However, developers are beginning to adopt newer compression techniques, such as H.264 and JPEG2000, to improve video quality, significantly enhancing detection capabilities.

Defining areas of greater interest in terms of surveillance can also enhance video quality. In areas of low interest, the system can increase the level of video compression, reducing the video bandwidth and processing load dedicated to those regions. This, in turn, enables the system to focus more closely on areas of high interest such as outer doors, windows, interiors of high security areas, or anticipated or previously detected motion areas. Essentially, by defining areas of interest and focusing on those areas of greater concern, the system can reduce the number of false alarms while increasing the likelihood of detecting a true security breach.
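The region-of-interest idea can be sketched as a per-region quantizer map: low-interest areas receive coarser quantization (a higher quantizer value), so less bandwidth is spent on them. This is an illustrative Python sketch with hypothetical names and parameters, not a production encoder interface:

```python
import numpy as np

def build_quant_map(shape, rois, q_low_interest=40, q_high_interest=10):
    """Return a per-macroblock quantizer map for the frame.

    shape            -- (rows, cols) of the macroblock grid
    rois             -- list of (y0, y1, x0, x1) high-interest rectangles
    q_low_interest   -- coarse quantizer outside any ROI (more compression)
    q_high_interest  -- fine quantizer inside each ROI (better quality)
    """
    qmap = np.full(shape, q_low_interest, dtype=np.int32)
    for (y0, y1, x0, x1) in rois:
        qmap[y0:y1, x0:x1] = q_high_interest   # preserve detail in ROIs
    return qmap
```

An encoder would then look up each macroblock's quantizer in this map, so a doorway or window keeps fine detail while an empty wall is compressed aggressively.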

Camera movement and/or camera vibration can also degrade video surveillance quality. Camera movement, of course, may be necessary to ensure coverage of an entire surveillance sector, and environmental factors such as wind or passing vehicles may cause vibration. Either of these factors, however, can reduce compression quality and possibly result in dropped video frames, thereby degrading the quality of the surveillance system. In worst-case scenarios, these factors can cause system-processing overload.


Digital video stabilization techniques featuring several different types of algorithms can now overcome camera vibration, but each algorithm follows the same principle: certain parts of the image are compared to the previous image. The picture is offset by various vectors, and a search finds the point where the correlation between the images is highest. The offset vector is then applied to the entire image; the edges are slightly cropped, but most of the image remains stable.
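As an illustrative sketch of this offset search (not any vendor's implementation; function names and parameters are hypothetical), a central reference block from the previous frame can be compared against the current frame at every candidate shift, using the sum of absolute differences as a cheap correlation proxy:

```python
import numpy as np

def estimate_global_offset(prev, curr, max_shift=8, block=64):
    """Find the (dy, dx) shift that best aligns curr with prev by
    testing a central reference block at every candidate offset."""
    h, w = prev.shape
    cy, cx = h // 2, w // 2
    ref = prev[cy - block // 2:cy + block // 2,
               cx - block // 2:cx + block // 2]
    best, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cand = curr[cy - block // 2 + dy:cy + block // 2 + dy,
                        cx - block // 2 + dx:cx + block // 2 + dx]
            # Negative sum of absolute differences: higher = better match
            score = -np.abs(ref.astype(np.int32) - cand.astype(np.int32)).sum()
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def stabilize(prev, curr, max_shift=8):
    """Apply the inverse of the estimated offset to the whole frame.
    Edges wrap here for simplicity; a real system crops them instead."""
    dy, dx = estimate_global_offset(prev, curr, max_shift)
    return np.roll(curr, (-dy, -dx), axis=(0, 1))
```

An FPGA implementation would evaluate many of these candidate offsets in parallel rather than in nested loops, which is precisely where the hardware's multiplier and logic-element count pays off.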

The ability to provide a panoramic view is also a critical feature in video surveillance systems that incorporate swivel cameras. This feature minimizes the number of cameras required to cover a particular site and enables security personnel monitoring the system to view a wider area at a glance or focus on a particular area where a potential security breach has been detected. When the panorama algorithm is coupled with a swiveling camera, the system can track the movement of the video image. Rather than shifting the image back to center, the system expands it to a larger resolution. The new image is “stitched” onto the old with the overlapping parts updated. The same FPGA mechanism used for stabilization is used for panorama, with the stitching requiring a minimal computational addition.
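The stitching step above can be sketched in a few lines, assuming (for simplicity) nonnegative offsets and names that are purely illustrative: the panorama canvas grows as needed and the new frame is pasted over the overlapping region.

```python
import numpy as np

def stitch(panorama, frame, offset):
    """Paste a new frame into a growing panorama at (dy, dx).

    The canvas is expanded when the frame extends past the current
    edge; overlapping pixels are simply overwritten with the newer
    data, matching the 'overlapping parts updated' behavior.
    """
    dy, dx = offset
    h, w = frame.shape
    H = max(panorama.shape[0], dy + h)
    W = max(panorama.shape[1], dx + w)
    canvas = np.zeros((H, W), dtype=panorama.dtype)
    canvas[:panorama.shape[0], :panorama.shape[1]] = panorama
    canvas[dy:dy + h, dx:dx + w] = frame
    return canvas
```

The per-frame offset would come from the same correlation search used for stabilization, which is why the incremental cost of panorama on top of stabilization is small.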

Video motion detection, which is effective both indoors and outdoors regardless of time of day, can significantly enhance the capabilities of a digital video surveillance system. This feature uses a tracking algorithm that receives noisy detections from surveillance cameras and filters out insignificant motions caused by noise in the image, camera movements caused by environmental factors such as wind, and false images caused by clouds or moving branches. A wide variety of algorithms is available to implement this function, which facilitates tracking intruders. Additionally, combining such algorithms with traditional motion detection can minimize false identifications.

Motion detection algorithms range from the very simple high pass, or edge detection filters implemented using several hundred logic elements, to very complex algorithms that can overcome rain and wind interference and differentiate between people, small animals, cars and other objects as well. Advanced algorithms typically use motion tracking, similar to the motion estimation blocks used in MPEG compression. The motion of various parts of the image is tracked over time and if the movement appears consistent, an intruder is detected and tracked. This allows the system to ignore rain, dust, and light changes.
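A minimal sketch of the consistency idea (hypothetical names and thresholds, not a shipping algorithm): flag motion only where it persists across several consecutive frames, so one-off flicker from rain, dust, or lighting changes is ignored.

```python
import numpy as np

def motion_mask(prev, curr, threshold=25):
    """Per-pixel motion: absolute frame difference above a threshold."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return diff > threshold

def detect_intruder(frames, threshold=25, min_consistent=3):
    """Flag pixels whose motion persists across consecutive frame
    pairs; a streak counter resets wherever motion stops, so brief
    noise never accumulates enough to trigger a detection."""
    streak = np.zeros(frames[0].shape, dtype=np.int32)
    for prev, curr in zip(frames, frames[1:]):
        m = motion_mask(prev, curr, threshold)
        streak = np.where(m, streak + 1, 0)   # reset where motion stopped
    return streak >= min_consistent
```

Real trackers follow moving blobs across positions rather than fixed pixels, but the same principle applies: only motion that stays coherent over time is reported.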

Video archiving

Video archiving is a necessary feature in most modern digital video surveillance systems, enabling security personnel to document possible intrusions and maintain video used to identify intruders. Archiving can be done either locally, where the image is created, or remotely at a more secure location. In one configuration, IP cameras or video servers send compressed video to a back office, where a central recording unit collects the video streams and archives them. This accommodates inexpensive end units and easy video management, but requires a very reliable, high-bandwidth network to support all cameras transmitting at once. Another configuration records to a local hard disk in each unit, with the back office viewing only one camera at a time or accessing any of the archived video on any unit.

Implementation with DSP processors and FPGA coprocessors

A combination of DSPs and FPGA coprocessors delivers the very high performance and highly flexible signal processing required for digital video surveillance systems. Mango DSP, for example, has developed an architecture for video surveillance based on a low-cost Altera Cyclone II FPGA and a Texas Instruments DM642 DSP (see Figure 1).

Figure 1

DSP benefits include high clock rates (up to 1 GHz), C/C++ language-based development, built-in memory management, and built-in I/O interfaces. At the same time, DSPs execute only a limited number of instructions per clock, offer a limited number of multipliers, and have fixed word sizes and fixed I/O interfaces. In addition, most DSPs allow very limited inter-processor communication, often relying on low-speed buses such as PCI to connect to other DSPs.

FPGAs, on the other hand, can execute many operations per clock, offer one to two orders of magnitude more multipliers than DSPs, and support flexible word sizes. Altera’s Cyclone II family of FPGAs, for example, has up to 150 18 x 18 multiplier/accumulators per device, each capable of running at 250 MHz, as well as nearly 70,000 standard logic elements. FPGAs also allow access to advanced memory devices such as DDR, DDRII, Reduced Latency DRAM (RLDRAM), and QDR. Advanced FPGAs can be connected to other FPGAs or other devices, such as DSPs, via 1 Gbps high-speed LVDS and multi-gigabit SERDES buses. FPGA drawbacks include longer development time and clock rates that are about one-half of DSP peak processing clock rates.

DSPs and FPGAs clearly complement one another. While DSPs enable rapid development of new and complex algorithms, they can only run two to four calculations at a time. On the other hand, FPGAs can perform mathematical operations on an entire vector or matrix at one time. Furthermore, FPGAs are excellent for connecting multiple processing nodes together, distributing the data among DSPs, and collecting and recombining the sub-calculations into a single output stream.

In video surveillance applications, FPGAs can be used for video preprocessing functions such as video stabilization, filtration, and motion detection, as well as for video compression coprocessing. Table 1 illustrates typical FPGA usage for a single channel of video at 30 fps D1 resolution. In this example, the DSP would process a portion of the JPEG2000 implementation as well as the network interface.


Function                      Cyclone II logic elements     Percent of EP2C35 used
                              (mid-range FPGA size)         (33,216 logic elements)
Video stabilization           3,000                         ~9%
Motion detection              2,000                         ~6%
AES encryption                1,000                         ~3%
ATA/IDE hard disk interface   1,000                         ~3%
JPEG2000 coprocessing         25,000                        ~75%

Table 1

A winning combination for highest quality video imaging

Digital video surveillance is just one of many video imaging applications that increasingly require very high signal processing performance and memory bandwidth, as well as the ability to communicate among multiple processing units, to deliver the required level of resolution and live video viewing. While DSPs provide rapid development of the complex algorithms necessary to run these applications, FPGAs can much more quickly and effectively perform the actual mathematical calculations. Engineers designing digital video surveillance systems will therefore increasingly leverage the combined power of DSP processors and high-performance FPGAs to deliver the overall video imaging quality required.