Top DSP-FPGA advances in 2012
The year 2012 is winding down to its final few weeks, so it’s a good time to review the most significant advances in DSP and FPGA technology over the last year and gaze into the crystal ball to see what may lie ahead. While any such list is necessarily subjective, I am confident that 20 nm technology, 2.5D ICs, and FPGA SoCs will play a significant role in defining the areas that the industry will be focusing on in 2013 and beyond.
Gearing up for 20 nm
Even though pilot production of 20 nm devices will not begin until sometime in 2013, according to semiconductor foundry TSMC’s latest roadmap, FPGA suppliers Altera and Xilinx have already announced their plans for next-generation products to succeed the 28 nm devices they started producing in volume just this year. Many semiconductor companies are currently building 20 nm test chips from which they are beginning to predict performance, power, and density improvements for future products.
In TSMC’s Q3 earnings conference call, Chairman and CEO Morris Chang said that “from 28 nm to 20 nm, the performance gain is 15 percent to 20 percent at the same total power. And the power reduction is 20 percent to 25 percent at the same speed.” Xilinx is more bullish in its predictions, saying that its 20 nm 8-series FPGAs will double the performance and halve the power of its current 28 nm 7-series devices. Xilinx is also forecasting that logic density will increase by 1.5x to 2x at 20 nm. Applications that Xilinx sees as benefiting most from 20 nm FPGAs include 100G to 400G wired networks, LTE-Advanced wireless base stations, embedded vision applications, and next-generation system acceleration and connectivity. The company says that it has made architectural advances that will increase FPGA resource utilization beyond 90 percent, and that new algorithms will deliver 4x faster design closure during placement and routing. Without being specific about next-generation architectures, Xilinx has said it is planning 20 nm versions of its Zynq FPGA SoCs with heterogeneous processor cores. Users can expect advances in critical power management functions, higher system bandwidth between the processor and programmable fabric, and faster I/O, transceivers, and memory interfaces.
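To put the foundry and vendor figures on a common footing, the short Python sketch below converts each claim into a relative performance-per-watt ratio against the 28 nm baseline. It is only a back-of-the-envelope illustration: the function and variable names are my own, and the Xilinx figure assumes that the doubled performance and halved power apply at the same time, which the company's statement does not spell out.

```python
def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Relative performance per watt versus the 28 nm baseline (1.0 = no change)."""
    return perf_ratio / power_ratio


# TSMC process-only figures quoted above for the move from 28 nm to 20 nm.
# Case 1: 15 to 20 percent more performance at the same total power.
tsmc_same_power = [perf_per_watt_gain(1 + gain, 1.0) for gain in (0.15, 0.20)]
# Case 2: 20 to 25 percent less power at the same speed.
tsmc_same_speed = [perf_per_watt_gain(1.0, 1 - cut) for cut in (0.20, 0.25)]

# Xilinx claim: 2x performance and half the power.
# ASSUMPTION: both gains are realized simultaneously (not stated in the source).
xilinx_combined = perf_per_watt_gain(2.0, 0.5)

print("TSMC, same power:", [round(x, 2) for x in tsmc_same_power])
print("TSMC, same speed:", [round(x, 2) for x in tsmc_same_speed])
print("Xilinx, combined:", round(xilinx_combined, 2))
```

Under that assumption, the process shrink alone would account for well under half of the vendor's combined figure, which is consistent with Xilinx attributing part of the gain to architectural and tool improvements rather than to the process node by itself.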
Altera previewed its 20 nm innovations in September, predicting that its next-generation variable-precision DSP architecture will deliver greater than 5 TFLOPS of IEEE 754 floating-point performance. Higher-speed serial interfaces for 100G and 400G systems are also a focus for Altera, with 20 nm FPGAs integrating 40 Gbps transceivers for interfacing chip-to-chip or chip-to-optical modules. In power management, Altera says that the combination of process technology and circuit techniques, such as adaptive voltage scaling, will yield “up to 60 percent” lower power consumption compared to previous-generation devices. Altera will also migrate its ARM-based FPGA SoCs to 20 nm, predicting that processor subsystem performance will increase by 50 percent. To ease the hardware-software co-design needed to exploit the parallel computing capabilities of FPGA SoCs, Altera has also announced availability of the first SDK for the OpenCL parallel programming language.
The 2.5D stepping stone to 3D ICs
We have yet to see true 3D DSP-FPGAs, in which die would be stacked on top of one another and connected with Through-Silicon Vias (TSVs). However, “2.5D” ICs, with multiple heterogeneous die connected through a passive silicon interposer, have moved past the oddity stage to a point where they are now emerging as viable alternatives to monolithic integration. The Xilinx Virtex-7 2000T was the first device to exploit this technology, interconnecting multiple 28 nm FPGA slices mounted on a silicon interposer that is manufactured in a separate 65 nm process. Because of the high logic density of the 2000T (nearly 2 million logic cells), the devices have become especially popular for use in ASIC prototyping platforms.
Xilinx then introduced the Virtex-7 H580T in May, which mixes 28 nm FPGA die with 28 Gbps transceiver die in the same package. The benefits of the 2.5D IC solution, versus an integrated monolithic design, are that the respective manufacturing processes can be optimized independently, and jitter-sensitive transceivers can be physically separated from digital switching noise sources in the logic die. Building on 20 nm developments, Xilinx is planning to add 33 to 56 Gbps transceivers and wide I/O memories in their next-generation 2.5D ICs. Xilinx says that improvements in die-to-die interconnect will increase bandwidth by 5x, and that logic capacity will double with the new technology.
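The transceiver rates mentioned so far (28 Gbps today, 40 Gbps planned by Altera, and up to 56 Gbps planned by Xilinx) matter mainly because they set how many serial lanes a 100G or 400G link needs. The Python sketch below works out that lane count under a deliberately simple assumption: each lane carries its full line rate as payload, with no allowance for line encoding or forward error correction. The function name and the table layout are illustrative only.

```python
import math


def min_lanes(link_gbps: float, lane_gbps: float) -> int:
    """Smallest whole number of lanes whose combined raw rate covers the link.

    ASSUMPTION: no encoding or FEC overhead, so real designs may need more.
    """
    return math.ceil(link_gbps / lane_gbps)


LINK_RATES_GBPS = (100, 400)     # link speeds cited in the article
LANE_RATES_GBPS = (28, 40, 56)   # transceiver rates cited in the article

# Map (link rate, lane rate) to the minimum lane count.
lane_table = {
    (link, lane): min_lanes(link, lane)
    for link in LINK_RATES_GBPS
    for lane in LANE_RATES_GBPS
}

for (link, lane), count in sorted(lane_table.items()):
    print(f"{link}G link over {lane} Gbps lanes: at least {count} lanes")
```

Even with this rough model, the trend is clear: moving from 28 Gbps to 56 Gbps lanes roughly halves the number of transceivers, package pins, and board traces needed for a 400G interface.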
Altera has been working with TSMC on the development of their 2.5D IC process, which utilizes the foundry’s turnkey Chip on Wafer on Substrate (CoWoS) manufacturing flow. The companies announced development of a heterogeneous test vehicle in March, and Altera says that they will introduce 20 nm products using the technique. Altera’s approach is to offer customer-specific heterogeneous IC systems that can integrate FPGAs with user-customizable HardCopy ASICs, memory die, third-party ASICs, and optical interfaces.
Lines blurring for programmable DSP devices
Other than 3D ICs, the “next big thing” in the FPGA industry has been the introduction of the FPGA SoCs that have reached production over the last year. These devices blur the lines between FPGAs, ASICs, and ASSPs by integrating hardened ARM processor cores with programmable logic fabrics. Both Altera and Xilinx have based their devices on dual ARM Cortex-A9 cores, the same as those used in most current smartphone application processors. In October, Microsemi announced the next generation of its FPGA SoCs, the SmartFusion2 family, which combines flash-based FPGAs with a 166 MHz ARM Cortex-M3 processor. Microsemi also added more embedded memory and Multiply-Accumulate (MAC) blocks to support DSP functions.
Manufacturers of DSP ASSPs have, for several years, integrated ARM cores for control functions with DSP cores and specialized hardware accelerators to provide complete SoC solutions for dedicated applications such as wireless base stations and network processors. With the introduction of a new line of multi-purpose ARM-DSP hybrid devices by Texas Instruments in November, we are seeing new competition for the programmability and configurability of FPGA SoCs. TI’s new KeyStone processors combine ARM’s most powerful 32-bit core, the Cortex-A15, with the company’s C66x DSP cores and 1G/10G Ethernet connectivity. By removing the specialized hardware accelerators, and offering the processors in various configurations, from quad A15 plus eight-core DSP to single DSPs with single or quad A15s, TI is going after the markets for acceleration of cloud-based applications and purpose-built servers. TI is targeting many of the same applications as the FPGA manufacturers, including embedded vision, high-performance financial computing, and enterprise and industrial client-server networks. Where an FPGA SoC adds configurable DSP hardware, TI’s devices add software-programmable floating-point DSP to augment ARM cores. In both cases, manufacturers are drawn to the growing popularity of the ARM ecosystem.
Expanding DSP and FPGA opportunities
Although the latest advances differ greatly in their implementation, the end result is that engineers will see a greater number of options for incorporating programmable DSP in their systems in the coming years. The semiconductor foundries and IC design companies are being very aggressive with their early announcements of 20 nm devices, but the truth is that it will be at least two years before such devices come to market in any volume. The flexibility that heterogeneous 2.5D ICs can offer is intriguing, and they present a good alternative for many applications that won’t justify or don’t fit in a monolithic SoC. ARM’s influence on an increasing number of segments of the semiconductor industry is a significant development, and it will be interesting to see if the FPGA manufacturers offer more options and configurations of ARM cores to keep up with the flexibility that ASIC and ASSP manufacturers are providing.