Designing for the future of FPGAs beyond Moore's Law
FPGAs have maintained a symbiotic relationship with Moore’s Law, as the progressive shrinking of semiconductor device dimensions has enabled manufacturers to steadily increase the number of programmable logic gates and memory cells in their ICs with each new process node. At the same time, FPGAs have long served as one of the leading-edge drivers for advancing semiconductor technology by providing a complex fabric with a mix of functionality that challenges foundries’ capabilities to deliver working silicon for a broad range of applications.
That relationship is being tested, however, as the limits imposed by physics and economics will inevitably dictate an end to conventional transistor scaling. While FPGA vendors continue to leverage Moore’s Law, they are also developing alternatives that will allow them to meet designers’ demands for many years to come.
More than Moore with 3D ICs
If you can’t build more transistors into a single silicon die, one alternative is to build more die into a single package. While Multi-Chip Modules (MCMs) were introduced as early as the 1960s, those early techniques for achieving higher integration evolved from PCB and hybrid packaging techniques. The difference with today’s 2.5D and 3D ICs is that manufacturers now use IC fabrication processes to assemble and connect multiple die with the same density as monolithic solutions. As an example, in Xilinx’s 2.5D Virtex-7 2000T, multiple 28 nm FPGA die are interconnected through a passive silicon interposer, which is fabricated with the metallization layers of a 65 nm IC process.
An additional advantage of the 2.5D/3D die stacking processes is that ICs fabricated in different process technologies can be mixed in the same package. Shrinking transistors yields more density for logic, but analog circuits become more difficult to implement. At the Hot Chips Conference in August, Ephrem Wu, Senior Director of Advanced Communications at Xilinx, described how 2.5D technology enabled the integration of Virtex-7 HT FPGAs with 28 Gbps transceivers in the same package. The transceivers require higher performance, while low power is the objective for the logic. In the Xilinx XC7VH580T, the interposer provides multiple planes of interconnect, which Xilinx uses for a ground plane and signal routing. The ground plane provides shielding for each high-frequency signal to minimize crosstalk and other signal integrity issues.
In his tutorial during the Hot Chips session on die stacking, Wu described the potential to take 2.5D technology further: to the integration of photonics for 100 Gbps networking communications. A Vertical-Cavity Surface-Emitting Laser (VCSEL) and photo detector could be mounted on an interposer alongside an FPGA, with optical vias and lens arrays enabling both optical and electrical communications in the same package. While this technology has been described in conference papers, Wu said that standards and an optoelectronics supply chain must be developed in order to bring it to reality.
Getting ready for 20 nm
Altera also sees a shift coming in transceivers, toward integrated optoelectronics in the not-too-distant future. Altera Senior VP and CTO Misha Burich says that optical transceivers will be required for communications beyond 56 Gbps. Before that, however, Altera is looking to continued increases in performance and functional density at the next silicon process node, 20 nm. Burich says that 20 nm will enable FPGAs to integrate transceivers for 40 Gbps chip-to-chip communication. According to Altera, the next step in Moore’s Law will provide the foundation for building Common Electrical Interface (CEI) 56G-compliant transceivers to support connectivity for 400G optical networks and line cards.
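To put the lane rates quoted above in perspective, the following is a rough back-of-the-envelope sketch in Python of how many serial lanes an aggregate link would need at a given per-lane rate. The function name lanes_needed is hypothetical, and the calculation assumes the quoted rates are usable payload rates, ignoring encoding and forward-error-correction overhead, which real link standards must account for.

```python
import math


def lanes_needed(aggregate_gbps: float, lane_gbps: float) -> int:
    """Return the minimum number of serial lanes for an aggregate rate.

    Assumption: rates are treated as payload rates; line-coding and FEC
    overhead are ignored, so real designs may need a different lane count.
    """
    if lane_gbps <= 0:
        raise ValueError("lane_gbps must be positive")
    return math.ceil(aggregate_gbps / lane_gbps)


if __name__ == "__main__":
    # Compare the per-lane rates mentioned in the article for a 400G link.
    for lane_rate in (28, 40, 56):
        print(f"400G over {lane_rate}G lanes: {lanes_needed(400, lane_rate)} lanes")
```

Under these simplifying assumptions, doubling the per-lane rate from 28 Gbps to 56 Gbps roughly halves the number of transceivers, and therefore the package pins and board traces, that a 400G interface consumes.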
Burich sees 20 nm as an enabler of “silicon convergence,” both on-die and through the use of 3D packaging. He says that Altera has built 20 nm test chips with TSMC that demonstrate the ability to increase performance from 1 TFLOPS to 5 TFLOPS in next-generation DSP blocks. The company will continue to integrate ARM processor cores into FPGA SoCs, along with variable-precision DSPs and IP blocks to complement the programmable logic fabric.
Altera also has plans for heterogeneous 3D IC integration. Burich says that integrating DRAMs will be a natural complement to the embedded processing power of FPGAs, along with adding optical modules and enabling customer-specific integration with proprietary ASICs. In March, Altera announced development of a heterogeneous 3D FPGA IC prototype using TSMC’s Chip-on-Wafer-on-Substrate (CoWoS) integration process. TSMC is planning to offer a complete turnkey 3D IC integration flow, which would likely be limited to ASICs that are also produced in the company’s foundries.
With all of the potential for integration of heterogeneous cores with FPGAs, hardware-software co-design becomes an even greater issue. Altera has been a supporter of OpenCL, which Burich says will enable developers who are familiar with CPU-GPU platforms to migrate to C-level programming of FPGA SoCs. In August, the company began providing early access to its OpenCL for FPGAs program, which includes a training course along with collateral and technical demonstrations. The 20 nm design platform that Altera is developing in Quartus II will incorporate OpenCL, along with Qsys system-level network-on-chip technology for interconnecting synthesized blocks, and the company’s DSP Builder tools.
Time as a design dimension
The space-time continuum is a topic usually reserved for science fiction. However, Tabula, a relative newcomer to the FPGA industry, is seeking to take FPGAs in a different direction with its “Spacetime” 3D FPGAs. Rather than build additional physical dimensions into its programmable devices, Tabula’s architecture uses time as the third dimension, allowing the on-chip logic, memory, and interconnect resources to be dynamically reconfigured at clock rates of up to 2 GHz. Tabula CEO Dennis Segers, a former Xilinx executive, says that traditional FPGAs are 90 percent interconnect and are therefore imbalanced for data flow, which limits performance. Tabula claims that its technique combines ASIC functional density with the benefits of FPGA programmability and time-to-market. Tabula’s benchmarks show 3.7x the DSP performance, measured in million samples per second (Msps) per mm² of silicon area, 2x the memory density, and 2.5x the logic density of conventional FPGAs.
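The idea of using time as a dimension can be illustrated with a small conceptual sketch in Python. It is not Tabula’s actual architecture or tool flow: the fold schedule, the register names, and the run_user_cycle helper are all hypothetical. The sketch models one physical 2-input lookup table (LUT) that is loaded with a different truth table on each fast clock tick, so that five logic functions, together forming a 1-bit full adder, share a single piece of hardware within one user-visible cycle.

```python
# Truth tables for a 2-input LUT, indexed by (x << 1) | y.
AND = (0, 0, 0, 1)
OR = (0, 1, 1, 1)
XOR = (0, 1, 1, 0)

# Hypothetical schedule: one entry per fast-clock tick ("fold").
# Each fold is (truth table to load, first input, second input, output register).
FULL_ADDER_FOLDS = [
    (XOR, "a", "b", "t"),      # t    = a XOR b
    (XOR, "t", "cin", "sum"),  # sum  = t XOR cin
    (AND, "a", "b", "g"),      # g    = a AND b
    (AND, "t", "cin", "p"),    # p    = t AND cin
    (OR, "g", "p", "cout"),    # cout = g OR p
]


def run_user_cycle(folds, inputs):
    """Evaluate every fold in order on a single shared LUT.

    Intermediate results are held in registers between folds, which is
    what lets one physical LUT stand in for several logical ones.
    """
    regs = dict(inputs)
    for table, src_x, src_y, dst in folds:
        regs[dst] = table[(regs[src_x] << 1) | regs[src_y]]
    return regs


def full_adder(a, b, cin):
    """Return (sum, carry_out) computed by the time-multiplexed LUT."""
    regs = run_user_cycle(FULL_ADDER_FOLDS, {"a": a, "b": b, "cin": cin})
    return regs["sum"], regs["cout"]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                print(a, b, cin, "->", full_adder(a, b, cin))
```

In this toy model, the five folds would require the reconfiguration clock to run five times faster than the user clock. That trade of clock rate for silicon area is the general principle behind the density figures Tabula cites.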
Tabula is also taking advantage of Moore’s Law in their products. In February, the company announced an agreement with Intel for access to that company’s 22 nm Tri-Gate CMOS process. The Intel Custom Foundry typically provides limited access to external customers, but it has also taken on another FPGA startup as a partner, Achronix Semiconductor. Like incumbents Xilinx and Altera, Achronix is targeting high-performance 100G Ethernet communication applications with their Speedster 22i FPGA Platform.
Planning for FPGAs’ future
Innovation in programmable devices continues at a very rapid pace. In the next few years, we can expect manufacturers to take advantage of both design and manufacturing techniques to continue to advance the capabilities of FPGAs. The end of Moore’s Law may be on the horizon, but users of FPGAs can look forward to many more generations of increasing performance and functionality.