Programmable chips head to crunch point
Guest columnist Colin Dente, CEO of Akya, suggests that, given innate limits on processing speeds, the time is right to start considering alternatives to traditional ‘static’ architectures
Over the years the devices we each use have become ever faster, ever more sophisticated, and ever more able to offer us dazzling new features. All of this depends on semiconductors, and in particular the much-used DSP, being able to do more within a given space.
This has become such a truism that we barely consider it any more. iPhones, computers, graphics cards and other devices get faster for one simple reason: as process geometries shrink, semiconductors roughly double in transistor density, and hence in potential performance, every eighteen months or so.
However, there are serious problems on the horizon.
As transistor sizes continue to shrink, the ability of engineers to design chips using such high numbers of transistors is not increasing at anything like the same speed. This so-called ‘design gap’ is leaving designers far behind the capabilities of the hardware available to them.
Also, the march of semiconductor progress is about to slam up against some fundamental physical limitations of its medium.
Obviously electron-based ICs cannot, even in principle, have features smaller than the diameter of an atom (0.1–0.5nm). In reality, each would have to be constructed from basic nodes much larger than this (around 1nm) in order to work at all. At these smallest geometries there are also serious issues with current leakage.
Then there is the related issue of the failure of Dennard scaling. Dennard scaling, simply put, describes the observation that as transistors shrink, the power each one consumes falls in proportion to its area, so the power density of a chip stays roughly constant from one process node to the next. However, as we move to ever more advanced nodes, Dennard scaling breaks down as leakage effects come to dominate power consumption.
This means that the power density of chips increases – if you double the number of transistors in a given area, but don’t at least halve the power dissipation of each of those transistors, you get a net increase in power density. We are already seeing chips where the peak power density exceeds the capabilities of cooling technologies, leading to the phenomenon of ‘dark silicon’: areas of silicon that must remain unutilised at any given time in order to maintain acceptable thermal dissipation.
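The arithmetic behind that claim is easy to sketch. The figures below are invented purely for illustration, not drawn from any real process node:

```python
# Illustrative sketch of the power-density arithmetic described above.
# All numbers are hypothetical, not real process data.

def power_density(transistors_per_mm2, watts_per_transistor):
    """Power dissipated per mm^2 of silicon."""
    return transistors_per_mm2 * watts_per_transistor

# Old node: a notional density and per-transistor power.
old_density = power_density(1e6, 1e-7)    # 0.1 W/mm^2

# New node: transistor count doubles, but leakage means per-transistor
# power falls by only 30%, not the 50% Dennard scaling would require.
new_density = power_density(2e6, 0.7e-7)  # 0.14 W/mm^2

assert new_density > old_density  # net increase in power density
```

The moment per-transistor power stops halving with each doubling of density, every shrink pushes power density up, and cooling becomes the binding constraint.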
There are also economic problems in continuing to reduce transistor sizes. These include the increasing cost of mask sets and greater problems caused by physical flaws in silicon wafers as process geometries shrink, leading to lower yields. Many analysts now expect chip development at lower geometries to become exponentially more expensive, in terms of both development and yield, and therefore increasingly impractical for any devices other than those with the very largest production runs.
We’ve seen these issues coming. Occasionally an architectural improvement or blue-sky technological development comes along that keeps the wolf from the door.
In a sense the rise of multicore over the last few years has been a reaction to these scaling issues. Such parallel processing has proven immensely useful in DSPs. However, in keeping with Amdahl’s Law, only limited returns can be had from multicore solutions for most applications, and it is essentially the equivalent of throwing more chips at the problem. Using numerous discrete chips, whether conventionally packaged or stacked, introduces serious cost, power and heat-dissipation problems.
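Amdahl’s Law makes the limit precise: if only a fraction p of a workload can be parallelised, the overall speedup on n cores is 1 / ((1 − p) + p/n). A minimal sketch, with a hypothetical 90%-parallelisable workload:

```python
# Amdahl's Law: why multicore returns diminish.
# The parallel fraction used here is hypothetical, for illustration.

def amdahl_speedup(p, n):
    """Overall speedup with parallel fraction p on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 90% of the work parallelisable, 16 cores deliver well
# under a 16x speedup, and the ceiling is 1 / (1 - 0.9) = 10x,
# no matter how many cores are added.
print(amdahl_speedup(0.9, 16))    # 6.4
print(amdahl_speedup(0.9, 1024))  # just under 10
```

The serial fraction, however small, sets a hard ceiling on what adding cores can buy.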
Thus we’re reaching a plateau. We can go further with our existing technologies for a while, but physics and economics will very quickly make gains like those we’ve seen under Moore’s Law over the last forty years impractical. And the nearest ‘game-changing’ solutions we can see (photonics, quantum computing, etc.) may be decades away.
I’ve been promoting the notion that what we really need in this situation is a more general shift in architectural thinking.
Today, if you ask chip designers how to extract maximum ability from a chip, they will generally see how much they can force a fixed-architecture chip to do through brute computational force. This only exacerbates the problems outlined above.
This all points, in my opinion, towards a situation where the internal dynamics of chips need to be more flexible: performing different tasks with the efficiency of hard-wired architectures (thus avoiding the power cost of running excessive software on-chip), but with the appropriate degree of flexibility.
My own company, for example, has designed a response to this problem in the form of a dynamically reconfigurable logic (DRL) IP system known as ART2. With this architecture parts of a chip can redefine their function on a clock-cycle-by-clock-cycle basis, thus making incredibly efficient use of a given number of transistors.
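The internals of ART2 are proprietary, so the following is only a conceptual sketch of the general idea of per-cycle reconfiguration: a single datapath whose function is redefined on every cycle by a configuration word, so the same transistors play a different role each cycle.

```python
# Toy illustration of dynamically reconfigurable logic (DRL).
# This is a conceptual sketch only; it does not reflect the actual
# ART2 architecture. One 'fabric' is reconfigured every cycle.

OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "and": lambda a, b: a & b,
}

def run(program, a, b):
    """Each entry in 'program' is the configuration for one clock cycle."""
    results = []
    for config in program:
        # Same hardware, new function on this cycle.
        results.append(OPS[config](a, b))
    return results

print(run(["add", "mul", "and"], 6, 3))  # [9, 18, 2]
```

A hard-wired design would need three separate blocks to do this; a fully general processor would pay software overhead on every operation. The DRL idea sits between the two.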
By designing IP that is intended to be part of a broader architecture, we can enable designers to create a chip that is ‘just flexible enough’, avoiding the problems of conventional general-purpose programmable device design. Effectively, by tempering the amount of flexible logic used, you can combine the best of both worlds: the price point and speed of hard-wiring with the flexibility of DRL.
This kind of IP not only reduces heat, by allowing chips to run at lower clock speeds while delivering the same or greater computational ‘grunt’. It also mitigates the rising cost of chip production caused by lower yields.
The economics of this approach are appealing. Chips with inbuilt flexible elements can be taped out en masse and redefined into specific product types via firmware changes after manufacturing. This reduces NRE and inventory costs, and allows manufacturers to respond much more quickly to market requirements.
The DRL approach presents a superb stopgap for the next twenty years or so. Until the next ‘seismic shift’ in the way we physically create logic circuits, companies must innovate in chip architecture and move beyond the simple ‘throw more transistors at it’ mentality.