ARM upgrades CoreLink IP to 16 core/1Tbps
ARM’s lead licensees are using an upgrade of ARM’s CoreLink CCN-504 cache coherent network, which supports up to 16 Cortex-A15 cores and delivers up to 1Tbit/sec of usable system bandwidth.
General access is in H1 2013 and production systems are expected in 2014.
The IP is also designed for next-generation ARMv8 64-bit cores, which will be licensed later this year.
ARM has also produced a CoreLink DMC-520 dynamic memory controller designed and optimised to work with the CoreLink CCN-504.
The controller provides an interface to shared off-chip memory, such as DDR3, DDR3L and DDR4 DRAM. It is part of an integrated ARM DDR4 interface solution incorporating ARM Artisan® DDR4/3 PHY IP planned for introduction in 2013.
CoreLink CCN-504 is the first in a family of products. It provides a fully coherent, high-performance many-core solution that supports up to 16 cores on the same silicon die. It also provides system coherency in heterogeneous multicore and multi-cluster CPU/GPU systems by letting each processor in the system access the other processors' caches.
This reduces the need to access off-chip memory, saving time and energy, which is a key enabler in systems based on ARM big.LITTLE processing.
“As the amount of data used increases exponentially over the next 10-15 years, the CoreLink CCN-504 and DMC-520 will play an important role by providing high-performance system IP solutions for many-core applications,” said Tom Cronk, deputy general manager, processor division, ARM. “This ensures quality of service and coherent operation across the system, and enables SoC designers to efficiently prioritize and handle wide data flows with optimum latency.”
The CoreLink CCN-504 is the first in a family of network-based interconnect products planned by ARM.
Building on the AMBA 4 ACE spec, the CoreLink CCN-504 offers better energy efficiency and lower latency than software-managed coherency. Over 8000 AMBA 4 ACE specifications have been downloaded to date.
The CoreLink CCN-504 cache coherent network includes integrated level 3 (L3) cache and snoop filter functions. The L3 cache, which is configurable up to 16MB, extends on-chip caching for demanding workloads and offers low latency on-chip memory for allocation and sharing of data between processors, high-speed IO interfaces and accelerators. The snoop filter removes the need for broadcast coherency messaging, further reducing latency and power.
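To illustrate why a snoop filter cuts coherency traffic, here is a minimal Python sketch of the general technique. It is a toy model, not ARM's implementation: the class name `SnoopFilter`, its methods and the helper `broadcast_targets` are all hypothetical, and the article does not describe how the CCN-504 tracks sharers internally. The idea is that the filter records which caches may hold a given line, so a request is forwarded only to those caches instead of to every cache in the system.

```python
from collections import defaultdict


class SnoopFilter:
    """Toy directory that tracks which caches may hold each cache line.

    Hypothetical illustration only; not the CCN-504 design.
    """

    def __init__(self, num_caches):
        self.num_caches = num_caches
        # line address -> set of cache ids that may hold a copy
        self.sharers = defaultdict(set)

    def record_fill(self, cache_id, addr):
        """Note that cache `cache_id` has loaded the line at `addr`."""
        self.sharers[addr].add(cache_id)

    def record_evict(self, cache_id, addr):
        """Note that cache `cache_id` no longer holds the line at `addr`."""
        self.sharers[addr].discard(cache_id)
        if not self.sharers[addr]:
            del self.sharers[addr]

    def snoop_targets(self, requester, addr):
        """Return the caches that must be snooped for this request."""
        return sorted(self.sharers.get(addr, set()) - {requester})


def broadcast_targets(num_caches, requester):
    """Without a filter, every other cache is snooped on every request."""
    return [c for c in range(num_caches) if c != requester]


if __name__ == "__main__":
    sf = SnoopFilter(num_caches=4)
    sf.record_fill(1, 0x80)              # only cache 1 holds line 0x80

    print(broadcast_targets(4, 0))       # snoops sent without a filter
    print(sf.snoop_targets(0, 0x80))     # snoops sent with the filter
    print(sf.snoop_targets(0, 0x100))    # line held nowhere: no snoops
```

In this model a request for a line held by one other cache generates a single snoop rather than three, and a request for a line held nowhere generates none, which is the source of the latency and power savings described above.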