Enabling Machine Intelligence at High Performance & Low Power
Achieving high performance at ultra-low power is no ordinary challenge. Luckily, Primetic 2 is no ordinary processor.
Module Architecture and Development Platform
Deep Learning on the Primetic Platform
Current mobile processing architectures aren’t well-suited to computer vision and deep neural network workloads, given the fundamental challenges in balancing performance, heat dissipation, and power. In addition, Moore’s law is slowing, shrinking the power and performance gains from each transition to the next process technology node. We believe that the next decade will mark the start of a new era of special-purpose processors designed above all to drive down the energy consumed per operation.
Primetic 2 builds on a number of improvements, including an increase in the number of programmable vector processors and additional dedicated hardware accelerators. Delivered as a System-on-Module (SoM), it pairs its Image Processing Cores (IPCs) with a software-controlled, multi-core, multi-ported memory subsystem and caches that can be configured to suit a large range of workloads. Primetic 2 provides exceptionally high sustained on-chip data and instruction bandwidth to feed the twelve IPCs, two RISC processors, and the high-performance video hardware accelerators.
Deploying Deep Learning at the network edge, close to the sensors where data processing latency is lowest, still demands performance and precision at very low power. The Primetic platform has a number of key elements suited to Deep Learning, and to convolutional neural network implementations in particular.
- On-chip RAM : deep networks create large volumes of intermediate data. Keeping all of it on chip lets our customers vastly reduce the off-chip memory bandwidth that would otherwise create performance bottlenecks.
- Flexible precision : native support for mixed precision underpins Primetic’s ability to run Deep Learning networks with industry-leading performance at best-in-class power efficiency. Both 16-bit and 32-bit floating-point datatypes are supported, as well as u8 and unorm8 types. Additionally, existing hardware accelerators are easily repurposed, providing the flexibility needed to achieve high performance on convolution computation.
- High-performance libraries : the development kit includes dedicated software libraries, designed hand-in-hand with the architecture, that sustain high performance on matrix multiplication and multidimensional convolution.
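The Primetic libraries themselves are proprietary, but the core operation they accelerate, and the role of mixed precision in it, can be sketched in plain Python. The names `conv2d_valid` and `to_fp16` below are illustrative, not Primetic APIs; the `cast` hook simply emulates accumulating in 16-bit floating point instead of full precision.

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a float through IEEE 754 half precision (struct code 'e')."""
    return struct.unpack("e", struct.pack("e", x))[0]

def conv2d_valid(image, kernel, cast=lambda v: v):
    """Naive 'valid' 2D convolution (cross-correlation, as in most DL
    frameworks). `cast` models the precision of the accumulator."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc = cast(acc + image[i + di][j + dj] * kernel[di][dj])
            row.append(acc)
        out.append(row)
    return out

image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0],
          [0.0, -1.0]]

full = conv2d_valid(image, kernel)            # full-precision accumulation
half = conv2d_valid(image, kernel, to_fp16)   # emulated fp16 accumulation
```

On small integer-valued inputs like these, the fp16 and full-precision results agree exactly; with larger activations, the fp16 path trades a little accuracy for the bandwidth and power savings of halving every operand. This is the trade-off the platform’s mixed-precision datatypes expose.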
Device makers look to platforms that offer the performance, power, and cost needed to run cutting-edge network topologies at the network edge. The Primetic platform delivers the right architectural and software elements to usher in a new era of deep learning in new, groundbreaking devices.