Platform 2012, a Many-Core Computing Accelerator for Embedded SoCs: Performance Evaluation of Visual Analytics Applications

Diego Melpignano, Luca Benini, Eric Flamand, Bruno Jego, Thierry Lepley, Germain Haugou, Fabien Clermidy, Denis Dutoit
STMicroelectronics – AST, Grenoble, France
49th Annual Design Automation Conference (DAC ’12), 2012


   author={Melpignano, Diego and Benini, Luca and Flamand, Eric and Jego, Bruno and Lepley, Thierry and Haugou, Germain and Clermidy, Fabien and Dutoit, Denis},
   title={Platform 2012, a many-core computing accelerator for embedded SoCs: performance evaluation of visual analytics applications},
   booktitle={Proceedings of the 49th Annual Design Automation Conference},
   series={DAC ’12},
   location={San Francisco, California},
   address={New York, NY, USA},
   keywords={3D stacking, SoC, computer vision, feature extraction, low-power, many-core, process aware}





P2012 is an area- and power-efficient many-core computing accelerator based on multiple globally asynchronous, locally synchronous processor clusters. Each cluster features up to 16 processors with independent instruction streams that share a multi-banked L1 data memory with one-cycle access, a multi-channel DMA engine, and specialized hardware for synchronization and aggressive power management. P2012 is 3D-stacking ready and can be customized to achieve extreme area and energy efficiency by adding domain-specific HW IPs to the cluster. The first P2012 SoC prototype in 28 nm CMOS will sample in Q3, featuring four 16-processor clusters and a 1 MB L2 memory, and delivering 80 GOPS (with 32-bit single-precision floating-point support) in 18 mm² with 2 W worst-case power consumption. P2012 can run standard OpenCL components as well as proprietary Native Programming Model SW components, the latter to achieve the highest level of control over application-to-resource mapping. A dedicated version of the OpenCV vision library is provided in the P2012 SW Development Kit to enable visual analytics acceleration. This paper discusses preliminary performance measurements of common feature extraction and tracking algorithms, parallelized on P2012, versus sequential execution on ARM CPUs.
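
The headline figures quoted in the abstract can be combined into rough efficiency numbers. The sketch below is only illustrative arithmetic on those quoted values: the constant and function names are made up for this example, and dividing the aggregate throughput evenly across all processors is an assumption, since the abstract does not break the 80 GOPS down per core.

```python
# Figures quoted in the abstract for the first P2012 SoC prototype.
THROUGHPUT_GOPS = 80.0        # quoted aggregate throughput
WORST_CASE_POWER_W = 2.0      # quoted worst-case power consumption
AREA_MM2 = 18.0               # quoted silicon area
CLUSTERS = 4                  # clusters in the prototype
PROCESSORS_PER_CLUSTER = 16   # processors per cluster in the prototype


def derived_figures():
    """Return efficiency figures derived from the quoted numbers.

    The per-processor value assumes the throughput is spread evenly
    over all processors, which the abstract does not state.
    """
    processors = CLUSTERS * PROCESSORS_PER_CLUSTER
    return {
        "processors": processors,
        "gops_per_watt": THROUGHPUT_GOPS / WORST_CASE_POWER_W,
        "gops_per_mm2": THROUGHPUT_GOPS / AREA_MM2,
        "gops_per_processor": THROUGHPUT_GOPS / processors,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

With the quoted values this gives 64 processors, 40 GOPS/W, about 4.4 GOPS/mm², and 1.25 GOPS per processor. Because the 2 W figure is a worst case, the energy-efficiency number is best read as a conservative estimate at the quoted throughput.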