To enable the design of large caches, novel memory technologies (such as non-volatile memory) and novel fabrication approaches (e.g., 3D stacking) have been explored. The existing modeling tools, however, cover only a few memory technologies, CMOS technology nodes and fabrication approaches. We present DESTINY, a tool for modeling 3D (and 2D) cache designs using SRAM, […]
February 22, 2015 by sparsh0mittal
Recent trends of CMOS technology scaling and widespread use of multicore processors have dramatically increased the power consumption of main memory. It has been estimated that modern data centers spend more than 30% of their total power consumption in main memory alone. This excessive power dissipation has created the problem of the “memory power wall”, which has […]
February 9, 2015 by sparsh0mittal
With the growing use of the internet and exponential growth in the amount of data to be stored and processed (known as ‘big data’), the size of data centers has greatly increased. This, however, has resulted in a significant increase in the power consumption of the data centers. For this reason, managing power consumption of data centers has become […]
February 8, 2015 by sparsh0mittal
Modern processors are using increasingly large on-chip caches. Also, with each CMOS technology generation, there has been a significant increase in their leakage energy consumption. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers […]
January 19, 2015 by sparsh0mittal
There has recently been much interest in stream processing, both in industry (e.g., Cell, NVIDIA G80, ATI R580) and academia (e.g., Stanford Merrimac, MIT RAW), with stream programs becoming increasingly popular for both media and more general-purpose computing. Although a special style of programming called stream programming is needed to target these stream architectures, huge […]
January 26, 2011 by hgpu
This paper presents the many-core architecture, with hundreds to thousands of small cores, to deliver unprecedented compute performance in an affordable power envelope. We discuss fine-grain power management, memory bandwidth, on-die networks, and system resiliency for the many-core system.
January 7, 2011 by hgpu
Graphics processors (GPUs) offer the promise of more than an order of magnitude speedup over conventional processors for certain non-graphics computations. Because the GPU is often presented as a C-like abstraction (e.g., Nvidia’s CUDA), little is known about the characteristics of the GPU’s architecture beyond what the manufacturer has documented. This work develops a microbenchmark […]
December 21, 2010 by hgpu
As the line between GPUs and CPUs begins to blur, it’s important to understand what makes GPUs tick.
December 20, 2010 by hgpu
We present and evaluate the TILA-rin GPU microarchitecture for embedded systems using the ATTILA GPU simulation framework. We use a trace from an execution of the Unreal Tournament 2004 PC game to evaluate and compare the performance of the proposed embedded GPU against a baseline GPU architecture for the PC. We evaluate the different […]
December 18, 2010 by hgpu
The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full hardware potential if all features on all levels are taken into account in a holistic approach.
December 14, 2010 by hgpu
Computer systems are undergoing significant change: to improve performance and efficiency, architects are exposing more microarchitectural details directly to programmers. Software that exploits specialized accelerators, such as GPUs, and specialized processor features, such as software-controlled memory, exposes limitations in existing compiler and OS infrastructure. In this paper we propose a pragmatic approach, motivated by our […]
December 13, 2010 by hgpu
Accelerators are special purpose processors designed to speed up compute-intensive sections of applications. Two extreme endpoints in the spectrum of possible accelerators are FPGAs and GPUs, which can often achieve better performance than CPUs on certain workloads. FPGAs are highly customizable, while GPUs provide massive parallel execution resources and high memory bandwidth. Applications typically exhibit […]
December 13, 2010 by hgpu