
Performance Analysis and Benchmarking of the Intel SCC

Philipp Gschwandtner, Thomas Fahringer, Radu Prodan
University of Innsbruck, Technikerstrasse 21a, Innsbruck, Austria
IEEE International Conference on Cluster Computing (CLUSTER), 2011

@inproceedings{gschwandtner2011performance,
   title={Performance Analysis and Benchmarking of the Intel SCC},
   author={Gschwandtner, P. and Fahringer, T. and Prodan, R.},
   booktitle={Cluster Computing (CLUSTER), 2011 IEEE International Conference on},
   pages={139–149},
   year={2011},
   organization={IEEE}
}


There has been a continuous change over the past years in CPU design and development towards both power-aware hardware architectures and many-core processors. The Intel Single-chip Cloud Computer (SCC) combines these two trends. It is an experimental prototype created by Intel Labs consisting of 48 Pentium cores. The SCC is a highly configurable many-core chip that provides unique opportunities to optimize run time, communication and memory access as well as power and energy consumption of parallel programs. The aim of this paper is to analyze and characterize, through benchmarking, the performance behavior of the chip under various power settings, mappings of processes to cores and memory controllers, and different techniques for data exchange between cores. The results are verified and interpreted by the use of analytical models as well as benchmarking kernels and a scientific application. Conclusions drawn from the results of our benchmarks confirm our architecture-derived hypothesis that data exchange based on shared memory is slower than data exchange using a message passing scheme. Furthermore, contrary to popular belief, the lowest energy consumption is not achieved for the fastest execution time but rather for a medium frequency/voltage setting, depending on the program being executed. Moreover, in order to improve the memory access behavior, it is more beneficial to increase the clock frequency of both the mesh network and the memory controllers than to increase the clock of only one of the two entities. In general, the results of our investigations can be used to analyze the effect of power settings and architecture properties on the performance and energy consumption of parallel programs, and to assist in choosing appropriate settings for specific workloads. Hence, our findings serve as guidance for developers on how to effectively use the architectural characteristics of the SCC.
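The claim that the lowest energy consumption can occur at a medium frequency/voltage setting can be illustrated with a small toy model. The sketch below is not taken from the paper. It assumes a textbook power model (a constant static part plus a dynamic part proportional to V² · f) and a run time split into a part that scales with the core clock and a part that does not. All operating points, coefficients and function names are hypothetical and chosen only for illustration. They are not SCC measurements.

```python
# Toy energy model: energy = power * run time.
# Every number below is an illustrative assumption, not an SCC measurement.

# Hypothetical operating points: (core frequency in MHz, supply voltage in V).
OPERATING_POINTS = [(100, 0.7), (200, 0.7), (400, 0.8), (533, 0.9), (800, 1.1)]

STATIC_POWER_W = 20.0        # assumed frequency-independent power draw
DYN_COEFF = 0.1              # assumed dynamic coefficient in W / (V^2 * MHz)
COMPUTE_MCYCLES = 40_000.0   # assumed core work, in millions of cycles
MEMORY_TIME_S = 20.0         # assumed time that does not scale with core clock


def run_time_s(freq_mhz):
    """Run time: a fixed memory-bound part plus a part that scales as 1/f."""
    return MEMORY_TIME_S + COMPUTE_MCYCLES / freq_mhz


def power_w(freq_mhz, volt):
    """Power: static part plus a dynamic part proportional to V^2 * f."""
    return STATIC_POWER_W + DYN_COEFF * volt ** 2 * freq_mhz


def energy_j(freq_mhz, volt):
    """Energy consumed over the whole run at one operating point."""
    return power_w(freq_mhz, volt) * run_time_s(freq_mhz)


if __name__ == "__main__":
    for freq, volt in OPERATING_POINTS:
        print(f"{freq:4d} MHz @ {volt:.1f} V: "
              f"{run_time_s(freq):7.1f} s, {energy_j(freq, volt):9.1f} J")
    best = min(OPERATING_POINTS, key=lambda p: energy_j(*p))
    print("lowest-energy setting in this toy model:", best)
```

Under these assumed parameters, the fastest setting is the highest frequency, while the energy minimum falls at an intermediate point: running slower pays the static power for longer, and running faster pays a higher voltage for a run-time gain that is limited by the non-scaling part. Where the minimum lies depends entirely on the assumed coefficients, which is consistent with the abstract's remark that the best setting depends on the program being executed.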