cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications

Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, Brian Vinter
University of Copenhagen
arXiv:1210.7774 [cs.PL] (26 Oct 2012)

@article{2012arXiv1210.7774R,

   author={{Ruben Burgdorff Kristensen}, M. and {Frimann Lund}, S.~A. and {Blum}, T. and {Vinter}, B.},

   title={{cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications}},

   journal={ArXiv e-prints},

   archivePrefix={arXiv},

   eprint={1210.7774},

   primaryClass={cs.PL},

   keywords={Computer Science – Programming Languages, Computer Science – Distributed, Parallel, and Cluster Computing},

   year={2012},

   month={oct},

   adsurl={http://adsabs.harvard.edu/abs/2012arXiv1210.7774R},

   adsnote={Provided by the SAO/NASA Astrophysics Data System}

}

Modern processor architectures, in addition to having ever more cores, also demand ever more attention to memory layout in order to run at full capacity. The usefulness of most languages is diminishing because their abstractions, structures, or objects are hard to map efficiently onto modern processor architectures. This paper introduces a new abstract machine framework, cphVB, that enables vector-oriented high-level programming languages to map efficiently onto a broad range of architectures. The idea is to close the gap between high-level languages and hardware-optimized low-level implementations. By translating high-level vector operations into an intermediate vector bytecode, cphVB enables specialized vector engines to execute the vector operations efficiently. The primary success criteria are to maintain a complete abstraction from low-level details and to provide efficient code execution across different modern processors. We evaluate the presented design through a setup that targets multi-core CPU architectures, measuring performance with Python implementations of well-known algorithms: a Jacobi solver, a kNN search, a shallow water simulation, and a synthetic stencil simulation. All demonstrate good performance.
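
The translation step described in the abstract, where high-level vector operations become an intermediate bytecode that a separate engine executes, can be pictured with a toy sketch. Everything below is an illustrative assumption rather than cphVB's actual design: the opcode names, the `Instruction` layout, and the `Program`, `LazyVector`, and `execute` helpers are all hypothetical, and the "engine" is a plain Python loop standing in for an optimized backend.

```python
# Toy illustration only: opcode names, the Instruction layout, and the
# Program/LazyVector/execute helpers are hypothetical, not cphVB's design.
from dataclasses import dataclass
import operator


@dataclass(frozen=True)
class Instruction:
    opcode: str  # hypothetical opcode name, e.g. "ADD" or "MUL"
    out: int     # register id that receives the result
    lhs: int     # register id of the left operand
    rhs: int     # register id of the right operand


class Program:
    """Collects recorded instructions and the input data bound to registers."""

    def __init__(self):
        self.instructions = []
        self.inputs = {}  # register id -> list of floats
        self._next_reg = 0

    def new_register(self):
        reg = self._next_reg
        self._next_reg += 1
        return reg


class LazyVector:
    """Front-end handle: operators record bytecode instead of computing."""

    def __init__(self, program, reg):
        self.program = program
        self.reg = reg

    @classmethod
    def from_list(cls, program, values):
        reg = program.new_register()
        program.inputs[reg] = list(values)
        return cls(program, reg)

    def _emit(self, opcode, other):
        out = self.program.new_register()
        self.program.instructions.append(
            Instruction(opcode, out, self.reg, other.reg)
        )
        return LazyVector(self.program, out)

    def __add__(self, other):
        return self._emit("ADD", other)

    def __mul__(self, other):
        return self._emit("MUL", other)


OPS = {"ADD": operator.add, "MUL": operator.mul}


def execute(program, result):
    """A naive stand-in for a vector engine: runs each instruction element-wise."""
    regs = dict(program.inputs)
    for ins in program.instructions:
        fn = OPS[ins.opcode]
        regs[ins.out] = [fn(a, b) for a, b in zip(regs[ins.lhs], regs[ins.rhs])]
    return regs[result.reg]


prog = Program()
a = LazyVector.from_list(prog, [1.0, 2.0, 3.0])
b = LazyVector.from_list(prog, [10.0, 20.0, 30.0])
c = (a + b) * a  # records two instructions; nothing is computed yet
values = execute(prog, c)
```

The point of the separation is that the front end only records what should happen, so a different `execute` (for example, one that fuses or parallelizes instructions) could be swapped in without changing the user-level code.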