Exploiting Task-Parallelism on GPU Clusters via OmpSs and rCUDA Virtualization

Adrián Castelló, Rafael Mayo, Judit Planas, Enrique S. Quintana-Ortí
Depto. de Ingeniería y Ciencia de los Computadores, Universidad Jaume I, 12071 Castellón, Spain
1st IEEE Int. Workshop on Reengineering for Parallelism in Heterogeneous Parallel Platforms (RePara), 2015


@inproceedings{castello2015exploiting,
   title={Exploiting Task-Parallelism on GPU Clusters via OmpSs and rCUDA Virtualization},
   author={Castell{\'o}, Adri{\'a}n and Mayo, Rafael and Planas, Judit and Quintana-Ort{\'\i}, Enrique S.},
   booktitle={1st IEEE Int. Workshop on Reengineering for Parallelism in Heterogeneous Parallel Platforms (RePara)},
   year={2015}
}



OmpSs is a task-parallel programming model consisting of a reduced collection of OpenMP-like directives, a front-end compiler, and a runtime system. This directive-based programming interface helps developers accelerate the execution of their applications, e.g. on a cluster equipped with graphics processing units (GPUs), with low programming effort. On the other hand, the virtualization package rCUDA provides seamless and transparent remote access to any CUDA GPU in a cluster, via the CUDA Driver and Runtime programming interfaces. In this paper we investigate the hurdles and practical advantages of combining these two technologies. Our experimental study targets two cluster configurations: a system where all the GPUs are located in a single cluster node, and a cluster with the GPUs distributed among the nodes. Two applications, an N-body particle simulation and the Cholesky factorization of a dense matrix, are employed to expose the bottlenecks and performance of a remote virtualization solution applied to these two OmpSs task-parallel codes.
