Scaling CUDA for Distributed Heterogeneous Processors

Siu Kwan Lam
San Jose State University
San Jose State University, 2012


   title={Scaling CUDA for Distributed Heterogeneous Processors},

   author={Lam, S.K.},



Download Download (PDF)   View View   Source Source   Source codes Source codes




The mainstream acceptance of heterogeneous computing and cloud computing is prompting a future of distributed heterogeneous systems. With current software development tools, programming such complex systems is difficult and requires an extensive knowledge of network and processor architectures. Providing an abstraction of the underlying network, message-passing interface (MPI) has been the standard tool for developing distributed applications in the high performance community. The problem of MPI lies with its message-passing model, which is less expressive than the shared-memory model. Development of heterogeneous programming tools, such as OpenCL, has only begun recently. This thesis presents Phalanx, a framework that extends the virtual architecture of CUDA for distributed heterogeneous systems. Using MPI, Phalanx transparently handles intercommunication among distributed nodes. By using the shared-memory model, Phalanx simplifies the development of distributed applications without sacrificing the advantages of MPI. In one of the case studies, Phalanx achieves 28x speedup compared with serial execution on a Core-i7 processor.
No votes yet.
Please wait...

* * *

* * *

* * *

HGPU group © 2010-2022 hgpu.org

All rights belong to the respective authors

Contact us: