Hartwig Anzt, Piotr Luszczek, Jack Dongarra, Vincent Heuveline
Kristian Bredies, Martin Holler
Jeff Pool, Anselmo Lastra, Montek Singh
Chidiebere Okwudire, Martin Palatnik, Xu Zhang, Tanya Kudchadker
Takayuki Aoki, Satoi Ogawa, Akinori Yamanaka
Davor Davidovic, Enrique S. Quintana-Orti
Mario Mendez-Lojo, Martin Burtscher, Keshav Pingali
Yifeng Chen, Xiang Cui, Hong Mei
Tags: Code generation, Computer science, CUDA, FFT, GPU cluster, Heterogeneous systems, MPI, nVidia, Optimization, Performance, Programming techniques, Pthreads, Tesla C1060