high performance computing on graphics processing units: hgpu.org

Programming

An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures

Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda

Tags: Benchmarking, Caffe, Computer science, CUBLAS, CUDA, Deep learning, Intel Xeon Phi, Machine learning, nVidia, Tesla K40, Tesla K80, Tesla P100

December 24, 2017 by hgpu

GAMER-2: a GPU-accelerated adaptive mesh refinement code — accuracy, performance, and scalability

Hsi-Yu Schive, John A. ZuHone, Nathan J. Goldbaum, Matthew J. Turk, Massimo Gaspari, Chin-Yu Cheng

Source codes

Tags: ARM, Astrophysics, Chemistry, CUDA, Instrumentation and Methods for Astrophysics, Magnetohydrodynamics, MPI, nVidia, OpenMP, Package, Tesla K20, Tesla P100

December 24, 2017 by hgpu

Molecular dynamics recipes for genome research

Tommaso Biagini, Giovanni Chillemi, Gianluigi Mazzoccoli, Alessandro Grottesi, Caterina Fusilli, Daniele Capocefalo, Stefano Castellana, Angelo Luigi Vescovi, Tommaso Mazza

Tags: Benchmarking, Bioinformatics, Biology, Chemistry, CUDA, Genomics, Molecular dynamics, nVidia, OpenCL, Physics, Tesla C2070

December 19, 2017 by hgpu

Accelerated Sparse Matrix Operations in Nonlinear Least Squares Solvers

Lukas Polok

Tags: Algorithms, Computer science, Differential equations, Factorization, FEM, Finite element method, Linear Algebra, nVidia, nVidia GeForce GTX 680, OpenCL, Partial differential equations, PDEs, Sparse matrix, Tesla K40, Thesis

December 19, 2017 by hgpu

OpenCL-accelerated Point Feature Histogram and Its Application in Railway Track Point Cloud Data Processing

Dongxu Lv, Peijun Wang, Wentao Li, Peng Chen

Tags: Algorithms, AMD Radeon R7 260, ATI, Computer science, OpenCL

December 19, 2017 by hgpu

Improving 3D Lattice Boltzmann Method stencil with asynchronous transfers on many-core processors

Minh Quan Ho, Christian Obrecht, Bernard Tourancheau, Benoit Dupont de Dinechin, Julien Hascoet

Tags: cfd, Fluid dynamics, Lattice Boltzmann model, OpenCL, Stencil computation

December 19, 2017 by hgpu

Effective Extensible Programming: Unleashing Julia on GPUs

Tim Besard, Christophe Foket, Bjorn De Sutter

Source codes

Tags: Code generation, Computer science, CUDA, High-level Languages, LLVM, nVidia, nVidia GeForce GTX 1080, Package, Programming Languages

December 15, 2017 by hgpu

Intra-node Memory Safe GPU Co-Scheduling

Carlos Reano, Federico Silla, Dimitrios S. Nikolopoulos, Blesson Varghese

Source codes

Tags: Computer science, CUDA, nVidia, Package, Task scheduling, Tesla K20

December 15, 2017 by hgpu

Task Scheduling for Heterogeneous Multicore Systems

Zhuo Chen, Diana Marculescu

Tags: Computer science, Heterogeneous systems, nVidia, nVidia GeForce GTX 760 Ti, OpenCL, Task scheduling

December 15, 2017 by hgpu

Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers

Azzam Haidar, Panruo Wu, Stanimire Tomov, Jack Dongarra

Tags: Algorithms, Artificial intelligence, BLAS, Computer science, Linear Algebra, Mixed precision, Neural networks, nVidia, Tesla P100

December 10, 2017 by hgpu

Acceleration of Cellular Automata through Parallel Computing with OpenCL

Maelso Bruno Pacheco Nunes Pereira, Christian Azambuja Pagot, Josue da Silva Gomes Junior, Jorge Gabriel Gomes de Souza Ramos, Tiago P. Nascimento, Alisson V. Brito

Tags: Cellular automata, Computer science, nVidia, nVidia Quadro 600, OpenCL

December 10, 2017 by hgpu