high performance computing on graphics processing units: hgpu.org

Tag: nVidia GeForce GTX 590

Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

Leiming Yu, Fanny Nina-Paravecino, David Kaeli, Qianqian Fang

Tags: Algorithms, AMD R9 Nano, AMD Radeon RX 480, ATI, Computational Physics, Heterogeneous systems, nVidia, nVidia GeForce GTX 1050 Ti, nVidia GeForce GTX 1080 Ti, nVidia GeForce GTX 590, nVidia GeForce GTX 980 Ti, nVidia GeForce GTX Titan X, OpenCL, Package, Physics

November 12, 2017 by hgpu

Multi-Tasking Scheduling for Heterogeneous Systems

Yuan Wen

Tags: ATI, ATI Radeon HD 7970, Computer science, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GTX 590, nVidia GeForce GTX 780, OpenCL, Task scheduling, Thesis

September 7, 2017 by hgpu

GPU Array Access Auto-Tuning

Nicolas Weber

Tags: Computer science, CUDA, nVidia, nVidia GeForce GT 440, nVidia GeForce GT 620, nVidia GeForce GT 730, nVidia GeForce GTX 1080, nVidia GeForce GTX 480, nVidia GeForce GTX 560 Ti, nVidia GeForce GTX 570, nVidia GeForce GTX 590, nVidia GeForce GTX 680, nVidia GeForce GTX 780, nVidia GeForce GTX 980, nVidia GeForce GTX Titan X, Performance, performance portability, Tesla C2070, Tesla K20, Thesis

August 8, 2017 by hgpu

MapSQ: A MapReduce-based Framework for SPARQL Queries on GPU

Jiaying Feng, Xiaowang Zhang, Zhiyong Feng

Tags: Algorithms, Benchmarking, Computer science, CUDA, Databases, MapReduce, nVidia, nVidia GeForce GTX 590

February 18, 2017 by hgpu

VirtCL: a framework for OpenCL device abstraction and management

Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao

Tags: Computer science, Heterogeneous systems, Memory model, nVidia, nVidia GeForce GTX 580, nVidia GeForce GTX 590, OpenCL, Performance

February 23, 2016 by hgpu

A framework for efficient execution on GPU and CPU+GPU systems

Jean-Francois Dollinger

Tags: Code generation, Computer science, CUDA, Heterogeneous systems, nVidia, nVidia GeForce GTX 590, nVidia GeForce GTX 680, Performance, Thesis

January 19, 2016 by hgpu

Autotuning Stencils Codes with Algorithmic Skeletons

Chris Cummins

Tags: Algorithms, ATI, ATI Radeon HD 7970, Computer science, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GTX 590, nVidia GeForce GTX 690, nVidia GeForce GTX Titan, OpenCL, Package

December 19, 2015 by hgpu

Autotuning OpenCL Workgroup Size for Stencil Patterns

Chris Cummins, Pavlos Petoumenos, Michel Steuwer, Hugh Leather

Tags: ATI, ATI Radeon HD 7970, Computer science, Machine learning, nVidia, nVidia GeForce GTX 590, nVidia GeForce GTX 690, nVidia GeForce GTX Titan, OpenCL, Performance

November 11, 2015 by hgpu

A Parallel Implementation of the Self Organising Map using OpenCL

Gavin Davidson

Tags: Algorithms, Computer science, Machine learning, nVidia, nVidia GeForce GTX 590, OpenCL, Package, Self-organizing map

August 11, 2015 by hgpu

Parallel Unsteady Flow Line Integral Convolution for High-Performance Dense Visualization

Zi'ang Ding, Zhanping Liu, Yang Yu, Wei Chen

Tags: CUDA, Fluid dynamics, Image generation, nVidia, nVidia GeForce GTX 590, Visualization

March 28, 2015 by hgpu

Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Zhen Tian, Fei Peng, Michael Folkerts, Jun Tan, Xun Jia, Steve B. Jiang

Tags: CUDA, Medical Physics, nVidia, nVidia GeForce GTX 590, Physics

March 6, 2015 by hgpu

Performance Improvement of Multichannel Audio by Graphics Processing Units

Jose Antonio Belloch Rodriguez

Tags: Acoustics, Algorithms, Computer science, CUDA, Databases, Filtering, nVidia, nVidia GeForce GTS 360 M, nVidia GeForce GTX 590, nVidia GeForce GTX 690, nVidia Quadro FX 5800, OpenCL, Rendering, Signal processing, Tesla C2070, Tesla K20, Thesis

October 11, 2014 by hgpu
