high performance computing on graphics processing units: hgpu.org

hgpu.org » ATI Stream

An Optimized Large-Scale Hybrid DGEMM Design for CPUs and ATI GPUs

Jiajia Li, Xingjian Li, Guangming Tan, Mingyu Chen, Ninghui Sun

Tags: Algorithms, ATI, ATI CAL, ATI Radeon HD 5970, ATI Stream, Computer science, Heterogeneous systems, Matrix multiplication, Optimization, Package

June 29, 2012 by hgpu

Acceleration of CFD and data analysis using graphics processors

Ali Khajeh Saeed

Tags: ATI, ATI FirePro V7800, ATI Stream, CUDA, Fluid dynamics, nVidia, Tesla C2070, Thesis

April 16, 2012 by hgpu

General purpose computing on graphics processing units using OpenCL

Mats Johansson, Oscar Winter

Tags: Algorithms, ATI, ATI Radeon HD 4870, ATI Stream, Computer science, CUDA, nVidia, nVidia GeForce 8800 GTS, nVidia GeForce GTX 480, OpenCL, Optical flow, Optimization, Thesis

October 12, 2011 by hgpu

A Complete Description of the UnPython and Jit4GPU Framework

Rahul Garg, Jose Nelson Amaral

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5850, ATI Stream, Compilers, Computer science, OpenMP, Optimization, Performance, Python

October 2, 2011 by hgpu

Using many-core hardware to correlate radio astronomy signals

Rob V. van Nieuwpoort, John W. Romein

Tags: Algorithms, Astrophysics, ATI, ATI CAL, ATI Radeon HD 4870, ATI Stream, Cell processor, CUDA, nVidia, Performance, Physics, Signal processing, Tesla C1060

September 19, 2011 by hgpu

Towards GPGPU Assisted Computing in Virtualized Environments

Thilo Schmitt, Alexander Weggerle, Christian Himpel, Peter Schulthess

Tags: ATI, ATI Stream, Computer science, CUDA, nVidia, OpenCL, Review, Virtualization

September 8, 2011 by hgpu

A GPU Accelerated Algorithm for Compressive Sensing Based Image Super-Resolution

Xifei Wu, Hui Xiang, Peng Lu

Tags: Algorithms, ATI, ATI Stream, Compression, Computer science, Image reconstruction

September 2, 2011 by hgpu

FPGA and GPU implementation of large scale SpMV

Yi Shan, Tianji Wu, Yu Wang, Bo Wang, Zilong Wang, Ningyi Xu, Huazhong Yang

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, FPGA, Sparse matrix

July 10, 2011 by hgpu

Scientific and Engineering Computing Using ATI Stream Technology

Amr Bayoumi, Michael Chu, Yasser Hanafy, Patricia Harrell, Gamal Refai-Ahmed

Tags: AMD FireStream 9270, ATI, ATI Stream, Brook, Computer science, OpenMP, Programming techniques, Review

June 26, 2011 by hgpu

Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing

Canqun Yang, Feng Wang, Yunfei Du, Juan Chen, Jie Liu, Huizhan Yi, Kai Lu

Tags: ATI, ATI Radeon HD 4870, ATI Stream, Computer science, Heterogeneous systems, Linear Algebra, Optimization, Presentation

June 9, 2011 by hgpu

A Micro-benchmark Suite for AMD GPUs

Ryan Taylor, Xiaoming Li

Tags: ATI, ATI IL, ATI Stream, Benchmarking, Computer science, Presentation, Review, RV770, RV870

May 30, 2011 by hgpu

Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis

Di Wu, Tianji Wu, Yi Shan, Yu Wang, Yong He, Ningyi Xu, Huazhong Yang

Tags: ATI, ATI IL, ATI Radeon HD 5870, ATI Stream, Medicine, Neurons and Cognition, Neuroscience

April 26, 2011 by hgpu
