high performance computing on graphics processing units: hgpu.org

hgpu.org » ATI Radeon HD 6550

Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing

Konstantinos Krommydas

View

Download (PDF)

Source codes

Tags: ATI, ATI Radeon HD 6550, ATI Radeon HD 7660, ATI Radeon HD 7970, Code generation, Compilers, Computer science, FPGA, Heterogeneous systems, Intel Xeon Phi, nVidia, OpenCL, Package, Performance, Tesla K20, Thesis

May 9, 2017 by hgpu

Improving the Performance of the Contextual Spaces Re-Ranking Algorithm on Heterogeneous Systems

Flavia Pisani, Daniel C. G. Pedronette, Ricardo da S. Torres, Edson Borin

View

Download (PDF)

Tags: Algorithms, ATI, ATI Radeon HD 6550, Computer science, Heterogeneous systems, List ranking, OpenCL

September 5, 2016 by hgpu

OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures

Konstantinos Krommydas, Wu-chun Feng, Christos D. Antonopoulos, Nikolaos Bellas

View

Download (PDF)

Source codes

Tags: ATI, ATI Radeon HD 6550, ATI Radeon HD 7660, ATI Radeon HD 7970, Computer science, FPGA, Heterogeneous systems, Intel Xeon Phi, OpenCL, Package

December 22, 2015 by hgpu

Behavioral Non-portability in Scientific Numeric Computing

Yijia Gu, Thomas Wahl, Mahsa Bayati, Miriam Leeser

View

Download (PDF)

Tags: Algorithms, ATI, ATI Radeon HD 6550, Computer science, Heterogeneous systems, nVidia, nVidia Quadro 600, OpenCL, Tesla C2075, Tesla K20

December 12, 2015 by hgpu

Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory

Roshan Dathathri, Chandan Reddy, Thejas Ramashekar, Uday Bondhugula

View

Download (PDF)

Tags: ATI, ATI FirePro V4800, ATI Radeon HD 6550, Code generation, Computer science, Heterogeneous systems, nVidia, Tesla C2050

July 1, 2013 by hgpu

A Framework for Profiling and Performance Monitoring of Heterogeneous Applications

Perhaad Mistry, Yash Ukidave, Dana Schaa, David Kaeli

View

Download (PDF)

Tags: ATI, ATI Radeon HD 5870, ATI Radeon HD 6550, Computer science, Computer vision, Heterogeneous systems, OpenCL, Performance, Signal processing

April 17, 2013 by hgpu

Load Balancing in a Changing World: Dealing with Heterogeneity and Performance Variability

Michael Boyer, Kevin Skadron, Shuai Che, Nuwan Jayasena

View

Download (PDF)

Tags: ATI, ATI Radeon HD 6550, ATI Radeon HD 6670, Computer science, Heterogeneous systems, OpenCL, Performance, Task scheduling

April 8, 2013 by hgpu

Parallel Sorting on the Heterogeneous AMD Fusion Accelerated Processing Unit

Michael Christopher Delorme

View

Download (PDF)

Tags: APU, ATI, ATI Radeon HD 5870, ATI Radeon HD 6530, ATI Radeon HD 6550, Computer science, Heterogeneous systems, nVidia, OpenCL, Sorting, Thesis, Tutorial

March 26, 2013 by hgpu

Developing Performance-Portable Molecular Dynamics Kernels in OpenCL

S. J. Pennycook, S. A. Jarvis

View

Download (PDF)

Source codes

Tags: ATI, ATI FirePro V7800, ATI Radeon HD 6550, Benchmarking, Computer science, Molecular dynamics, nVidia, nVidia GeForce GTX 680, OpenCL, Package, Performance, Tesla C1060, Tesla C2050

February 14, 2013 by hgpu

Accelerating MapReduce on a coupled CPU-GPU architecture

Linchuan Chen, Xin Huo, Gagan Agrawal

View

Download (PDF)

Tags: APU, ATI, ATI Radeon HD 6550, Computer science, Data mining, Heterogeneous systems, Machine learning, MapReduce, OpenCL

November 20, 2012 by hgpu

A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU+ Fast Matrix Multiply

Paolo D'Alberto

View

Download (PDF)

Tags: ATI, ATI Radeon HD 6550, ATI Radeon HD 6750, Computer science, Heterogeneous systems, Mathematical Software, Matrix multiplication, OpenCL

May 16, 2012 by hgpu

iGPU: Exception Support and Speculative Execution on GPUs

Jaikrishnan Menon, Marc de Kruijf, Karthikeyan Sankaralingam

View

Download (PDF)

Tags: ATI, ATI Radeon HD 6550, Computer science, nVidia, nVidia GeForce GTX 580, OpenCL, Performance, Programming techniques

May 7, 2012 by hgpu

SimSYCL: Synchronous, single-threaded, library-only SYCL implementation for debugging and verification

SimSYCL: A SYCL Implementation Targeting Development, Debugging, Simulation and Conformance

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 second

94% on CIFAR-10 in 3.29 Seconds on a Single GPU

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers

OpenMC Monte Carlo Code

Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs

Polygeist: C/C++ frontend for MLIR

Retargeting and Respecializing GPU Workloads for Performance Portability

Parallel Gaussian process with kernel approximation in CUDA

See all packages

* * *

high performance computing on graphics processing units: hgpu.org

Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing

OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures

Behavioral Non-portability in Scientific Numeric Computing

Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory

A Framework for Profiling and Performance Monitoring of Heterogeneous Applications

Load Balancing in a Changing World: Dealing with Heterogeneity and Performance Variability

Parallel Sorting on the Heterogeneous AMD Fusion Accelerated Processing Unit

Developing Performance-Portable Molecular Dynamics Kernels in OpenCL

Accelerating MapReduce on a coupled CPU-GPU architecture

A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU+ Fast Matrix Multiply

iGPU: Exception Support and Speculative Execution on GPUs

Recent source codes

SimSYCL: Synchronous, single-threaded, library-only SYCL implementation for debugging and verification

GPU plugin for PySCF

QArray

Celerity: High-level C++ for Accelerator Clusters

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 second

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Most viewed papers (last 30 days)