high performance computing on graphics processing units: hgpu.org

hgpu.org » ATI IL

Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level

Anthony Gutierrez, Bradford M. Beckmann, Alexandru Dutu, Joseph Gross, John Kalamatianos, Onur Kayiran, Michael LeBeane, Matthew Poremba, Brandon Potter, Sooraj Puthoor, Matthew D. Sinclair, Mark Wyse, Jieming Yin, Xianwei Zhang, Akshay Jain, Timothy G. Rogers

Tags: ATI IL, Computer science, HSA

February 10, 2018 by hgpu

Device specialization in heterogeneous multi-GPU environments

Gabriele Cocco, Antonio Cisternino

Tags: AMD Fusion, ATI, ATI IL, ATI Radeon HD 5870, Computer science, GPU cluster, Heterogeneous systems, Task scheduling

November 15, 2012 by hgpu

Implementation of a Parallel Tree Method on a GPU

Naohito Nakasato

Tags: Astrophysics, ATI, ATI IL, ATI Radeon HD 5870, Computational Complexity, Computer science, Galaxy Astrophysics, Instrumentation and Methods for Astrophysics, KD-tree, OpenCL, Optimization, Performance

December 21, 2011 by hgpu

A Complete Description of the UnPython and Jit4GPU Framework

Rahul Garg, Jose Nelson Amaral

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5850, ATI Stream, Compilers, Computer science, OpenMP, Optimization, Performance, Python

October 2, 2011 by hgpu

A fast GEMM implementation on the Cypress GPU

Naohito Nakasato

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, Computer science, Linear Algebra, Matrix multiplication, Performance

August 22, 2011 by hgpu

Caracal: dynamic translation of runtime environments for GPUs

Rodrigo Dominguez, Dana Schaa, David Kaeli

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, Computer science, CUDA, nVidia, nVidia GeForce GTX 480, OpenCL, Package, Performance, Programming Languages, PTX

August 19, 2011 by hgpu

FPGA and GPU implementation of large scale SpMV

Yi Shan, Tianji Wu, Yu Wang, Bo Wang, Zilong Wang, Ningyi Xu, Huazhong Yang

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, FPGA, Sparse matrix

July 10, 2011 by hgpu

A compiler for high performance computing with many-core accelerators

Naohito Nakasato, Jun Makino

Tags: ATI, ATI CAL, ATI IL, Compilers, Computer science, RV770

June 15, 2011 by hgpu

A Micro-benchmark Suite for AMD GPUs

Ryan Taylor, Xiaoming Li

Tags: ATI, ATI IL, ATI Stream, Benchmarking, Computer science, Presentation, Review, RV770, RV870

May 30, 2011 by hgpu

Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis

Di Wu, Tianji Wu, Yi Shan, Yu Wang, Yong He, Ningyi Xu, Huazhong Yang

Tags: ATI, ATI IL, ATI Radeon HD 5870, ATI Stream, Medicine, Neurons and Cognition, Neuroscience

April 26, 2011 by hgpu

Efficient PageRank and SpMV Computation on AMD GPUs

Tianji Wu, Bo Wang, Yi Shan, Feng Yan, Yu Wang, Ningyi Xu

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, List ranking, OpenCL, Sparse matrix

April 1, 2011 by hgpu

A Fast GEMM Implementation On a Cypress GPU

Naohito Nakasato

Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, Linear Algebra

March 18, 2011 by hgpu

QArray

QArray: a GPU-accelerated constant capacitance model simulator for large quantum dot arrays

Celerity: High-level C++ for Accelerator Clusters

Balancing Tracking Granularity and Parallelism in Many-Task Systems: The Horizons Approach

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 seconds

94% on CIFAR-10 in 3.29 Seconds on a Single GPU

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers

OpenMC Monte Carlo Code

Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs

Polygeist: C/C++ frontend for MLIR

Retargeting and Respecializing GPU Workloads for Performance Portability

Parallel Gaussian process with kernel approximation in CUDA

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

SYCL in the edge: performance and energy evaluation for heterogeneous acceleration

OpenMP5-Offload-OpenMC-Intel-PVC

Distributed OpenMP Offloading of OpenMC on Intel GPU MAX Accelerators

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
