high performance computing on graphics processing units: hgpu.org

hgpu.org » ATI FirePro V7800

PyFAI: a Python library for high performance azimuthal integration on GPU

Jerome Kieffer, Giannis Ashiotis

View

Download (PDF)

Source codes

Tags: Astrophysics, ATI, ATI FirePro V7800, Instrumentation and Methods for Astrophysics, Intel Xeon Phi, nVidia, nVidia GeForce GTX 750 Ti, OpenCL, Package, PyOpenCL, Python, Tesla K20

December 22, 2014 by hgpu

Towards Portable Performance for Explicit Hydrodynamics Codes

A. C. Mallinson, D. A. Beckingsale, W. P. Gaudin, J. A. Herdman, S. A. Jarvis

View

Download (PDF)

Source codes

Tags: APU, ATI, ATI FirePro V7800, Computer science, Fluid dynamics, MPI, nVidia, OpenCL, Package, Performance, Tesla K20

January 14, 2014 by hgpu

OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures

Shuhao Zhang, Jiong He, Bingsheng He, Mian Lu

View

Download (PDF)

Tags: AMD, APU, ATI, ATI FirePro V7800, Computer science, Databases, nVidia Quadro K5000, OpenCL

July 12, 2013 by hgpu

Evaluating the Performance of Legacy Applications on Emerging Parallel Architectures

Simon John Pennycook

View

Download (PDF)

Source codes

Tags: Algorithms, ATI, ATI FirePro V7800, Benchmarking, Computer science, CUDA, MPI, nVidia, nVidia GeForce 8400 GS, nVidia GeForce 9800 GT, nVidia GeForce GTX 680, OpenCL, Performance, Tesla C1060, Tesla C2050, Thesis

May 21, 2013 by hgpu

An Investigation of the Performance Portability of OpenCL

S.J. Pennycook, S.D. Hammond, S.A. Wright, J.A. Herdman, I. Miller, S.A. Jarvis

View

Download (PDF)

Source codes

Tags: ATI, ATI FirePro V7800, Benchmarking, Computer science, CUDA, Fortran, MPI, nVidia, OpenCL, Package, Performance, Tesla C1060, Tesla C2050

May 21, 2013 by hgpu

Developing Performance-Portable Molecular Dynamics Kernels in OpenCL

S. J. Pennycook, S. A. Jarvis

View

Download (PDF)

Source codes

Tags: ATI, ATI FirePro V7800, ATI Radeon HD 6550, Benchmarking, Computer science, Molecular dynamics, nVidia, nVidia GeForce GTX 680, OpenCL, Package, Performance, Tesla C1060, Tesla C2050

February 14, 2013 by hgpu

PyFAI, a versatile library for azimuthal regrouping

Jerome Kieffer, Dimitrios Karkoulis

View

Download (PDF)

Source codes

Tags: ATI, ATI FirePro V7800, Computer science, nVidia, nVidia GeForce GTX 580, OpenCL, Package, Python, Tesla C2075

November 25, 2012 by hgpu

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

Ewa Niewiadomska-Szynkiewicz, Michal Marks, Jaroslaw Jantura, Mikolaj Podbielski

View

Download (PDF)

Tags: ATI, ATI FirePro V7800, ATI Radeon HD 6970, Computer science, Distributed computing, GPU cluster, nVidia, OpenCL, Security, Tesla M2050

October 19, 2012 by hgpu

Heterogeneous GPU&CPU cluster for High Performance Computing in cryptography

Michal Marks, Jaroslaw Jantura, Ewa Niewiadomska-Szynkiewicz, Przemyslaw Strzelczyk, Krzysztof Gozdz

View

Download (PDF)

Tags: Algorithms, AMD FirePro V7900, AMD FireStream 9350, ATI, ATI FirePro V5800, ATI FirePro V7800, Computer science, Distributed computing, GPU cluster, Heterogeneous systems, nVidia, OpenCL, Security, Tesla M2050

October 4, 2012 by hgpu

Acceleration of CFD and data analysis using graphics processors

Ali Khajeh Saeed

View

Download (PDF)

Tags: ATI, ATI FirePro V7800, ATI Stream, CUDA, Fluid dynamics, nVidia, Tesla C2070, Thesis

April 16, 2012 by hgpu

Ensemble K-means on multi-core architectures

Girish Ravunnikutty, Rejith George Joseph, Sanjay Ranka, Alin Dobra

View

Download (PDF)

Tags: Algorithms, ATI, ATI FirePro V7800, Clustering, Computer science, nVidia, nVidia GeForce GTX 480, OpenCL

February 10, 2012 by hgpu

An OpenCL implementation for the solution of TDSE on GPU and CPU architectures

Cathal O'Broin, Lampros A. A. Nikolopoulos

View

Download (PDF)

Tags: Algorithms, ATI, ATI FirePro V7800, Computational Physics, Differential equations, nVidia, ODEs, OpenCL, Ordinary differential equations, Physics, Quantum Physics, Tesla S1070

January 31, 2012 by hgpu

Specx: Speculative task-based runtime system

Specx: a C++ task-based runtime system for heterogeneous distributed architectures

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

KIS-S: A GPU-Aware Kubernetes Inference Simulator with RL-Based Auto-Scaling

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

ParEval-Repo: A Benchmark Suite for Evaluating LLMs with Repository-level HPC Translation Tasks

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

Libra: Synergizing CUDA and Tensor Cores for High-Performance Sparse Matrix Multiplication

exa-AMD: Exascale Accelerated Materials Discovery

Accelerated discovery and design of Fe-Co-Zr magnets with tunable magnetic anisotropy through machine learning and parallel computing

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

No More Shading Languages: Compiling C++ to Vulkan Shaders

See all packages

* * *

high performance computing on graphics processing units: hgpu.org

PyFAI: a Python library for high performance azimuthal integration on GPU

Towards Portable Performance for Explicit Hydrodynamics Codes

OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures

Evaluating the Performance of Legacy Applications on Emerging Parallel Architectures

An Investigation of the Performance Portability of OpenCL

Developing Performance-Portable Molecular Dynamics Kernels in OpenCL

PyFAI, a versatile library for azimuthal regrouping

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

Heterogeneous GPU&CPU cluster for High Performance Computing in cryptography

Acceleration of CFD and data analysis using graphics processors

Ensemble K-means on multi-core architectures

An OpenCL implementation for the solution of TDSE on GPU and CPU architectures

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

Most viewed papers (last 30 days)