high performance computing on graphics processing units: hgpu.org

A New Architecture for Games and Simulations Using GPUs

Mark Joselli, Cristina Nader Vasconcelos, Esteban Clua

Tags: Artificial intelligence, Computer science, CUDA, Game physics, Games, nVidia, nVidia GeForce 8800 GTS, OpenGL, Playstation, Xbox 360

April 14, 2014 by hgpu

Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures

Slo-Li Chu, Chih-Chieh Hsiao

Tags: ATI, ATI Radeon HD 4670, ATI Radeon HD 5850, Computer science, Heterogeneous systems, nVidia, nVidia GeForce GT 220, nVidia GeForce GTX 285, OpenCL, Optimization, Playstation

July 19, 2013 by hgpu

On the Cryptanalysis of Public-Key Cryptography

Joppe Willem Bos

Tags: Computer science, CUDA, Elliptic curves, Factorization, Modular arithmetic, nVidia, Playstation, Security, Thesis

April 2, 2012 by hgpu

Dynamic adaptation and distribution of binaries to heterogeneous architectures

Espen Angell Kristiansen

Tags: Computer science, CUDA, Distributed computing, Heterogeneous systems, nVidia, nVidia GeForce GT 220, nVidia GeForce GTX 280, OpenCL, Package, Playstation, Thesis

November 22, 2011 by hgpu

A capabilities-aware framework for using computational accelerators in data-intensive computing

M. Mustafa Rafique, Ali R. Butt, Dimitrios S. Nikolopoulos

Tags: Cell processor, Cloud, Computer science, CUDA, GPU cluster, Heterogeneous systems, MapReduce, nVidia, nVidia GeForce 9600 M GT, Playstation

November 14, 2011 by hgpu

Analyzing Use of OpenCL on the Cell Broadband Engine and a Proposal for OpenCL Extensions

Jens Breitbart, Claudia Fohry

Tags: Cell processor, Computer science, Distributed computing, Heterogeneous systems, Matrix multiplication, nVidia, nVidia GeForce GTX 280, OpenCL, Optimization, Playstation

September 24, 2011 by hgpu

Reusable software components for accelerator-based clusters

M. Mustafa Rafique, Ali R. Butt, Eli Tilevich

Tags: Computer science, CUDA, Heterogeneous systems, nVidia, nVidia GeForce 9600 M GT, Performance, Playstation

August 22, 2011 by hgpu

Acceleration of Multiresolution Imaging Algorithms: A Comparative Study

Richard Membarth, Philipp Kutzer, Hritam Dutta, Frank Hannig, Jürgen Teich

Tags: Cell processor, CUDA, Filtering, Image processing, nVidia, Playstation, Tesla C870

July 3, 2011 by hgpu

A Tuning Framework for Software-Managed Memory Hierarchies

Manman Ren, Ji Young Park, Mike Houston, Alex Aiken, William J. Dally

Tags: Algorithms, Cell processor, Compilers, Computer science, Optimization, Performance, Playstation, Programming Languages

February 26, 2011 by hgpu

An experimental study on performance portability of OpenCL kernels

Sean Rul, Hans Vandierendonck, Joris D'Haene, Koen De Bosschere

Tags: ATI, ATI FirePro V8700, Cell processor, Computer science, nVidia, OpenCL, Performance, Playstation, Tesla C1060

February 16, 2011 by hgpu

Mixing Multi-Core CPUs and GPUs for Scientific Simulation Software

K.A. Hawick, A. Leist, D.P. Playne

Tags: Cell processor, Computer science, CUDA, Data parallelism, Finite difference, nVidia, nVidia GeForce GTX 295, OpenCL, Partial differential equations, PDEs, Playstation, Pseudo-random number generators

February 14, 2011 by hgpu

Comparing Intra- and Inter-Processor Parallelism on Multi-Core Cell Processors for Scientific Simulations

K.A. Hawick, A. Leist, D.P. Playne, M.J. Johnson

Tags: Cell processor, Computer science, Playstation, Pseudo-random number generators

February 14, 2011 by hgpu
