high performance computing on graphics processing units: hgpu.org

Posts

Jun, 6

The Tradeoffs of Fused Memory Hierarchies in Heterogeneous Computing Architectures

With the rise of general purpose computing on graphics processing units (GPGPU), the influence from consumer markets can now be seen across the spectrum of computer architectures. In fact, many of the high-ranking Top500 HPC systems now include these accelerators. Traditionally, GPUs have connected to the CPU via the PCIe bus, which has proved to […]

OpenCL

Jun, 6

Relativistic Hydrodynamics on Graphic Cards

We show how to accelerate relativistic hydrodynamics simulations using graphic cards (graphic processing units, GPUs). These improvements are of highest relevance e.g. to the field of high-energetic nucleus-nucleus collisions at RHIC and LHC where (ideal and dissipative) relativistic hydrodynamics is used to calculate the evolution of hot and dense QCD matter. The results reported here […]

OpenCL

Jun, 6

Parallel Spherical Harmonic Transforms on heterogeneous architectures (GPUs/multi-core CPUs)

Spherical Harmonic Transforms (SHT) are at the heart of many scientific and practical applications ranging from climate modelling to cosmological observations. In many of these areas new, cutting-edge science goals have been recently proposed requiring simulations and analyses of experimental or observational data at very high resolutions and of unprecedented volumes. Both these aspects pose […]

CUDA

Jun, 5

European Seminar on Computing, ESCO 2012

ESCO 2012 is the 3rd event in a successful series of interdisciplinary meetings dedicated to modern methods and practices of scientific computing. Main thematic areas include: Multiphysics coupled problems, Higher-order computational methods, Computing with Python, GPU computing, and Cloud computing. Theoretical results as well as applications are welcome. Application areas include, but are not limited […]

Jun, 5

2nd International Conference on Information Management in the Knowledge Economy, IMKE – 2013

The International Conference on Information Management in the Knowledge Economy is a multidisciplinary Conference on digital information management, science and technology. The principal aim of this conference is to bring professionals in academia, research laboratories and industry together, and offer a collaborative platform to address the emerging issues and solutions in digital information science and […]

Jun, 5

Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2013

ASPLOS is the premier forum for multidisciplinary systems research spanning computer architecture and hardware, programming languages and compilers, operating systems and security, as well as applications and human-computer interaction. The importance of such crosscutting systems research has been growing hand in hand with the amount of parallelism in hardware, the scope of distribution in internet-scale […]

Jun, 5

A new parallelisation technique for heterogeneous CPUs

Parallelization has moved in recent years into the mainstream compilers, and the demand for parallelizing tools that can do a better job of automatic parallelization is higher than ever. During the last decade considerable attention has been focused on developing programming tools that support both explicit and implicit parallelism to keep up with the power […]

Jun, 5

Using visualization to reveal weak cryptosystems

My thesis explains how we can apply techniques borrowed from the area of visualization to reveal weaknesses in applications and cryptosystems. The first half of the thesis presents how graphics processing units can be used for general computing. The second half provides an overview of basic techniques and applies these […]

Jun, 5

Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters

Hierarchical level of heterogeneity exists in many modern high performance clusters in the form of heterogeneity between computing nodes, and within a node with the addition of specialized accelerators, such as GPUs. To achieve high performance of scientific applications on these platforms it is necessary to perform load balancing. In this paper we present a […]

Jun, 5

Shortening design time through multiplatform simulations with a portable OpenCL golden-model: the LDPC decoder case

Hardware designers and engineers typically need to explore a multi-parametric design space in order to find the best configuration for their designs using simulations that can take weeks to months to complete. For example, designers of special purpose chips need to explore parameters such as the optimal bit width and data representation. This is the […]

OpenCL

Jun, 5

Landau Gauge Fixing on GPUs

In this paper we present and explore the performance of Landau gauge fixing in GPUs using CUDA. We consider the steepest descent algorithm with Fourier acceleration, and compare the GPU performance with a parallel CPU implementation. Using $32^4$ lattice volumes, we find that the computational power of a single Tesla C2070 GPU is equivalent to […]

CUDA

Jun, 4

Platform 2012, a Many-Core Computing Accelerator for Embedded SoCs: Performance Evaluation of Visual Analytics Applications

P2012 is an area- and power-efficient many-core computing accelerator based on multiple globally asynchronous, locally synchronous processor clusters. Each cluster features up to 16 processors with independent instruction streams sharing a multi-banked one-cycle access L1 data memory, a multi-channel DMA engine and specialized hardware for synchronization and aggressive power management. P2012 is 3D stacking ready […]

OpenCL

* * *

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
