high performance computing on graphics processing units: hgpu.org

Posts

Sep, 19

Gauge fixing using overrelaxation and simulated annealing on GPUs

We adopt CUDA-capable Graphics Processing Units (GPUs) for Coulomb, Landau and maximally Abelian gauge fixing in 3+1 dimensional SU(3) lattice gauge field theories. The local overrelaxation algorithm is perfectly suited for highly parallel architectures. Simulated annealing preconditioning strongly increases the probability of reaching the global maximum of the gauge functional. We give performance results for […]

CUDA

Sep, 18

Implementation of QR Updating Algorithms on the GPU

The least squares problem is an extremely useful device to represent an approximate solution to overdetermined systems, and a QR factorisation is a common method for solving least squares problems. It is often the case that multiple least squares solutions have to be computed with only minor changes in the underlying data. In this case, […]

CUDA

Sep, 18

The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing

GPU computing has emerged in recent years as a viable execution platform for throughput oriented applications or regions of code. GPUs started out as independent units for program execution but there are clear trends towards tight-knit CPU-GPU integration. In this work, we will examine existing research directions and future opportunities for chip integrated CPU-GPU systems. […]

Sep, 18

Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit

We present an implementation of the analysis of dynamic near field scattering (NFS) data using a graphics processing unit (GPU). We introduce an optimized data management scheme thereby limiting the number of operations required. Overall, we reduce the processing time from hours to minutes, for typical experimental conditions. Previously the limiting step in such experiments, […]

CUDA

Sep, 18

High-throughput Execution of Hierarchical Analysis Pipelines on Hybrid Cluster Platforms

We propose, implement, and experimentally evaluate a runtime middleware to support high-throughput execution on hybrid cluster machines of large-scale analysis applications. A hybrid cluster machine consists of computation nodes which have multiple CPUs and general purpose graphics processing units (GPUs). Our work targets scientific analysis applications in which datasets are processed in application-specific data chunks, […]

CUDA

Sep, 18

Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines

In this paper, we address the problem of efficient execution of a computation pattern, referred to here as the irregular wavefront propagation pattern (IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in several image processing operations. In the IWPP, data elements in the wavefront propagate waves to their neighboring elements […]

CUDA

Sep, 17

The 4th International Workshop of GPU and MIC Solutions to Multiscale Problems in Science and Engineering (GPU-SMP’2013), 2013

TOPICS OF INTEREST AT THE CONFERENCE: Topics include, but are not restricted to: 1. Large-scale problems using GPU and hybrid systems 2. Physical, chemical, biological, geological and industrial applications 3. Techniques for optimizing kernels in GPU and other many-core systems (MIC) 4. Mixed precision computing 5. Benchmarking and performance evaluation for GPU, MIC, and hybrid systems 6. Visualization tools and techniques […]

Sep, 17

Seismic damage simulation for urban buildings based on high-performance GPU computing

Refined models have been an important development trend of urban regional seismic damage prediction. However, the application of refined models has been limited due to their high computational cost if implemented on traditional Central Processing Unit (CPU) platforms. In recent years, Graphics Processing Unit (GPU) technology has been developed and applied rapidly due to its […]

CUDA

Sep, 17

A Simulation Framework for Scheduling Performance Evaluation on CPU-GPU Heterogeneous System

Modern PCs are equipped with multi-many core capabilities which enhance their computational power and address important issues related to the efficiency of the scheduling processes of the modern operating system in such hybrid architectures. The aim of our work is to implement a simulation framework devoted to the study of the scheduling process in hybrid […]

OpenCL

Sep, 17

A GPU operations framework for WattDB

In recent decades, rising energy consumption and production have become one of the main problems of humanity. Energy efficiency can help save energy. GPUs are an example of highly energy-efficient hardware. However, energy efficiency is not enough; energy proportionality is also needed. The objective of this work is to create an entire platform that allows execution […]

CUDA

Sep, 17

Parallel hybrid SAT solving using OpenCL

In the last few decades there have been substantial improvements in approaches for solving the Boolean satisfiability problem. Many of these consisted of elaborating on existing algorithms, for both complete and incomplete solvers. Besides the improvements to existing solving methods, however, recent evolutions in SAT solving take […]

OpenCL

Sep, 16

Massive parallelization of combinatorial statistical genetics analyses porting machine learning methods on general purpose graphics processing units (GPU)

Recent advances in sequencing technology and automated phenotyping render it possible to study the relationship between genotype and phenotype at an unprecedented level of detail. While mapping phenotypes to single loci in the genome is a standard technique in Statistical Genetics, the problem of epistasis search, that is mapping phenotypes to pairs of loci, remains […]

CUDA

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

Recent source codes

SimSYCL: Synchronous, single-threaded, library-only SYCL implementation for debugging and verification

GPU plugin for PySCF

QArray

Celerity: High-level C++ for Accelerator Clusters

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 seconds

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Most viewed papers (last 30 days)