high performance computing on graphics processing units: hgpu.org

Posts

Sep, 18

The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing

GPU computing has emerged in recent years as a viable execution platform for throughput-oriented applications or regions of code. GPUs started out as independent units for program execution, but there are clear trends towards tight-knit CPU-GPU integration. In this work, we will examine existing research directions and future opportunities for chip-integrated CPU-GPU systems. […]

Sep, 18

Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit

We present an implementation of the analysis of dynamic near field scattering (NFS) data using a graphics processing unit (GPU). We introduce an optimized data management scheme thereby limiting the number of operations required. Overall, we reduce the processing time from hours to minutes, for typical experimental conditions. Previously the limiting step in such experiments, […]

CUDA

Sep, 18

High-throughput Execution of Hierarchical Analysis Pipelines on Hybrid Cluster Platforms

We propose, implement, and experimentally evaluate a runtime middleware to support high-throughput execution on hybrid cluster machines of large-scale analysis applications. A hybrid cluster machine consists of computation nodes which have multiple CPUs and general purpose graphics processing units (GPUs). Our work targets scientific analysis applications in which datasets are processed in application-specific data chunks, […]

CUDA

Sep, 18

Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines

In this paper, we address the problem of efficient execution of a computation pattern, referred to here as the irregular wavefront propagation pattern (IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in several image processing operations. In the IWPP, data elements in the wavefront propagate waves to their neighboring elements […]

CUDA

Sep, 17

The 4th International Workshop of GPU and MIC Solutions to Multiscale Problems in Science and Engineering (GPU-SMP’2013), 2013

TOPICS OF INTEREST AT THE CONFERENCE: Topics include, but are not restricted to: 1. Large-scale problems using GPU and hybrid systems 2. Physical, chemical, biological, geological and industrial applications 3. Techniques for optimizing kernels in GPU and other many-core systems (MIC) 4. Mixed precision computing 5. Benchmarking and performance evaluation for GPU, MIC, and hybrid systems 6. Visualization tools and techniques […]

Sep, 17

Seismic damage simulation for urban buildings based on high-performance GPU computing

Refined models have been an important development trend of urban regional seismic damage prediction. However, the application of refined models has been limited due to their high computational cost if implemented on traditional Central Processing Unit (CPU) platforms. In recent years, Graphics Processing Unit (GPU) technology has been developed and applied rapidly due to its […]

CUDA

Sep, 17

A Simulation Framework for Scheduling Performance Evaluation on CPU-GPU Heterogeneous System

Modern PCs are equipped with multi- and many-core capabilities which enhance their computational power and address important issues related to the efficiency of the scheduling processes of the modern operating system in such hybrid architectures. The aim of our work is to implement a simulation framework devoted to the study of the scheduling process in hybrid […]

OpenCL

Sep, 17

A GPU operations framework for WattDB

In recent decades, rising energy consumption and production have become one of the main problems of humanity. Energy efficiency can help save energy. GPUs are an example of highly energy-efficient hardware. However, energy efficiency is not enough; energy proportionality is needed. The objective of this work is to create an entire platform that allows execution […]

CUDA

Sep, 17

Parallel hybrid SAT solving using OpenCL

In the last few decades there have been substantial improvements in approaches for solving the Boolean satisfiability problem. Many of these consisted in elaborating on existing algorithms, both for complete solvers and for incomplete solvers. Besides the improvements to existing solving methods, however, recent evolutions in SAT solving take […]

OpenCL

Sep, 16

Massive parallelization of combinatorial statistical genetics analyses porting machine learning methods on general purpose graphics processing units (GPU)

Recent advances in sequencing technology and automated phenotyping render it possible to study the relationship between genotype and phenotype at an unprecedented level of detail. While mapping phenotypes to single loci in the genome is a standard technique in Statistical Genetics, the problem of epistasis search, that is mapping phenotypes to pairs of loci, remains […]

CUDA

Sep, 16

Parallel Benefit on Different Programming Paradigms

Multi-core platforms have become ubiquitous. Even laptops now contain multi-core processors. There are multiple cores in a chip, socket, or die, and a computing node contains multiple chips. Multi-core platforms are spreading rapidly, and the number of cores on these platforms is increasing rapidly too. How to enjoy the benefits of parallel computing on the […]

CUDA

Sep, 16

Parallel Implementation of Moving Averages and Stock Market Prediction

In recent years, graphics processing units have made parallel processing affordable at the price of personal desktop computers. This report investigates the computational aspects of calculating simple moving average and exponential moving average operations, two of the most popular financial indicators. In this report, we also investigate the usage of GPU to run artificial neural […]

CUDA

* * *

Recent source codes

Kernel Library for LLM Serving

Adaptivity in AdaptiveCpp: Optimizing Performance by Leveraging Runtime Information During JIT-Compilation

Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs

Genten: Software for Generalized Tensor Decompositions by Sandia National Laboratories

Interleaved Learning and Exploration: A Self-Adaptive Fuzz Testing Framework for MLIR

Pinocchio: PINpointing Orbit Crossing Collapsed Hierarchical Objects

KernelCoder: trained on a curated dataset of reasoning traces and CUDA kernel pairs

VibeCodeHPC - Multi Agentic Vibe Coding for HPC

Compile-Time Resource Safety for GPU APIs: A Low-Overhead Typestate Framework

exa-AMD: Exascale Accelerated Materials Discovery

Most viewed papers (last 30 days)