high performance computing on graphics processing units: hgpu.org

Posts

Apr, 26

OpenCL-Based FPGA Accelerator for 3D FDTD with Periodic and Absorbing Boundary Conditions

The finite-difference time-domain (FDTD) method is a very popular way of numerically solving partial differential equations. FDTD has a low operational intensity, so its performance on CPUs and GPUs is often restricted by memory bandwidth. Recently, deeply pipelined FPGA accelerators have shown a lot of success by exploiting streaming data flows in […]

OpenCL

Apr, 26

OpenCL JIT Compilation for Dynamic Programming Languages

Graphics Processing Units (GPUs) are powerful hardware for parallelizing and speeding up applications. However, programming these devices is too complex for most users, and the existing standards for GPU programming are available only for low-level languages such as C. Dynamic programming languages offer higher abstractions and functionality for many users. GPU programming is possible for dynamic […]

OpenCL

Apr, 23

4th International Conference on Biomedical and Bioinformatics Engineering (ICBBE), 2017

ICBBE 2017 aims to bring together innovative academics and industrial experts in the field of Biomedical and Bioinformatics Engineering in a common forum. The primary goal of the conference is to promote research and developmental activities in Biomedical and Bioinformatics Engineering. Another goal is to promote scientific information interchange between researchers, developers, engineers, students, and […]

Apr, 23

9th International Conference on Signal Processing Systems (ICSPS), 2017

The 2017 9th International Conference on Signal Processing Systems (ICSPS 2017) is the main annual research conference aiming to bring together top researchers from around the world to exchange research results and address open issues in all aspects of Signal Processing Systems. Publication: two options: (1) Conference Proceedings, Ei Compendex and Scopus and submitted to be reviewed […]

Apr, 23

The 5th International conference on Control, Mechatronics and Automation (ICCMA), 2017

The 2017 5th International Conference on Control, Mechatronics and Automation will be held at the University of Alberta, Canada, on October 11-13, 2017. ICCMA 2013 was held in Sydney, ICCMA 2014 in Dubai, and ICCMA 2015 and ICCMA 2016 both in Barcelona. The idea of the conference is for the scientists, scholars, engineers […]

Apr, 23

8th International Conference on Biology, Environment and Chemistry (ICBEC), 2017

2017 8th International Conference on Biology, Environment and Chemistry (ICBEC 2017) will be held in Busan, South Korea during October 11-13, 2017. ICBEC 2017 is sponsored by the Hong Kong Chemical, Biological & Environmental Engineering Society (HKICBEES). It is one of the leading international conferences for presenting novel and fundamental advances in the fields of […]

Apr, 23

2nd International Conference on Communication and Information Systems (ICCIS), 2017

ICCIS 2017 will be a platform to share experience, foster collaborations across industry and academia, and evaluate emerging technologies across the globe. Publication: Peer-reviewed papers presented at ICCIS 2017 will be published in the conference proceedings, which will be submitted for Ei Compendex and Scopus indexing. Submission Methods: Full Paper (publication and oral […]

Apr, 20

Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL

Increasingly complex memory systems and on-chip interconnects are developed to mitigate the data movement bottlenecks in manycore processors. One example of such a complex system is the Xeon Phi KNL CPU with three different types of memory, fifteen memory configuration options, and a complex on-chip mesh network connecting up to 72 cores. Users require a […]

Apr, 20

Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption

Many modern parallel computing systems are heterogeneous at the node level. Such nodes may comprise general-purpose CPUs and accelerators (such as GPUs or Intel Xeon Phi) that provide high performance with suitable energy-consumption characteristics. However, exploiting the available performance of heterogeneous architectures may be challenging. There are various parallel programming frameworks (such as OpenMP, […]

CUDA • OpenCL

Apr, 20

Exploration of cyber-physical systems for GPGPU computer vision-based detection of biological viruses

This work presents a method for computer vision-based detection of biological viruses in PAMONO sensor images and, related to this, methods to explore cyber-physical systems such as those consisting of the PAMONO sensor, the detection software, and processing hardware. The focus is especially on an exploration of Graphics Processing Unit (GPU) hardware for "General-Purpose […]

OpenCL

Apr, 20

A hybrid CPU-GPU parallelization scheme of variable neighborhood search for inventory optimization problems

In this paper, we study various parallelization schemes for the Variable Neighborhood Search (VNS) metaheuristic on a CPU-GPU system via OpenMP and OpenACC. A hybrid parallel VNS method is applied to recent benchmark problem instances for the multi-product dynamic lot sizing problem with product returns and recovery, which appears in reverse logistics and is known […]

Apr, 20

Evaluation of GPU-based track-triggering for the CMS detector at CERN’s HL-LHC

In this work we present an evaluation of GPUs as a possible L1 Track Trigger for the High Luminosity LHC, effective after Long Shutdown 3 around 2025. The novelty lies in presenting an implementation based on calculations done entirely in software, in contrast to currently discussed solutions relying on specialized hardware, such as FPGAs and […]

CUDA

* * *


Recent source codes

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

CuFuzz: An API-Knowledge-Graph Coverage-Driven Fuzzing Framework for CUDA Libraries

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

Probe-and-Refine Tuning of Repository Guidance for AI Coding Agents

CUDAnalyst (CUDA + Analyst)

CodegenBench

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

CUDA Kernel Fusion Benchmarks

IntelliKit: Agent-first tooling for AMD hardware

DITRON: Distributed Compiler based on Triton for Parallel Systems
