high performance computing on graphics processing units: hgpu.org

Posts

Oct, 15

Parallel Programming using OpenCL on Modern Architectures

This report is intended as a quick introduction to the OpenCL framework, and the aim is to facilitate a smooth transition to the use of OpenCL C for developers with previous GPGPU experience. The purpose of OpenCL is to allow developers to use all compute resources available on a heterogeneous hardware platform. As well as […]

OpenCL

Oct, 5

Is the game worth the candle? Evaluation of OpenCL for object detection algorithm optimization

In this paper we present our experiences with the implementation of an object detector using OpenCL. With this implementation we fulfil the need for fast and robust object detection, necessary in many applications in multiple domains (surveillance, traffic, image retrieval, …). The algorithm lends itself to parallel implementation. We exploit this […]

OpenCL

Oct, 1

Accelerated Pressure Projection using OpenCL on GPUs

A GPU version of the pressure projection solver is implemented using OpenCL and compared with a CPU version accelerated with OpenMP. The GPU version shows a sensible reduction in time despite using a simple algorithm in the kernel. The final code is plugged into a commercial fluid simulator software. Different kinds […]

OpenCL

Sep, 30

ARVO-CL: The OpenCL version of the ARVO package – An efficient tool for computing the accessible surface area and the excluded volume of proteins via analytical equations

The introduction of Graphical Processing Units (GPUs) and of computing using GPUs in recent years has opened possibilities for simple parallelization of programs. In this update, we present the modernized version of program ARVO [J. Busa, J. Dzurina, E. Hayryan, S. Hayryan, C.-K. Hu, J. Plavka, I. Pokorny, J. Skivanek, M.-C. Wu, Comput. Phys. Comm. 165 (2005) 59]. […]

OpenCL

Sep, 27

Lattice QCD based on OpenCL

We present an OpenCL-based Lattice QCD application using a heatbath algorithm for the pure gauge case and Wilson fermions in the twisted mass formulation. The implementation is platform independent and can be used on AMD or NVIDIA GPUs, as well as on classical CPUs. On the AMD Radeon HD 5870 our double precision dslash implementation […]

OpenCL

Sep, 26

Enabling Development of OpenCL Applications on FPGA platforms

FPGAs can potentially deliver tremendous acceleration in high-performance server and embedded computing applications. Whether used to augment a processor or as a stand-alone device, these reconfigurable architectures are being deployed in a large number of implementations owing to the massive amounts of parallelism offered. At the same time, a significant challenge encountered in their wide-spread […]

OpenCL

Sep, 24

Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL

We present an implementation of a Vlasov-Maxwell solver for multicore processors. The Vlasov equation describes the evolution of charged particles in an electromagnetic field, solution of the Maxwell equations. The Vlasov equation is solved by a Particle-In-Cell method (PIC), while the Maxwell system is computed by a Discontinuous Galerkin method. We use the OpenCL framework, […]

OpenCL

Sep, 22

Modification of self-organizing migration algorithm for OpenCL framework

This paper deals with a modification of the self-organizing migration algorithm using the OpenCL framework. This modification allows the algorithm to exploit modern parallel devices, such as central processing units and graphics processing units. The main aim was to create an algorithm that shows significant speedup compared to the sequential variant. The second aim was to create the algorithm robust […]

OpenCL

Sep, 20

GPU-Acceleration of Linear Algebra using OpenCL

In this report we’ve created a linear algebra API using OpenCL, for use with MATLAB. We’ve demonstrated that the individual linear algebra components can be faster when using the GPU as compared to the CPU. We found that the API is heavily memory bound, but still faster than MATLAB in our test case. The API components […]

OpenCL

Sep, 17

Parallel hybrid SAT solving using OpenCL

In the last few decades there have been substantial improvements in approaches for solving the Boolean satisfiability problem. Many of these consisted in elaborating on existing algorithms, both for complete solvers and for incomplete solvers. Besides the improvements to existing solving methods, however, recent evolutions in SAT solving take […]

OpenCL

Sep, 8

Lost in Translation: Challenges in Automating CUDA-to-OpenCL Translation

The use of accelerators in high-performance computing is increasing. The most commonly used accelerator is the graphics processing unit (GPU) because of its low cost and massively parallel performance. The two most common programming environments for GPU accelerators are CUDA and OpenCL. While CUDA runs natively only on NVIDIA GPUs, OpenCL is an open standard […]

CUDA

•

OpenCL

Aug, 26

Is OpenCL a suitable platform for algorithm development in health care systems?

This thesis reviews if OpenCL is a suitable and cost effective platform for algorithm development in health care systems. Aspects such as maintainability, performance, portability and integration with high-level languages (in this case Python) are analyzed. The review is done by implementing one part of a dose calculation algorithm that is complex enough to provide […]

OpenCL
