high performance computing on graphics processing units: hgpu.org

Posts

Mar, 27

2014 Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications, HUCAA 2014, in conjunction with ICPP2014

The workshop on Heterogeneous and Unconventional Cluster Architectures and Applications aims to gather recent work on heterogeneous and unconventional cluster architectures and applications that might have a big impact on future cluster architectures. This includes any cluster architecture that is not based on the usual commodity components and therefore makes use of some special hard- […]

Mar, 26

Accelerating GPU Implementation of Contourlet Transform

The widespread use of the contourlet transform (CT) and today’s real-time needs demand faster execution of CT. Solutions are available, but their lack of portability or their computational intensity makes them disadvantageous in real-time applications. In this paper we take advantage of modern GPUs to accelerate CT. The GPU is well suited to data-parallel computation applications […]

CUDA

Mar, 26

A New Parallel Implementation of DSI Based Disparity Computation Using CUDA

Stereo matching techniques are used to extract 3D information from a 2D stereo pair of images. They can be classified into feature-based, window (area)-based, and optimization-based approaches. The feature-based approach generally generates a sparse disparity map with high accuracy and low execution time. The window-based approach produces a dense disparity map with low […]

CUDA

Mar, 25

BigKernel — High Performance CPU-GPU Communication Pipelining for Big Data-style Applications

GPUs offer an order of magnitude higher compute power and memory bandwidth than CPUs. GPUs therefore might appear to be well suited to accelerate computations that operate on voluminous data sets in independent ways; e.g., for transformations, filtering, aggregation, partitioning or other “Big Data” style processing. Yet experience indicates that it is difficult, and often […]

CUDA

Mar, 25

Interpolation with Radial Basis Functions on GPGPUs using CUDA

This report gives a brief introduction to interpolation with radial basis functions and its application to the deformation of computational grids. The FGP algorithm is quoted as an iterative method for the calculation of the interpolation coefficients. A multipole method is described for the efficient approximation of the required matrix-vector product. Results are presented […]

CUDA
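As a rough illustration of the radial basis function interpolation named in the report above, here is a minimal one-dimensional sketch in plain Python. It is not the report's method: the Gaussian kernel, the shape parameter `eps`, and the helper names `gaussian_rbf`, `solve`, `rbf_fit` and `rbf_eval` are all assumptions made for this example, and a direct dense solve stands in for the iterative FGP algorithm and the multipole approximation that the report describes.

```python
import math


def gaussian_rbf(r, eps=1.0):
    """Gaussian kernel phi(r) = exp(-(eps * r)^2); the kernel choice is an assumption."""
    return math.exp(-(eps * r) ** 2)


def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations update both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rbf_fit(centers, values, eps=1.0):
    """Compute interpolation coefficients c from A c = f, with A[i][j] = phi(|x_i - x_j|)."""
    a = [[gaussian_rbf(abs(xi - xj), eps) for xj in centers] for xi in centers]
    return solve(a, values)


def rbf_eval(x, centers, coeffs, eps=1.0):
    """Evaluate the interpolant s(x) = sum_j c_j * phi(|x - x_j|)."""
    return sum(c * gaussian_rbf(abs(x - xc), eps) for c, xc in zip(coeffs, centers))
```

The direct solve costs O(n^3) and the dense matrix needs O(n^2) storage, which is the kind of cost that motivates iterative solvers and fast matrix-vector products for large grids.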

Mar, 25

Distortion correction algorithm for UAV remote sensing image based on CUDA

In China, natural disasters are characterized by wide distribution, severe destruction and high impact range, and they cause significant property damage and casualties every year. Following a disaster, timely and accurate acquisition of geospatial information can provide an important basis for disaster assessment, emergency relief, and reconstruction. In recent years, Unmanned Aerial Vehicle (UAV) remote […]

CUDA

Mar, 24

2014 5th International Conference on Software and Computing Technology, ICSCT 2014

Submission Deadline: 2014-06-10. Publication: All accepted papers that are registered and presented at ICSCT 2014 will be published in WIT Transactions on Information and Communication Technologies (ISSN: 1743-3517), which will be indexed by EI Compendex, Scopus and ISI. Call for Papers: AI and knowledge-based software engineering; Artificial Intelligence; Aspect-orientation and feature interaction; Business Process […]

Mar, 24

The orthorectified technology for UAV aerial remote sensing image based on the Programmable GPU

CUDA

Mar, 24

Demonstrating Self-Learning Algorithm Adaptivity in a Hardware-Oblivious Database Engine

The increasingly heterogeneous modern hardware landscape is forcing database vendors to rethink basic design decisions: With more and more architectures to support, the traditional approach of building on hand-tuned operators might simply become too cost- and labor-intensive. With this problem in mind, we introduced the notion of a hardware-oblivious database engine, which avoids device-specific optimizations […]

OpenCL

Mar, 24

High-Performance Image Synthesis for Radio Interferometry

A radio interferometer indirectly measures the intensity distribution of the sky over the celestial sphere. Since measurements are made over an irregularly sampled Fourier plane, synthesising an intensity image from interferometric measurements requires substantial processing. Furthermore there are distortions that have to be corrected. In this thesis, a new high-performance image synthesis tool (imaging tool) […]

CUDA

Mar, 23

An unsupervised parallel genetic cluster algorithm for graphics processing units

During times of stock market turbulence, monitoring the intraday clustering behaviour of financial instruments allows one to better understand market characteristics and systemic risks. While genetic algorithms provide a versatile methodology for identifying such clusters, serial implementations are computationally intensive and can take a long time to converge to the global optimum. We implement a […]

CUDA

Mar, 23

Computer Vision Accelerators for Mobile Systems based on OpenCL GPGPU Co-Processing

In this paper, we present an OpenCL-based heterogeneous implementation of a computer vision algorithm — image inpainting-based object removal algorithm — on mobile devices. To take advantage of the computation power of the mobile processor, the algorithm workflow is partitioned between the CPU and the GPU based on the profiling results on mobile devices, so […]

OpenCL

