high performance computing on graphics processing units: hgpu.org

Posts

Apr, 2

A Novel Open Source Morphology Using GPU Processing With LTU-CUDA

Mathematical morphology is used as a tool for extracting image components that are useful in the representation and description of region shape. The mathematical morphology operations of dilation, erosion, opening, and closing are important building blocks of many other image processing algorithms. Data-parallel programming provides an opportunity for performance acceleration using highly […]

CUDA
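
As a rough illustration of the four operations named in the abstract above, here is a minimal sequential sketch in plain Python. It is not the LTU-CUDA implementation; the function names, the square structuring element, and the choice to clip the neighbourhood at the image border are all assumptions made for this example.

```python
def _neighbourhood(img, y, x, k):
    """Yield pixel values in the (2k+1)x(2k+1) window around (y, x), clipped at the border."""
    h, w = len(img), len(img[0])
    for yy in range(max(0, y - k), min(h, y + k + 1)):
        for xx in range(max(0, x - k), min(w, x + k + 1)):
            yield img[yy][xx]


def dilate(img, k=1):
    """Dilation: each output pixel is the maximum over its neighbourhood."""
    return [[max(_neighbourhood(img, y, x, k)) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode(img, k=1):
    """Erosion: each output pixel is the minimum over its neighbourhood."""
    return [[min(_neighbourhood(img, y, x, k)) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img, k=1):
    """Opening: erosion followed by dilation (removes small bright specks)."""
    return dilate(erode(img, k), k)


def closing(img, k=1):
    """Closing: dilation followed by erosion (fills small dark holes)."""
    return erode(dilate(img, k), k)
```

Every output pixel depends only on a small window of the input, which is why these operations map naturally onto data-parallel hardware: one thread per output pixel is the obvious decomposition.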

Apr, 2

2014 5th International Conference on Networking and Information Technology, ICNIT 2014

2014-09-20. Accepted papers will be published in one of the following journals with ISSN: International Journal of Computer and Communication Engineering (IJCCE, ISSN: 2010-3743); Journal of Advances in Computer Networks (JACN, ISSN: 1793-8244); Journal of Communications (ISSN: 1796-2021). Topics include: 3G & 4G Mobile Communication Services; Agents and Multi-Agents systems for ICT; Integrated Circuits for Communications […]

Apr, 1

GPU Based Performance Acceleration of Radar Imaging Algorithms

We consider the performance acceleration of the conventional Time Domain Backprojection and Kirchhoff Migration algorithms for imaging concealed targets. The Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL) are used here for accelerating these algorithms on Graphics Processing Units (GPUs). Data generated by means of analytical methods, simulation and experiment are used for […]

CUDA • OpenCL

Apr, 1

Enhanced Parallel NegaMax Tree Search Algorithm on GPU

Parallel performance of GPUs today surpasses that of traditional multi-core CPUs. Many AI algorithms have started to be tested on GPUs rather than CPUs, especially after the release of libraries such as CUDA and OpenCL that allow the implementation of general algorithms on the GPU. One of the most famous game tree search algorithms is Negamax, […]

CUDA • OpenCL
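
For readers unfamiliar with Negamax, the basic sequential form can be sketched in a few lines of Python. This is not the paper's parallel GPU variant. The tree encoding is an assumption made for this example: an internal node is a list of children, and a leaf is a number giving the score from the point of view of the player to move at that leaf.

```python
def negamax(node):
    """Return the value of `node` for the player to move at that node.

    Hypothetical encoding: a leaf is an int or float, an internal node is a
    non-empty list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    # A position is worth the best of the negated child values, because
    # what is good for the opponent is equally bad for the current player.
    return max(-negamax(child) for child in node)


# Two plies: the root player picks a branch, then the opponent picks a leaf.
tree = [[3, 5], [2, 9]]
best = negamax(tree)
```

With an even number of plies the leaves are scored for the root player, so the result matches ordinary minimax on the same tree. The sibling subtrees are evaluated independently of each other, which is the property that parallel variants try to exploit.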

Apr, 1

Code Generation for Embedded Heterogeneous Architectures on Android

The success of Android is based on its unified Java programming model, which allows writing platform-independent programs for a variety of target platforms. However, this comes at the cost of performance. As a consequence, Google introduced APIs that allow developers to write native applications and to exploit multiple cores as well as embedded GPUs […]

Mar, 31

Data Mining Techniques in Parallel and Distributed Environment – A Comprehensive Survey

Distributed sources of voluminous data have raised the need for distributed data mining. Conventional data mining techniques work well on structured data which is clean, pre-processed and properly arranged in the form of structured files, databases or data warehouses. These techniques are based upon a centralised data store; however, they have several limitations in distributed […]

CUDA

Mar, 31

The 22nd Annual International IEEE Symposium on Field Programmable Custom Computing, FCCM 2014

The IEEE Symposium on Field Programmable Custom Computing Machines is the original and premier forum for presenting and discussing new research related to computing that exploits the unique features and capabilities of FPGAs and other reconfigurable hardware. Over the past two decades, FCCM has been the place to present papers on architectures, tools, and programming […]

Mar, 30

Using CUDA Architecture for the Computer Simulation of the Casting Solidification Process

This paper presents a simulation of the casting solidification process performed on graphics processors compatible with nVidia CUDA architecture. The new approach shown in this paper allows the process of matrix building to be divided into two independent phases. The first is independent from the nodal temperature values computed in successive time steps. The second is […]

CUDA

Mar, 30

A New High Performance GPU-based Approach to Prime Numbers Generation

SIMD parallelization is one of the most useful ways of decreasing computation time and increasing the performance of computation-intensive algorithms. To do so, we could execute some processes on several machines using platforms such as MPI and OpenMP, and distribute the workload using message passing and shared memory. One of the […]

CUDA

Mar, 29

Exploiting GPUs to investigate an inversion method that retrieves cardiac conductivities from potential measurements

Accurate cardiac bidomain conductivity values are essential for realistic simulation of various cardiac electrophysiological phenomena. A method was previously developed that can determine the conductivities from measurements of potential on a multi-electrode array placed on the surface of the heart. These conductivities, as well as a value for fibre rotation, are determined using a mathematical […]

CUDA

Mar, 29

Literature review: Build and Travel KD-Tree with CUDA

Ray tracing is an important and widely used tool in computer graphics. The entertainment and game industries have already benefited a lot from ray tracing. However, designers and end-users have long been forced to use off-line ray tracing tools due to the high computation load. In ray tracing, most of the computation is concentrated […]

CUDA

Mar, 29

Hardware/Software Vectorization for Closeness Centrality on Multi-/Many-Core Architectures

Centrality metrics have been shown to be highly correlated with the importance and loads of the nodes in a network. Given the scale of today’s social networks, it is essential to use efficient algorithms and high performance computing techniques for their fast computation. In this work, we exploit hardware and software vectorization in combination with fine-grain […]

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations

microSYCL: SYCL micro-benchmarks repository

Exploring SYCL as a Portability Layer for High-Performance Computing on CPUs

* * *

Recent source codes

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Most viewed papers (last 30 days)