Posts

Dec, 5

Parallel Cosegmentation via Submodular Optimization on Anisotropic Diffusion

With a large number of related images being used for applications such as MR spectroscopy imaging, object-of-interest 3D modelling and photo collages, the need of the hour is to accelerate image cosegmentation algorithms. Cosegmentation refers to the process of segmenting common regions from multiple related images. A novel distributed algorithm, CoSand [1], for cosegmentation […]
Dec, 5

Gauge Field Generation on Large-Scale GPU-Enabled Systems

Over the past years, GPUs have been successfully applied to the task of inverting the fermion matrix in lattice QCD calculations. Even strong scaling to capability-level supercomputers, corresponding to O(100) GPUs or more, has been achieved. However, strong scaling a whole gauge field generation algorithm to this regime requires significantly more functionality than just having […]
Dec, 5

Usage of GPU in LS-DYNA

The increasing computing power of GPUs can be used to improve the performance of CAE systems [1]. Within LS-DYNA an improved direct equation solver can be used, which accelerates the performance of implicit applications by use of a CUDA-based solver [2], [3], [4]. In this paper the performance improvements for different customer input decks for metal […]
Dec, 4

Fast Parallel Sorting Algorithms on GPUs

This paper presents a comparative analysis of three widely used parallel sorting algorithms (OddEven sort, Rank sort and Bitonic sort) in terms of sorting rate, sorting time and speed-up on CPU and different GPU architectures. Alongside, we have implemented a novel parallel algorithm, the min-max butterfly network, for finding the minimum and maximum in large data sets. […]
Dec, 4

gR: A GPU-based Router

With growing internet traffic and the increasing complexity of packet processing tasks, the throughput of routers is affected. Modern routers also need to provide additional services such as security and QoS, which further add to the complexity. These issues can be addressed with the massive parallel computing capability of graphics processors. In this paper, we offload two of […]
Dec, 4

FusionSim: Characterizing the Performance Benefits of Fused CPU/GPU Systems

We present FusionSim, a modeling framework capable of cycle-accurate simulation of a complete x86-based computer system with (a) a CPU and a GPU on the same die, and (b) a CPU and a GPU connected as separate components. We use FusionSim to characterize the performance of the Rodinia benchmarks on fused and discrete systems. We […]
Dec, 4

A MPI back-end for the OpenACC accULL. Exploiting OpenACC semantics in Message Passing Clusters

The irruption of hardware accelerators into the HPC scene has made unprecedented performance available to developers. However, even expert developers may not be ready to exploit the complex hierarchies of these new heterogeneous systems. We need to find a way to leverage the programming effort in these architectures at the programming language level; otherwise, developers will […]
Dec, 4

Molecular dynamics for long-range interacting systems on Graphic Processing Units

We present implementations of a fourth-order symplectic integrator on graphic processing units for three $N$-body models with long-range interactions of general interest: the Hamiltonian Mean Field, Ring and two-dimensional self-gravitating models. We discuss the algorithms, speedups and errors using one and two GPUs. Speedups can be as high as 140 compared to a serial […]
Dec, 3

GPU-Based Implementation of JPEG2000 Encoder

JPEG2000 has become one of the most rewarding image coding standards. It provides a practical set of features which weren’t necessarily available in the previous standards. The features were realized as a result of two new techniques, namely the Discrete Wavelet Transform (DWT), and Embedded Block Coding with Optimized Truncation (EBCOT). The complexity of EBCOT […]
Dec, 3

Hybrid Sample-based Surface Rendering

The performance of rasterization-based rendering on current GPUs strongly depends on the abilities to avoid overdraw and to prevent rendering triangles smaller than the pixel size. Otherwise, the rates at which high-resolution polygon models can be displayed are affected significantly. Instead of trying to build these abilities into the rasterization-based rendering pipeline, we propose an […]
Dec, 3

Simulations of Complex and Microscopic Models of Cardiac Electrophysiology Powered by Multi-GPU Platforms

Key aspects of cardiac electrophysiology, such as slow conduction, conduction block, and saltatory effects have been the research topic of many studies since they are strongly related to cardiac arrhythmia, reentry, fibrillation, or defibrillation. However, to reproduce these phenomena the numerical models need to use subcellular discretization for the solution of the PDEs and nonuniform, […]
Dec, 3

GPU-based Space Situational Awareness Simulation utilising parallelism for enhanced multi-sensor management

As a result of continual space activity since the 1950s, there are now a large number of man-made Resident Space Objects (RSOs) orbiting the Earth. Because of the large number of items and their relative speeds, the possibility of destructive collisions involving important space assets is now of significant concern to users and operators of […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org