high performance computing on graphics processing units: hgpu.org

Posts

Oct, 9

Computer Vision Models in Surveillance Robotics

In this thesis, we developed algorithms that use visual information to automatically perform, in real time, detection, recognition and categorisation of moving objects independently of the environmental conditions and with the best accuracy. To this end, we built upon several concepts of computer vision, namely the identification of the objects of interest in the whole […]

CUDA

Oct, 9

Real time ultrasound image denoising

Image denoising is the process of removing the noise that perturbs image analysis methods. In some applications like segmentation or registration, denoising is intended to smooth homogeneous areas while preserving the contours. In many applications like video analysis, visual servoing or image-guided surgical interventions, real-time denoising is required. This paper presents a method for real-time […]

CUDA

Oct, 9

Parallel and efficient Boolean on polygonal solids

We present a novel framework which can efficiently evaluate approximate Boolean set operations for B-rep models by highly parallel algorithms. This is achieved by taking axis-aligned surfels of Layered Depth Images (LDI) as a bridge and performing Boolean operations on the structured points. Compared with prior surfel-based approaches, this paper makes substantial improvements. Firstly, […]

CUDA

Oct, 9

Molecular dynamics simulation of UO2 nanocrystals melting

In this article we study melting of uranium dioxide (UO2) nanocrystals (NC) isolated in vacuum (i.e. non-periodic boundary conditions) using molecular dynamics (MD) in the approximation of pair potentials and rigid ions. We calculate the size dependence of the temperature and heat of melting, the density jump for crystals of cubic shape and volumes up […]

CUDA

Oct, 9

Acceleration of computation speed for elastic wave simulation using a Graphic Processing Unit

Numerical simulation in exploration geophysics provides important insights into subsurface wave propagation phenomena. Although elastic wave simulations take longer to compute than acoustic simulations, an elastic simulator can construct more realistic wavefields including shear components. Therefore, it is suitable for exploration of the responses of elastic bodies. To overcome the long duration of the calculations, […]

CUDA

Oct, 8

Analysis of 3-dimensional electromagnetic fields in dispersive media using CUDA

This research presents the implementation of the Finite-Difference Time-Domain (FDTD) method for the solution of 3-dimensional electromagnetic problems in dispersive media using Graphics Processor Units (GPUs). By using the newly introduced CUDA technology, we illustrate the efficacy of GPUs in accelerating the FDTD computations by achieving appreciable speedup factors with great ease and at no […]

CUDA

Oct, 8

Performance improvements for iterative electron tomography reconstruction using graphics processing units (GPUs)

Iterative reconstruction algorithms are becoming increasingly important in electron tomography of biological samples. These algorithms, however, impose major computational demands. Parallelization must be employed to maintain acceptable running times. Graphics Processing Units (GPUs) have been demonstrated to be highly cost-effective for carrying out these computations with a high degree of parallelism. In a recent paper […]

Oct, 8

Programming framework for clusters with heterogeneous accelerators

We describe a programming framework for high performance clusters with various hardware accelerators. In this framework, users can utilize the available heterogeneous resources productively and efficiently. The distributed application is highly modularized to support dynamic system configuration with changing types and number of the accelerators. Multiple layers of communication interface are introduced to reduce the […]

Oct, 8

Astrophysical particle simulations with large custom GPU clusters on three continents

We present direct astrophysical N-body simulations with up to six million bodies using our parallel MPI-CUDA code on large GPU clusters in Beijing, Berkeley, and Heidelberg, with different kinds of GPU hardware. The clusters are linked in the cooperation of ICCS (International Center for Computational Science). We reach about one third of the peak performance […]

Oct, 8

Efficient reconfigurable design for pricing Asian options

Arithmetic Asian options are financial derivatives which have the feature of path-dependency: they depend on the entire price path of the underlying asset, rather than just the instantaneous price. This path-dependency makes them difficult to price, as only computationally intensive Monte-Carlo methods can provide accurate prices. This paper proposes an FPGA-accelerated Asian option pricing solution, […]

CUDA
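
The path-dependency described in the abstract above can be illustrated with a minimal Monte Carlo sketch. This is not the paper's FPGA design: it is a plain CPU example that assumes the underlying asset follows geometric Brownian motion and that the option is an arithmetic-average call. The function name and all parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def price_arithmetic_asian_call(s0, strike, rate, sigma, maturity,
                                n_steps, n_paths, seed=0):
    """Monte Carlo estimate of an arithmetic-average Asian call.

    Assumption (not from the source): the asset follows geometric
    Brownian motion, sampled at n_steps equally spaced times.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)

    payoff_sum = 0.0
    for _ in range(n_paths):
        price = s0
        running_sum = 0.0
        # The payoff depends on every step of the path, not just the last one.
        for _ in range(n_steps):
            price *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            running_sum += price
        average = running_sum / n_steps
        payoff_sum += max(average - strike, 0.0)

    # Discount the mean payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths
```

Each simulated path is independent of the others, which is why this kind of workload maps naturally onto parallel hardware such as FPGAs or GPUs.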

Oct, 7

Multifrontal computations on GPUs and their multi-core hosts

The use of GPUs to accelerate the factoring of large sparse symmetric matrices shows the potential of yielding important benefits to a large group of widely used applications. This paper examines how a multifrontal sparse solver performs when exploiting both the GPU and its multi-core host. It demonstrates that the GPU can dramatically accelerate the […]

CUDA

Oct, 7

Non-recursive beam search on GPU for formal concept analysis

We document a parallel non-recursive beam search GPGPU FCA CbO-like algorithm written in NVIDIA CUDA C and test it on software module dependency graphs. Despite removing repeated calculations and optimising data structures and kernels, we do not yet see major speedups. Instead, GeForce 295 GTX and Tesla C2050 report 141072 concepts (maximal rectangles, […]

CUDA

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler
