Posts

Nov, 29

Using GPUs for Realtime Prediction of Optical Forces on Microsphere Ensembles

Laser beams can be used to create optical traps that can hold and transport small particles. Optical trapping has been used in a number of applications ranging from prototyping at the microscale to biological cell manipulation. Successfully using optical tweezers requires predicting optical forces on the particle being trapped and transported. Reasonably accurate theory and […]
Nov, 29

Parallel Nonbinary LDPC Decoding on GPU

Nonbinary Low-Density Parity-Check (LDPC) codes are a class of error-correcting codes constructed over the Galois field GF(q) for q > 2. As extensions of binary LDPC codes, nonbinary LDPC codes can provide better error-correcting performance when the code length is short or moderate, but at a cost of higher decoding complexity. This paper proposes a […]
Nov, 29

clOpenCL – Supporting Distributed Heterogeneous Computing in HPC Clusters

Clusters that combine heterogeneous compute device architectures, coupled with novel programming models, have created a true alternative to traditional (homogeneous) cluster computing, making it possible to leverage the performance of parallel applications. In this paper we introduce clOpenCL, a platform that supports the simple deployment and efficient running of OpenCL-based parallel applications that may span several cluster […]
Nov, 29

GPU Ray Tracing – Comparative Study of Ray-Triangle Intersection Algorithms

We present a comparative study of GPU ray tracing implemented for two different types of ray-triangle intersection algorithms used with a BVH (Bounding Volume Hierarchy) spatial data structure and evaluated for performance on three static scenes. We study how the number of triangles placed in a BVH leaf node affects rendering performance. We propose GPU-optimized SIMD ray-triangle intersection […]
Nov, 28

Formal Semantics of Heterogeneous CUDA-C: A Modular Approach with Applications

We extend an off-the-shelf, executable formal semantics of C (Ellison and Rosu’s K Framework semantics) with the core features of CUDA-C. The hybrid CPU/GPU computation model of CUDA-C presents challenges not just for programmers, but also for practitioners of formal methods. Our formal semantics helps expose and clarify these issues. We demonstrate the usefulness of […]
Nov, 28

Applying Contact Angle to a Two-Dimensional Smoothed Particle Hydrodynamics (SPH) model on a Graphics Processing Unit (GPU) Platform

A parallel, GPU-compatible, Lagrangian mesh-free particle solver for multiphase fluid flow based on the SPH scheme is developed and used to capture the interface evolution during droplet impact. Surface tension is modeled by employing the multiphase scheme of Hu et al. (2006). In order to precisely simulate the wetting phenomena, a method based on the […]
Nov, 28

Chest CT automatic analysis for lung nodules detection implemented on a GPU computing system

The aim of this work is the efficient implementation of Hessian-based filters. These filters are commonly used in medical image analysis and are employed in the Voxel Based Neural Approach (VBNA) lung CAD (Computer Aided Detection) system for lung nodule detection. This work mainly focuses on the optimization of the filter devoted to […]
Nov, 28

Short-time Fourier transform laser Doppler holography

We report a demonstration of laser Doppler holography at a sustained acquisition rate of 250 Hz on a 1 Megapixel complementary metal-oxide-semiconductor (CMOS) sensor array and image display at 10 Hz frame rate. The holograms are optically acquired in off-axis configuration, with a frequency-shifted reference beam. Wide-field imaging of optical fluctuations in a 250 Hz […]
Nov, 27

Softshell: Dynamic Scheduling on GPUs

In this paper we present Softshell, a novel execution model for devices composed of multiple processing cores operating in a single instruction, multiple data fashion, such as graphics processing units (GPUs). The Softshell model is intuitive and more flexible than the kernel-based adaptation of the stream processing model, which is currently the dominant model for […]
Nov, 27

From Parallel Programs to Customized Parallel Processors

The need for fast time to market of new embedded processor-based designs calls for a rapid design methodology for the included processors. The need for such a methodology is even more pronounced in the context of so-called soft cores targeted at reconfigurable fabrics, where per-design processor customization is commonplace. The C language has been […]
Nov, 27

Optimizing 3D Convolutions for Wavelet Transforms on CPUs with SSE Units and GPUs

Nanosimulations pose a big HPC challenge, as they bring increasing performance demands in heterogeneous execution environments. In this paper, we present our optimization methodology for BigDFT, a nanosimulation software using Density Functional Theory. We explore autotuning possibilities for BigDFT’s 3D convolutions by studying optimization techniques for several architectures. Namely, we focus on processors with vector […]
Nov, 27

A Data-Parallel Algorithmic Modelica Extension for Efficient Execution on Multi-Core Platforms

New multi-core CPU and GPU architectures promise high computational power at a low cost if suitable computational algorithms can be developed. However, parallel programming for such architectures is usually non-portable, low-level and error-prone. To make the computational power of new multi-core architectures more easily available to Modelica modelers, we have developed the ParModelica algorithmic language […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
