Posts

Jul, 10

Implementation of stereophonic acoustic echo canceller on nVIDIA GeForce graphics processing unit

This paper presents an implementation of a stereophonic acoustic echo canceller on an nVIDIA GeForce graphics processor using the CUDA software development environment. For efficiency, fast shared memory has been used as much as possible. A tree adder is introduced to reduce the cost of summing up thread outputs. The performance evaluation results suggest that even a […]
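
The tree adder mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' CUDA code: the function name `tree_sum` is hypothetical, and a plain Python list stands in for the shared-memory buffer. It only shows the pairwise summation order, which needs about log2(n) steps instead of n - 1 sequential additions.

```python
def tree_sum(values):
    """Sum values pairwise in about log2(n) steps (hypothetical helper)."""
    buf = list(values)  # stands in for a per-block shared-memory buffer
    n = len(buf)
    if n == 0:
        return 0.0

    # Pad with zeros up to the next power of two so every step halves cleanly.
    size = 1
    while size < n:
        size *= 2
    buf.extend([0.0] * (size - n))

    stride = size // 2
    while stride > 0:
        # On a GPU, each index i would be handled by one thread, with a
        # barrier between steps; here the inner loop runs sequentially.
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


if __name__ == "__main__":
    print(tree_sum([1, 2, 3, 4, 5]))
```

In a real kernel the inner loop disappears, because all additions within one step are independent and can run in parallel.
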
Jul, 10

Hardware-assisted visibility sorting for unstructured volume rendering

Harvesting the power of modern graphics hardware to solve the complex problem of real-time rendering of large unstructured meshes is a major research goal in the volume visualization community. While, for regular grids, texture-based techniques are well-suited for current GPUs, the steps necessary for rendering unstructured meshes are not so easily mapped to current hardware. […]
Jul, 10

FPGA and GPU implementation of large scale SpMV

Sparse matrix-vector multiplication (SpMV) is a fundamental operation for many applications. Many studies have implemented SpMV on different platforms, but few have focused on very large-scale datasets with millions of dimensions. This paper addresses the challenges of implementing large-scale SpMV with FPGA and GPU in the application of […]
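
For readers unfamiliar with the operation, here is a minimal reference sketch of SpMV. The abstract does not say which storage format the paper uses; the compressed sparse row (CSR) layout and the names `spmv_csr`, `row_ptr`, `col_idx`, and `vals` are assumptions made for illustration only.

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Nonzeros of this row occupy vals[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[row] = acc
    return y


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0]]
    row_ptr = [0, 2, 3]
    col_idx = [0, 2, 1]
    vals = [1.0, 2.0, 3.0]
    print(spmv_csr(row_ptr, col_idx, vals, [1.0, 2.0, 3.0]))
```

Each row's dot product is independent of the others, which is what makes the operation a natural fit for parallel hardware; the irregular reads of `x` are the usual difficulty.
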
Jul, 9

Implementation of usual computerized tomography methods on GPU using the Compute Unified Device Architecture (CUDA)

CUDA (Compute Unified Device Architecture) is an efficient architecture developed by NVIDIA to run parallel algorithms on the Graphics Processing Unit (GPU). Using the API associated with this architecture, we develop fast parallel algorithms to compute standard methods for computerized tomography. Computation times are compared with those of similar CPU implementations to illustrate the efficiency of […]
Jul, 9

GPU implementation of volume reconstruction and object detection in Digital Holographic Microscopy

Using Digital Holographic Microscopy (DHM), we can gather information from a whole volume and thus avoid the small depth-of-field constraint of conventional microscopes. In this way a volume inspection system can be constructed that is capable of finding, segmenting, collecting, and later classifying the objects that flow through an inspection chamber. […]
Jul, 9

Power and Performance Characterization of Computational Kernels on the GPU

Graphics Processing Units (GPUs) are gaining popularity in high performance computing (HPC). While modern GPUs can offer much more computational power than CPUs, they also consume much more power. Energy efficiency is one of the most important factors that will affect broader adoption of GPUs in HPC. In this paper, we systematically […]
Jul, 9

The use of overlapping subgrids to accelerate the FDTD on GPU devices

The Finite Difference Time Domain (FDTD) method is widely used in electromagnetic simulations to solve problems in microwave tomography, radar, and telecommunications. Because this method is both data-intensive and computation-intensive, there are many initiatives to improve the scalability and performance of FDTD. Despite the progress, performance in FDTD […]
Jul, 9

Accelerating data clustering on GPU-based clusters under shared memory abstraction

Many-core graphics processors today play an important role in the advancement of modern, highly concurrent processors. Their ability to accelerate computation is being explored in several scientific fields. In this paper we present the acceleration of a widely used data clustering algorithm, K-means, in the context of high performance GPU clusters. As opposed […]
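
As background on the algorithm named in this abstract, the following is a minimal sequential sketch of one K-means iteration (assign each point to its nearest centroid, then recompute the centroids). It does not reflect the paper's GPU-cluster implementation; the function name `kmeans_step` and the choice to keep an empty cluster's centroid unchanged are assumptions.

```python
def kmeans_step(points, centroids):
    """Run one assignment + update step of K-means (hypothetical helper)."""
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    labels = []

    for p in points:
        # Assignment: pick the centroid with the smallest squared distance.
        best = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        labels.append(best)
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]

    # Update: each centroid becomes the mean of its assigned points.
    # Assumption: a cluster with no points keeps its previous centroid.
    new_centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else list(centroids[c])
        for c in range(k)
    ]
    return labels, new_centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_step(pts, [(0.0, 0.0), (10.0, 10.0)]))
```

The assignment loop treats every point independently, which is the part typically offloaded to the GPU; the centroid update is a reduction over those results.
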
Jul, 9

Numerical Parallel Processing Based on GPU with CUDA Architecture

The modern graphics processing unit (GPU) offers programmability, a favorable price/performance ratio, and high speed, and it is well suited to parallel computation. Based on this, the article studies the general method of GPU computing and uses the compute unified device architecture (CUDA) to design a new parallel algorithm to accelerate the […]
Jul, 9

A massively parallel implementation of QC-LDPC decoder on GPU

The graphics processing unit (GPU) can provide a low-cost and flexible software-based multi-core architecture for high performance computing. However, it is still very challenging to map real-world applications to the GPU efficiently and to fully utilize its computational power. As a case study, we present a GPU-based implementation of a real-world digital […]
Jul, 9

Utilization of GPU for real-time vision in robotics

The paper focuses on the part of the FraDIA vision subsystem responsible for GPU-based image processing. The developed set of classes encapsulates the OpenCL subroutines and uses the GPU to meet the robotics requirements for real-time visual data processing. The class structure reflects the proposed classification of image processing algorithms.
Jul, 9

GPU volume rendering in 3D echocardiography: Real-time pre-processing and ray-casting

Since real-time acquisition of 3D echocardiographic data is achievable in practice, many volume rendering algorithms have been proposed for visualization purposes. However, due to the large amounts of data and computation involved, a tradeoff between image quality and computational efficiency has to be made. The main goal of our study was to generate high quality […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
