Posts

Jul, 13

Optimized MFCC Feature Extraction on GPU

In this paper, we update our previous research on Mel-Frequency Cepstral Coefficient (MFCC) feature extraction [1] and describe the optimizations required to improve throughput on Graphics Processing Units (GPUs). We not only demonstrate that the feature extraction process is suitable for GPUs and that a substantial reduction in computation time can be obtained by performing […]
Jul, 13

GPU Simulation of Radiation in Matter

Parallel programming on GPUs is introduced in the context of simulating collision energy loss and bremsstrahlung for charged particles propagating in matter. The employed Monte Carlo methods and the involved physics are presented, followed by an introduction to the concepts of GPU parallel programming for the Nvidia CUDA architecture. The simulations implemented in C++ and […]
Jul, 13

An acceleration of the algorithm for the nurse rerostering problem on a graphics processing unit

This paper deals with the Nurse Rerostering Problem (NRRP), solved by a parallel algorithm on a Graphics Processing Unit (GPU). The problem concerns the rescheduling of human resources in healthcare when a roster is disrupted by unexpected circumstances. Our aim is to solve the NRRP in parallel to shorten the required computational time […]
Jul, 12

Parallel Graph Processing on Graphics Processors Made Easy

This paper demonstrates Medusa, a programming framework for parallel graph processing on graphics processors (GPUs). Medusa enables developers to leverage the massive parallelism and other hardware features of GPUs by writing sequential C/C++ code for a small set of APIs. This simplifies the implementation of parallel graph processing on the GPU. The runtime system of […]
Jul, 12

OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures

Driven by the rapid hardware development of parallel CPU/GPU architectures, we have witnessed emerging relational query processing techniques and implementations on those parallel architectures. However, most of those implementations are not portable across different architectures, because they are usually developed from scratch and target a specific architecture. This paper proposes a kernel-adapter based design […]
Jul, 12

Fast PCA-Based Face Recognition on GPUs

Face recognition is very important in many applications including surveillance, biometrics, and other domains. Fast face recognition is required if one wants to train or test on more images or to increase the resolution of an input image for better recognition accuracy. Meanwhile, Graphics Processing Units (GPUs) have become widely available, offering the opportunity […]
Jul, 12

Hidden Surface Removal Using BSP Tree with CUDA

A Binary Space Partitioning (BSP) Tree can be used for hidden surface removal. To hide invisible surfaces, all surfaces are sorted in back-to-front or front-to-back order. Traversing a BSP Tree to obtain a back-to-front ordering of faces requires a calculation at every BSP Tree node, and these calculations can be performed in parallel. […]
Jul, 12

A comparison of period finding algorithms

This paper presents a comparison of popular period finding algorithms applied to the light curves of variable stars from the Catalina Real-time Transient Survey (CRTS), MACHO and ASAS data sets. We analyze the accuracy of the methods against magnitude, sampling rates, quoted period, quality measures (signal-to-noise and number of observations), variability, and object classes. We […]
Jul, 12

Design Space Exploration of Real-time Bedside and Portable Medical Ultrasound Adaptive Beamformer Acceleration

This work explores the design considerations for real-time medical ultrasound adaptive beamformer implementations on different computing platforms: CPU, GPU and FPGA. Adaptive beamforming is widely regarded as an advanced solution for improving the image quality of medical ultrasound imaging machines. Although it provides promising improvements in lateral resolution, image contrast and imaging penetration […]
Jul, 12

Feature Tracking in Time-Varying Volumetric Data through Scale Invariant Feature Transform

Recent advances in medical imaging technology enable dynamic acquisitions of objects under movement. The acquired dynamic data has been shown to be useful in different application scenarios. However, the vast amount of time-varying data puts a great demand on robust and efficient algorithms for extracting and interpreting the underlying information. In this paper, we present a […]
Jul, 12

A GPGPU-based Pipeline for Accelerated Rendering of Point Clouds

Direct rendering of large point clouds has become common practice in architecture and archaeology in recent years. Due to the high point density, no meshes are reconstructed from the scanning data; instead, the points are rendered directly as primitives of a graphics API like OpenGL. However, these APIs and the hardware, which they are […]
Jul, 12

SIMD Divergence Optimization through Intra-Warp Compaction

SIMD execution units in GPUs are increasingly used for high performance and energy efficient acceleration of general purpose applications. However, SIMD control flow divergence effects can result in reduced execution efficiency in a class of GPGPU applications, classified as divergent applications. Improving SIMD efficiency, therefore, has the potential to bring significant performance and energy benefits […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org