
Posts

Oct, 6

An Efficient GPU Implementation of Modified Discrete Cosine Transform Using CUDA

A new method is presented in this paper for using general purpose programming tools of graphics processing units. It aims to calculate the modified discrete cosine transform in audio coding and compression algorithms for popular audio formats such as MP3, AAC/AC-3, and WMA. The proposed algorithm consists of matrix multiplications that are performed by the […]
Oct, 5

Accelerated protein structure comparison using TM-score-GPU

MOTIVATION: Accurate comparisons of different protein structures play important roles in structural biology, structure prediction and functional annotation. The root-mean-square-deviation (RMSD) after optimal superposition is the predominant measure of similarity due to the ease and speed of computation. However, global RMSD is dependent on the length of the protein and can be dominated by divergent […]
Oct, 5

Using Commodity Coprocessors for Host Intrusion Detection

The ever-rising importance of communication services and devices emphasizes the significance of intrusion detection. Besides general network attacks, private hosts in particular are within the focus of cyber criminals. Private data theft and the integration of individual hosts into large-scale botnets are two common uses of successfully subverted systems. In order to detect […]
Oct, 5

Is the game worth the candle? Evaluation of OpenCL for object detection algorithm optimization

In this paper we present our experiences with the implementation of an object detector using OpenCL. With this implementation we fulfill the need for fast and robust object detection, necessary in many applications in multiple domains (surveillance, traffic, image retrieval, …). The algorithm lends itself to be implemented in a parallel way. We exploit this […]
Oct, 5

An ultrasonic imaging system based on a new SAFT approach and a GPU beamformer

The design of newer ultrasonic imaging systems attempts to obtain low-cost, small-sized devices with reduced power consumption that are capable of reaching high frame rates with high image quality. In this regard, synthetic aperture techniques have been very useful. They reduce hardware requirements and accelerate information capture. However, the beamforming process is still very slow, […]
Oct, 5

Parallel Algorithm of IDCT with GPUs and CUDA for Large-scale Video Quality of 3G

When video is transmitted over 3G networks, the video quality might suffer from impairments caused by packet losses. Extracting video quality features involves a set of algorithms, and the inverse discrete cosine transform is an important algorithm in this field. To improve the performance and be suitable to apply to evaluating the 3G video quality in […]
Oct, 4

Redefining the Role of the CPU in the Era of CPU-GPU Integration

GPU computing has emerged as a viable alternative to CPUs for throughput oriented applications or regions of code. Speedups of 10 to 100x over CPU implementations have been reported. This trend is expected to continue in the future with GPU architectural advances, improved programming support, scaling, and tighter CPU-GPU chip integration. However, not all code […]
Oct, 4

Heterogeneous GPU&CPU cluster for High Performance Computing in cryptography

This paper addresses issues associated with distributed computing systems and the application of mixed GPU&CPU technology to data encryption and decryption algorithms. We describe a heterogeneous cluster HGCC formed by two types of nodes: Intel processor with NVIDIA graphics processing unit and AMD processor with AMD graphics processing unit (formerly ATI), and a novel software […]
Oct, 4

Computing resultants on Graphics Processing Units: Towards GPU-accelerated computer algebra

In this article we report on our experience in computing resultants of bivariate polynomials on Graphics Processing Units (GPU). Following the outline of Collins’ modular approach [6], our algorithm starts by mapping the input polynomials to a finite field for sufficiently many primes m. Next, the GPU algorithm evaluates the polynomials at a number of […]
Oct, 4

Parallel computations on GPU in 3D using the vortex particle method

The paper presented the Vortex in Cell (VIC) method for solving the fluid motion equations in 3D and its implementation for parallel computation in multicore architecture of the Graphics Processing Unit (GPU). One of the most important components of the VIC method algorithm is the solution of the Poisson equation. Multigrid and full multigrid methods were chosen for its solution […]
Oct, 4

Towards accelerating Smoothed Particle Hydrodynamics simulations for free-surface flows on multi-GPU clusters

Starting from the single graphics processing unit (GPU) version of the Smoothed Particle Hydrodynamics (SPH) code DualSPHysics, a multi-GPU SPH program is developed for free-surface flows. The approach is based on a spatial decomposition technique, whereby different portions (sub-domains) of the physical system under study are assigned to different GPUs. Communication between devices is achieved […]
Oct, 3

KERNELGEN – A Toolchain for Automatic GPU-centric Applications Porting

KernelGen is a toolchain for porting existing source code to the GPU that does not involve inserting annotations or manually programming kernels, but instead moves as much of the target source to the GPU as possible, enabling automatic adaptation of large codebases, e.g. numerical models. Separate kernels are generated for parallel loops, and the rest of the […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
