Posts

Oct, 4

Berkeley Dwarfs on CUDA

Graphics processing units (GPUs) have greatly improved their performance over the last ten years. The first graphics cards were developed in the late 1990s and were targeted at the mass market. These first cards were special-purpose hardware, designed to accelerate the graphics processing required in computer games. As the interest in computer games continued, GPU […]
Oct, 4

Comparing Parallel Simulation of Social Agents using Cilk and OpenCL

Recent advances in wireless/mobile communication and body-worn sensors, together with ambient intelligence and seamlessly integrated pervasive technology, have paved the way for applications operating based on social signals, i.e., sensing and processing of group behavior, interpersonal relationships, or emotions. Thinking at a larger scale, it should be apparent that modeling social systems allowing to study […]
Oct, 4

Optimization of the Gaussian Mixture Model Evaluation on GPU

In this paper we present a highly optimized implementation of the Gaussian mixture acoustic model evaluation algorithm. Evaluation of these likelihoods is one of the most computationally intensive parts of automatic speech recognizers, but it can be well parallelized and offloaded to GPU devices. Our approach offers significant speed-up compared to the recently published approaches, since it […]
Oct, 4

Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance raytracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We […]
Oct, 3

Transformation of CPU-based Applications To Leverage on Graphics Processors using CUDA

Scientific computation requires a great amount of computing power, especially in floating-point operations, but a high-end multi-core processor is currently limited in terms of floating-point performance and parallelization. Recent technological advancement has made parallel computing technically and financially feasible using Compute Unified Device Architecture (CUDA) developed by NVIDIA. This research focuses on measuring […]
Oct, 3

Parallel Game Tree Search Using GPU

The parallel performance of graphics cards in desktop computers generally exceeds the performance of conventional processors. The purpose of this paper is to identify possibilities for task parallelization when searching and evaluating game trees and to propose algorithms that would perform better on SIMD processors of graphics cards than on regular desktop processors. On proposed algorithms’ basis […]
Oct, 3

Implementation of the optimization algorithms on GPGPU architecture and multi-cores

This bibliographic study mainly synthesizes the key ideas of parallel architectures and neural network models, and discusses the implementation algorithm design methods that will be used on GPGPUs and multi-cores to realize the optimizations. Since the neural network computational models are regarded as valuable tools to solve many scientific and practical problems, and it […]
Oct, 3

GPU-Accelerated DNA Distance Matrix Computation

Distance matrix calculation used in phylogeny analysis is computationally intensive. Growing sequence data sets necessitate fast computation methods. This paper accelerates Felsenstein’s DNADIST program by using OpenCL to exploit the great computational capability of graphics cards. The GPU-accelerated DNADIST program achieves more than 12-fold speedup over the serial CPU program on a personal workstation […]
Oct, 3

Parallel SAT-Solving with OpenCL

In the last few decades there have been substantial improvements in approaches for solving the Boolean satisfiability problem. Many of these improvements consisted in elaborating on existing algorithms. On the side of the complete solvers this led to more efficient branching heuristics and the use of watched literals for unit propagation; incomplete solvers on the […]
Oct, 3

Heterogeneous Computing with OpenCL

Heterogeneous Computing with OpenCL teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs) such as AMD Fusion technology. Designed to work on multiple platforms and with wide industry support, OpenCL will help you more effectively program for a heterogeneous […]
Oct, 3

An OpenCL Fast Fourier Transformation

This paper describes an implementation strategy in preparation for an implementation of an OpenCL FFT. The two most essential factors (memory bandwidth and locality) that are crucial to obtain high performance on a GPU for an FFT implementation are highlighted. Theoretical upper bounds for performance in terms of the locality factor are derived. An implementation […]
Oct, 3

Realtime Computation of a VST Audio Effect Plugin on the Graphics Processor

A plugin system for real-time GPGPU audio effect calculation on the graphics processing unit of the computer system is presented. The prototype application is the rendering of mono audio material with head-related transfer functions (HRTFs) to create the impression of a sound source located in a certain direction relative to the listener’s head. The […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
