Posts
Oct, 3
Transformation of CPU-based Applications To Leverage on Graphics Processors using CUDA
Scientific computation requires a great deal of computing power, especially for floating-point operations, but even a high-end multi-core processor is currently limited in floating-point performance and parallelization. Recent technological advances have made parallel computing technically and financially feasible using the Compute Unified Device Architecture (CUDA) developed by NVIDIA. This research focuses on measuring […]
Oct, 3
Parallel Game Tree Search Using GPU
The parallel performance of graphics cards in desktop computers generally exceeds that of conventional processors. The purpose of this paper is to identify possibilities for task parallelization when searching and evaluating game trees, and to propose algorithms that would perform better on the SIMD processors of graphics cards than on regular desktop processors. On the basis of the proposed algorithms […]
Oct, 3
Implementation of the optimization algorithms on GPGPU architecture and multi-cores
This bibliography study mainly synthesizes the key ideas of parallel architectures and neural network models, and discusses the algorithm design methods that will be used on GPGPUs and multi-core processors to realize the optimizations. Since neural network computational models are regarded as valuable tools for solving many scientific and practical problems, and it […]
Oct, 3
GPU-Accelerated DNA Distance Matrix Computation
Distance matrix calculation used in phylogeny analysis is computationally intensive. Growing sequence data sets necessitate fast computation methods. This paper accelerates Felsenstein’s DNADIST program by using OpenCL to exploit the computational capability of graphics cards. The GPU-accelerated DNADIST program achieves more than 12-fold speedup over the serial CPU program on a personal workstation […]
Oct, 3
Parallel SAT-Solving with OpenCL
In the last few decades there have been substantial improvements in approaches for solving the Boolean satisfiability problem. Many of these improvements consisted in elaborating on existing algorithms. On the side of the complete solvers this led to more efficient branching heuristics and the use of watched literals for unit propagation; incomplete solvers on the […]
Oct, 3
Heterogeneous Computing with OpenCL
Heterogeneous Computing with OpenCL teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs) such as AMD Fusion technology. Designed to work on multiple platforms and with wide industry support, OpenCL will help you more effectively program for a heterogeneous […]
Oct, 3
An OpenCL Fast Fourier Transformation
This paper describes a strategy in preparation for implementing an OpenCL FFT. The two factors most crucial to obtaining high performance from an FFT implementation on a GPU, memory bandwidth and locality, are highlighted. Theoretical upper bounds for performance in terms of the locality factor are derived. An implementation […]
Oct, 3
Realtime Computation of a VST Audio Effect Plugin on the Graphics Processor
A plugin system for GPGPU real-time audio effect calculation on the graphics processing unit of the computer system is presented. The prototype application is the rendering of mono audio material with head-related transfer functions (HRTFs) to create the impression of a sound source located in a certain direction relative to the listener’s head. The […]
Oct, 3
Towards robust automatic detection of vulnerable road users: monocular pedestrian tracking from a moving vehicle
In this paper we present steps towards the automatic detection of vulnerable road users in video. Such a system can be used, for example, as an automatic blind spot camera for trucks. The aim of the system is to automatically warn the driver when the algorithm detects vulnerable road users in the camera images. Such an […]
Oct, 3
An Auto-tuning Solution to Data Streams Clustering in OpenCL
Due to its applicability to numerous types of data, including telephone records, web documents, and click streams, the data stream model has recently attracted attention. For analysis of such data, it is crucial to process the data in a single pass, or a small number of passes, using little memory. This paper provides an OpenCL […]
Oct, 2
A New Class of Parallel Scheduling Algorithms
The main issue discussed in this book is solving job scheduling problems in parallel computing environments, such as multiprocessor computers, clusters, or distributed calculation nodes in networks, by applying algorithms that use various parallelization technologies, from multiple calculation threads (multithreading) up to distributed calculation processes. The strongly sequential character of the scheduling […]
Oct, 2
Hardware/Software Co-design for Energy-Efficient Seismic Modeling
Reverse Time Migration (RTM) has become the standard for high-quality imaging in the seismic industry. RTM relies on PDE solutions using stencils that are 8th order or larger, which require large-scale HPC clusters to meet the computational demands. However, the rising power consumption of conventional cluster technology has prompted investigation of architectural alternatives that offer […]