
Posts

Jan, 25

Parallel Algorithm Design and Implementation of Regular/Irregular Problems: An In-depth Performance Study on Graphics Processing Units

Recently, interest in the Graphics Processing Unit (GPU) for general-purpose parallel application development and research has grown. Much of the current research on the GPU focuses on the acceleration of regular problems, as irregular problems typically do not provide the same level of performance on the hardware. We explore the potential of the GPU […]
Jan, 25

PyCOOL – a Cosmological Object-Oriented Lattice code written in Python

There are a number of different phenomena in the early universe that have to be studied numerically with lattice simulations. This paper presents a graphics processing unit (GPU) accelerated Python program called PyCOOL that solves the evolution of scalar fields in a lattice with very precise symplectic integrators. The program has been written with the […]
Jan, 25

Realtime scheduling using GPUs – proof of feasibility

This paper reports our evaluation of OpenCL as a platform for hard realtime scheduling. Specifically, we have evaluated which types of tasks are faster on GPGPU than on CPU. We have investigated computational tasks, memory-intensive tasks (especially tasks using low-latency GDDR memory) and disk-intensive tasks. This study is the first […]
Jan, 25

GPU algorithms for comparison-based sorting and for merging based on multiway selection

Sorting and merging are two important kernels which are used as subroutines in numerous algorithms, whose performance depends on the efficiency of these primitives. Databases use sort and merge primitives extensively. Computational biology, search engines, realtime rendering and geographical information systems are other fields where sorting and merging large amounts of data is indispensable. Even […]
Jan, 25

Computational Fluid Dynamics using OpenCL – a Practical Introduction

The main aim of the Computational Fluid Dynamics (CFD) simulations is to reconstruct the reality of fluid motion and behaviour as accurately as possible in order to better understand the natural phenomena under specified conditions. Ideally, general solutions can also cover different scales and geometric configurations. Unfortunately, due to expensive algorithms, classic CFD codes most […]
Jan, 25

Solving Bivariate Polynomial Systems on a GPU

We present a CUDA implementation of dense multivariate polynomial arithmetic based on Fast Fourier Transforms over finite fields. Our core routine computes on the device (GPU) the subresultant chain of two polynomials with respect to a given variable. This subresultant chain is encoded by values on a FFT grid and is manipulated from the host […]
Jan, 24

The GPU Enhanced Parallel Computing for Large Scale Data Clustering

Analyzing and clustering large-scale data sets is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of data clustering is its O(n^2) complexity. As the number of data and feature dimensions grows, it becomes increasingly difficult to generate results […]
Jan, 24

GPApriori: GPU-Accelerated Frequent Itemset Mining

In this paper we describe GPApriori, a GPU-accelerated implementation of Frequent Itemset Mining (FIM). We tested our implementation with an Nvidia Tesla T10 graphics processor and demonstrate up to 100x speedup as compared with several state-of-the-art FIM algorithms on a CPU. In order to map the Apriori algorithm onto the SIMD execution model, […]
Jan, 24

Designing Fast LTL Model Checking Algorithms for Many-Core GPUs

Recent technological developments have made various many-core hardware platforms widely accessible. These massively parallel architectures have been used to significantly accelerate many computationally demanding tasks. In this paper, we show how the algorithms for LTL model checking can be redesigned in order to accelerate LTL model checking on many-core GPU platforms. Our detailed experimental evaluation demonstrates […]
Jan, 24

Real-Time Ultrasound Biomicroscopy with Optoacoustic Arrays

Optical techniques are a promising technology to realize high frequency ultrasound arrays. High sensitivity and broad bandwidth have been demonstrated with optoacoustic sensors based on thin film etalons. A thin film etalon consists of a transparent layer (e.g. photoresist or parylene) with gold coatings on a glass substrate. One-dimensional (1-D) data acquisition is realized by […]
Jan, 24

Real-Time Photon Mapping on GPU

This paper presents a hybrid photon-mapping approach for global illumination. It represents a significant improvement over a previously described approach, both with respect to speed and accuracy. Using OptiX for ray tracing provides a considerable improvement in the speed of ray tracing and would keep synchronization to a minimum by using texture memory to cache […]
Jan, 24

Multipattern String Matching On A GPU

We develop GPU adaptations of the Aho-Corasick string matching algorithm for the case when all data reside initially in the GPU memory and the results are to be left in this memory. We consider several refinements to a base GPU implementation and measure the performance gain from each refinement. Experiments conducted on an NVIDIA […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
