Posts

Mar, 21

Stream Join Processing on Heterogeneous Processors

The window-based stream join is an important operator in all data streaming systems. It often has high resource requirements, so many efficient sequential as well as parallel versions of it have been proposed in the literature. Parallel stream join operators have recently gained increasing interest because hardware is becoming more and more parallel. Most of […]
Mar, 20

Symbolic Crosschecking of Data-Parallel Floating Point Code

In this thesis we present a symbolic execution-based technique for cross-checking programs accelerated using SIMD or OpenCL against an unaccelerated version, as well as a technique for detecting data races in OpenCL programs. Our techniques are implemented in KLEE-CL, a symbolic execution engine based on KLEE that supports symbolic reasoning on the equivalence between expressions […]
Mar, 20

A CUDA-Based Cooperative Evolutionary Multi-Swarm Optimization Applied to Engineering Problems

This paper presents a variation of Evolutionary Particle Swarm Optimization applied to the concept of master/slave swarms with a mechanism for sharing data to accelerate convergence. The implementation, called Cooperative Evolutionary MultiSwarm Optimization on Graphics Processing Units (CMEPSOGPU), consists of using thousands of threads in various slave swarms on the CUDA parallel architecture, where […]
Mar, 20

Multi-GPU Island-Based Genetic Algorithm

Genetic algorithms are effective in solving many optimization tasks. However, their long execution time prevents their use in many domains. In this paper, we propose a new approach for the parallel implementation of genetic algorithms on graphics processing units (GPUs) using the CUDA programming model. This paper introduces a novel implementation of the genetic […]
Mar, 20

Time-stepping methods for the simulation of the self-assembly of nano-crystals in Matlab on a GPU

Partial differential equations describing the patterning of thin crystalline films are typically of fourth or sixth order; they are quasi- or semilinear, and they are mostly defined on simple geometries such as rectangular domains. For the numerical simulation of this kind of problem, spectral methods are an efficient approach. We apply several implicit-explicit schemes to […]
Mar, 20

General Purpose Computing on Low-Power Embedded GPUs: Has It Come of Age?

In this paper we evaluate the promise held by low-power GPUs for non-graphics workloads that arise in embedded systems. To this end, we map and implement five benchmarks, which find utility in very different application domains, on an embedded GPU. Our results show that apart from accelerated performance, embedded GPUs are promising also because of their […]
Mar, 18

clMAGMA: High Performance Dense Linear Algebra with OpenCL

This paper presents the design and implementation of several fundamental dense linear algebra (DLA) algorithms in OpenCL. In particular, these are linear system solvers and eigenvalue problem solvers. Further, we give an overview of the clMAGMA library, an open source, high performance OpenCL library that incorporates the developments presented, and in general provides to heterogeneous […]
Mar, 18

Volume Raycasting Performance Using DirectCompute

Volume rendering is quite an old concept for representing images, dating back to the 1980s. It is very useful in the medical field for visualizing the results of computed tomography (CT) and magnetic resonance tomography (MRT) in 3D. Apart from these two major applications for volume rendering, there aren’t many other fields of usage […]
Mar, 18

GPU-Based Cloud Service for Smith-Waterman algorithm Using Frequency Distance Filtration Scheme

As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time-consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware such as graphics processing […]
Mar, 18

Parallelization with Different API on Multicore Architecture

Soft matter as a research topic extends over fields from a multitude of disciplines. Biological systems are nearly exclusively composed of soft matter. Nearly everything that animals eat is considered soft matter. Large parts of chemistry deal with soft matter, such as the whole field of polymers. Many materials, especially modern ones, are soft matter. […]
Mar, 18

GPU Accelerated Multiple Deoxyribose Nucleic Acid Sequence Parallel Matching

In this paper, a contrastive evaluation of massively parallel implementations of suffix tree and suffix array to accelerate genome sequence matching is proposed, based on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU (Kepler architecture). Due to the more regular execution flow of the indexed binary search algorithm, the more efficient use of the […]
Mar, 16

Accelerating Computer Vision Algorithms Using OpenCL on Mobile GPU – A Case Study

Recently, general-purpose computing on graphics processing units (GPGPU) has been enabled on mobile devices thanks to the emerging heterogeneous programming models such as OpenCL. The capability of GPGPU on mobile devices opens a new era for mobile computing and can enable many computationally demanding computer vision algorithms on mobile devices. As a case study, this […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
