Posts

Dec, 5

A dynamic scheduling runtime and tuning system for heterogeneous multi and many-core desktop platforms

A modern personal computer can now be considered a one-node heterogeneous cluster that simultaneously processes tasks from several applications. It can be composed of asymmetric Processing Units (PUs), such as the multi-core Central Processing Unit (CPU), the many-core Graphics Processing Units (GPUs) – which have become one of the main co-processors that contributed towards high performance […]
Dec, 5

GPU-Euler: Sequence Assembly Using GPGPU

Advances in sequencing technologies have revolutionized the field of genomics by providing cost-effective and high-throughput solutions. In this paper, we develop a parallel sequence assembler implemented on general-purpose graphics processing units (GPUs). Our work was largely motivated by a growing need in the genomic community for sequence assemblers and increasing use of […]
Dec, 5

Scalable Query Evaluation in Relational Databases

The scalability of a query depends on the amount of data that needs to be accessed when computing the answer. This implies three immediate general strategies for improving query performance: decrease the amount of data (including intermediate results) to be accessed by accessing it smarter; decrease the amount by simply reducing the data quantity in […]
Dec, 5

A New Tool for Classification of Satellite Images Available from Google Maps: Efficient Implementation in Graphics Processing Units

In this work, we develop a new parallel implementation of the k-means unsupervised clustering algorithm for commodity graphics processing units (GPUs), and further evaluate the performance of this newly developed algorithm in the task of classifying (in unsupervised fashion) satellite imagery available from the Google Maps engine. With the ultimate goal of evaluating the classification precision […]
Dec, 5

GROPHECY: GPU performance projection from CPU code skeletons

We propose GROPHECY, a GPU performance projection framework that can estimate the performance benefit of GPU acceleration without actual GPU programming or hardware. Users need only to skeletonize pieces of CPU code that are targets for GPU acceleration. Code skeletons are automatically transformed in various ways to mimic tuned GPU codes with characteristics resembling real […]
Dec, 4

GPU-Based Parallel Multi-objective Particle Swarm Optimization

In recent years, multi-objective particle swarm optimization (MOPSO) has become quite popular in the field of multi-objective optimization. However, due to the large number of fitness evaluations as well as the task of archive maintenance, the running time of MOPSO for optimizing some difficult problems may be quite long. This paper proposes a parallel […]
Dec, 4

Communication-avoiding QR decomposition for GPUs

We describe an implementation of the Communication-Avoiding QR (CAQR) factorization that runs entirely on a single graphics processor (GPU). We show that the reduction in memory traffic provided by CAQR allows us to outperform existing parallel GPU implementations of QR for a large class of tall-skinny matrices. Other GPU implementations of QR handle panel factorizations […]
Dec, 4

Sequence Homology Search using Fine-Grained Cycle Sharing of Idle GPUs

In this paper, we propose a fine-grained cycle sharing (FGCS) system capable of exploiting idle graphics processing units (GPUs) for accelerating sequence homology search in local area network environments. Our system exploits short idle periods on GPUs by running small parts of guest programs such that each part can be completed within hundreds of milliseconds. […]
Dec, 4

kNN Query Processing in Metric Spaces Using GPUs

Information retrieval from large databases is becoming crucial for many applications in different fields such as content searching in multimedia objects, text retrieval or computational biology. These databases are usually indexed off-line to enable an acceleration of on-line searches. Furthermore, the available parallelism has been exploited using clusters to improve query throughput. Recently some authors […]
Dec, 4

Computing Strongly Connected Components in Parallel on CUDA

The problem of decomposing a directed graph into its strongly connected components is a fundamental graph problem inherently present in many scientific and commercial applications. In this paper we show how some of the existing parallel algorithms can be reformulated in order to be accelerated by NVIDIA CUDA technology. In particular, we design a new […]
Dec, 4

Implementing CFD (Computational Fluid Dynamics) in OpenCL for Building Simulation

Though researchers in computer graphics have started to use the GPGPU (General-Purpose computing on Graphics Processing Units) method to speed up their procedural programs, these techniques are seldom used in the building simulation field. It is possible to apply the GPGPU method to many simulation scenarios (e.g. human evacuation, shadow simulation) to speed up performance. In […]
Dec, 4

Global Point Mascon Models for Simple, Accurate and Parallel Geopotential Computation

High-fidelity geopotential calculation using spherical harmonics (SH) is expensive and relies on recursive non-parallel relations. Here, a global point mascon (PMC) model is proposed that is memory light, extremely simple to implement (at any derivative level), and is naturally amenable to parallelism. The gravity inversion problem is posed classically as a large and dense least […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org