Posts

Nov, 23

Real-World Constraints of GPUs in Real-Time Systems

Graphics processing units (GPUs) are becoming increasingly important in today’s platforms as their increased generality allows them to be used as powerful coprocessors. In this paper, we explore possible applications for GPUs in real-time systems, discuss the limitations and constraints imposed by current GPU technology, and present a summary of our research addressing many […]
Nov, 23

Soren: Adaptive MapReduce for Programmable GPUs

In recent years the MapReduce programming model has been widely used for developing parallel data-intensive applications. As a result of its popularity, there exist many implementations of the MapReduce model on different parallel architectures including on massively parallel programmable GPUs. A basic challenge in implementing a MapReduce runtime system is the wide diversity of applications […]
Nov, 23

Towards solving the Table Maker’s Dilemma on GPU

Since 1985, the IEEE 754 standard has defined formats, rounding modes and basic operations for floating-point arithmetic. In 2008 the standard was extended, and recommendations were added about the rounding of some elementary functions such as trigonometric functions (cosine, sine, tangent and their inverses), exponentials, and logarithms. However, to guarantee the exact rounding of […]
Nov, 23

Accelerating Protein Sequence Search in a Heterogeneous Computing System

The “Basic Local Alignment Search Tool” (BLAST) is arguably the most widely used computational tool in bioinformatics. However, the computational power required for routine BLAST analysis has been outstripping Moore’s Law due to the exponential growth in the size of the genomic sequence databases that BLAST searches on. To address the above issue, we propose […]
Nov, 23

Building-Blocks for Performance Oriented DSLs

Domain-specific languages raise the level of abstraction in software development. While it is evident that programmers can more easily reason about very high-level programs, the same holds for compilers only if the compiler has an accurate model of the application domain and the underlying target platform. Since mapping high-level, general-purpose languages to modern, heterogeneous hardware […]
Nov, 23

TEG: GPU Performance Estimation Using a Timing Model

Modern Graphics Processing Units (GPUs) offer significant performance speedup over conventional processors. Programming GPUs for general-purpose applications has become an important research area. The CUDA programming model provides a C-like interface and is widely accepted. However, since hardware vendors do not disclose enough underlying architecture details, programmers have to optimize their applications without fully […]
Nov, 23

Accelerating the Rate of Astronomical Discovery with GPU-Powered Clusters

In recent years, the Graphics Processing Unit (GPU) has emerged as a low-cost alternative for high performance computing, enabling impressive speed-ups for a range of scientific computing applications. Early adopters in astronomy are already benefiting from adapting their codes to take advantage of the GPU’s massively parallel processing paradigm. I give an introduction to, and […]
Nov, 23

An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm

Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been proposed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230,18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver, capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the […]
Nov, 22

Dynamic adaptation and distribution of binaries to heterogeneous architectures

Real-time multimedia workloads require progressively more processing power. Modern many-core architectures provide enough processing power to satisfy the requirements of many real-time multimedia workloads. When even they are unable to satisfy processing power requirements, network distribution can provide many workloads with even more computing power. In this thesis, we present solutions that can be […]
Nov, 22

Efficient Shallow Water Simulations on GPUs

For some classes of problems, NVIDIA CUDA abstraction and hardware properties combine with problem characteristics to limit the specific problem instances that can be effectively accelerated. As a real-world example, a two-dimensional correlation-based template-matching MATLAB application is considered. While this problem has a well-known solution for the common case of linear image filtering - small fixed […]
Nov, 22

Dynamic Heterogeneous Scheduling Decisions Using Historical Runtime Data

Heterogeneous systems often employ processing units with a wide spectrum of performance capabilities. Allowing individual applications to make greedy local scheduling decisions leads to imbalance, with underutilization of some devices and excessive contention for others. If we instead allow the system to make global scheduling decisions and assign some applications to a slower device, we […]
Nov, 22

Application of GPGPU for Acceleration of Short DNA Sequence Alignment in Unipro UGENE Project

A dramatic increase in available sequencing datasets has resulted in the need for fast sequence alignment methods. Plenty of novel methods have been proposed to perform fast alignment of NGS data, and some of them appear to be rather effective; however, a relatively small number of existing alignment tools use Graphics Processing Units (GPUs) to […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
