Posts

Oct, 20

Experimental B+-tree for GPU

The main intention of this work is to create a dictionary structure which could benefit from massive parallelism of threads when performing computation on all or a selected set of elements, while having an ability to search for and insert keys very quickly, yet preserving the order of elements. So far, no such structure dedicated […]
Oct, 20

PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems

PEPPHER, a three-year European FP7 project, addresses efficient utilization of hybrid (heterogeneous) computer systems consisting of multicore CPUs with GPU-type accelerators. This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project. A larger example demonstrates performance portability with the PEPPHER approach across hybrid systems with one […]
Oct, 20

FAMOUS, faster: using parallel computing techniques to accelerate the FAMOUS/HadCM3 climate model with a focus on the radiative transfer algorithm

We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. A task queue and a thread pool are used to distribute the computation to several processors. Finally, four […]
Oct, 20

Explicit platform descriptions for heterogeneous many-core architectures

Heterogeneous many-core architectures offer a way to cope with energy consumption limitations of various computing systems from small mobile devices to large data-centers. However, programmers typically must consider a large diversity of architectural information to develop efficient software. In this paper we present our ongoing work towards a Platform Description Language (PDL) that enables to […]
Oct, 20

On the Use of an Algebraic Language Interface for Waveform Definition

We discuss implementation aspects of a software-defined radio system that allows the user to define waveforms using an algebraic language interface, currently as an extension to C++. Current software-defined radio systems provide waveform definitions through a combination of a graphical interface, markup language, interpreted script, and compiled code. No matter which methods are used, the […]
Oct, 20

A prototyping environment for high performance reconfigurable computing

In the face of power wall and high performance requirements, designers of hardware architectures are directed more and more towards reconfigurable computing with the usage of heterogeneous CPU/FPGA systems. In such architectures, multi-core processors come with high computation rates while the reconfigurable logic offers high performance per watt and adaptability to the application constraints. However, […]
Oct, 19

An Efficient Stream Buffer Mechanism for Dataflow Execution on Heterogeneous Platforms with GPUs

The move towards heterogeneous parallel computing is underway as witnessed by the emergence of novel computing platforms combining architecturally diverse components such as CPUs, GPUs and special function units. We approach mapping of streaming applications onto heterogeneous architectures using a Process Network (PN) model of computation. In this paper, we present an approach for exploiting […]
Oct, 19

The Potential for a GPU-Like Overlay Architecture for FPGAs

We propose a soft processor programming model and architecture inspired by graphics processing units (GPUs) that are well-matched to the strengths of FPGAs, namely, highly parallel and pipelinable computation. In particular, our soft processor architecture exploits multithreading, vector operations, and predication to supply a floating-point pipeline of 64 stages via hardware support for up to […]
Oct, 19

Designing the Language Liszt for Building Portable Mesh-based PDE Solvers

Complex physical simulations have driven the need for exascale computing, but reaching exascale will require more power-efficient supercomputers. Heterogeneous hardware offers one way to increase efficiency, but is difficult to program and lacks a unifying programming model. Abstracting problems at the level of the domain rather than hardware offers an alternative approach. In this paper […]
Oct, 19

10×10: A General-purpose Architectural Approach to Heterogeneity and Energy Efficiency

Two decades of microprocessor architecture driven by quantitative 90/10 optimization has delivered an extraordinary 1000-fold improvement in microprocessor performance, enabled by transistor scaling which improved density, speed, and energy. Recent generations of technology have produced limited benefits in transistor speed and power, so as a result the industry has turned to multicore parallelism for performance […]
Oct, 19

Heterogeneous Accelerated Bioinformatics-Perspectives for Cancer Research

The demand for even higher performance in bioinformatics data analysis continues to grow rapidly as the volumes of data generated by next generation sequencing equipment soar. Traditional acceleration techniques historically used for faster bioinformatics application will individually be insufficient to meet the demand and increased analysis complexity, requiring an integrated heterogeneous accelerated computing environment. Current […]
Oct, 19

A Code Transformation Framework for Scientific Applications on Structured Grids

The combination of expert-tuned code expression and aggressive compiler optimizations is known to deliver the best achievable performance for modern multicore processors. The development and maintenance of these optimized code expressions is never trivial. Tedious and error-prone processes greatly decrease the code developer’s willingness to adopt manually-tuned optimizations. In this paper, we describe a pre-compilation […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors