Posts

Sep, 30

Adding special-purpose processor support to the Erlang VM

This thesis investigates the possibility of extending the Erlang runtime system so that it can take advantage of special-purpose compute units, such as GPUs and DSPs. Furthermore, it investigates whether certain parts of an Erlang system can be accelerated with the help of these devices.
Sep, 29

Many-threaded implementation of differential evolution for the CUDA platform

Differential evolution is an efficient populational meta-heuristic optimization algorithm successful in solving difficult real-world problems. Due to the simplicity of its operations and data structures, it is suitable for a parallel implementation on multicore systems and on the GPU. In this paper, we design a simple yet highly parallel implementation of the […]
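
For readers unfamiliar with the algorithm, the following is a minimal sequential sketch of the textbook DE/rand/1/bin scheme in Python. It is not the many-threaded CUDA implementation from the paper; the `sphere` objective, the parameter defaults, and all function names are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def differential_evolution(objective, dim, bounds, pop_size=20, f=0.8,
                           cr=0.9, generations=100, seed=0):
    """Textbook DE/rand/1/bin with in-place greedy replacement.

    Returns (best_vector, best_cost). All defaults are illustrative.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clamp to the search bounds
                else:
                    v = pop[i][j]  # crossover keeps the parent's gene
                trial.append(v)
            trial_cost = objective(trial)
            # Greedy selection: the trial replaces the parent only if no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


best_vec, best_cost = differential_evolution(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_vec, best_cost)
```

Each individual's trial vector depends only on the current population, which is the property that makes the algorithm a natural candidate for one-thread-per-individual parallelization.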
Sep, 29

Active thread compaction for GPU path tracing

Modern GPUs like NVidia’s Fermi internally operate in a SIMD manner by ganging multiple (32) scalar threads together into SIMD warps; if a warp’s threads diverge, the warp serially executes both branches, temporarily disabling threads that are not on that path. In this paper, we explore and thoroughly analyze the concept of active thread compaction—i.e., […]
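
The divergence cost described above can be illustrated with a small Python model. This is a toy sketch, not the compaction scheme analyzed in the paper: the workload, the function names, and the idea of regrouping threads by sorting are assumptions made only to show why grouping threads on the same path matters.

```python
WARP_SIZE = 32  # threads per warp, as in the Fermi description above


def branch_passes(takes_branch, warp_size=WARP_SIZE):
    """Count serialized branch passes summed over all warps.

    A warp whose threads all agree executes one pass; a diverged warp
    executes both sides one after the other, i.e. two passes.
    """
    passes = 0
    for start in range(0, len(takes_branch), warp_size):
        warp = takes_branch[start:start + warp_size]
        passes += len(set(warp))  # 1 if uniform, 2 if diverged
    return passes


def regroup(takes_branch):
    """Reorder threads so that those on the same path are adjacent."""
    return sorted(takes_branch)


# Hypothetical workload: 128 threads, every other thread takes the branch.
flags = [i % 2 == 0 for i in range(128)]
print(branch_passes(flags))           # 4 diverged warps -> 8 passes
print(branch_passes(regroup(flags)))  # 4 uniform warps -> 4 passes
```

In this model, regrouping halves the number of serialized passes because no warp contains threads from both paths.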
Sep, 29

Evolving CUDA PTX programs by quantum inspired linear genetic programming

The tremendous computing power of Graphics Processing Units (GPUs) can be used to accelerate the evolution process in Genetic Programming (GP). The automatic generation of code using the GPU usually follows one of two approaches: compiling each evolved program or interpreting multiple programs. Both approaches, however, have performance drawbacks. In this work, we propose a novel approach […]
Sep, 29

Large-Scale High-Lundquist Number Reduced MHD Simulations of the Solar Corona Using GPU Accelerated Machines

We have recently carried out a computational campaign to investigate a model of coronal heating in three dimensions using reduced magnetohydrodynamics (RMHD). Our code is built on a conventional scheme using the pseudo-spectral method, and is parallelized using MPI. The current investigation requires very long time integrations using high Lundquist numbers, where the formation of very […]
Sep, 29

Possible planet-forming regions on submillimetre images

Submillimetre images of transition discs are expected to reflect the distribution of the optically thin dust. Earlier observations of three transition discs, LkHa330, SR21N, and HD135344B, at submillimetre wavelengths revealed images which cannot be modelled by a simple axisymmetric disc. We show that a large-scale anticyclonic vortex that develops where the viscosity has a large […]
Sep, 28

Highly Scalable Multi Objective Test Suite Minimisation Using Graphics Cards

Despite claims of "embarrassing parallelism" for many optimisation algorithms, there has been very little work on exploiting parallelism as a route for SBSE scalability. This is an important oversight because scalability is so often a critical success factor for Software Engineering work. This paper shows how relatively inexpensive General Purpose computing on Graphics Processing Units […]
Sep, 28

Design space exploration towards a realtime and energy-aware GPGPU-based analysis of biosensor data

In this paper, novel objectives for the design space exploration of GPGPU applications are presented. The design space exploration takes the combination of energy efficiency and realtime requirements into account. This is completely different from the most common high performance computing objective, which is to accelerate an application as much as possible. As a proof-of-concept, a […]
Sep, 28

Global optimization model on power efficiency of GPU and multicore processing element for SIMD computing with CUDA

Estimating and analyzing the power-consuming features of a program on a hardware platform is important for energy-aware High Performance Computing (HPC) optimization: it can help to handle critical design constraints at the software level and to choose a preferable algorithm in order to reach the best energy performance. Optimizing the power efficiency of CUDA program […]
Sep, 28

2PARMA: Parallel Paradigms and Run-time Management Techniques for Many-Core Architectures

The 2PARMA project focuses on the development of parallel programming models and run-time resource management techniques to exploit the features of many-core processor architectures. The main goals of the 2PARMA project are: definition of a parallel programming model combining component-based and single-instruction multiple-thread approaches, instruction set virtualisation based on portable byte-code, run-time resource management policies […]
Sep, 28

Parallelizing fuzzy rule generation using GPGPU

This article proposes a method to parallelize the process of generating fuzzy if-then rules for pattern classification problems in order to reduce the computational time. The proposed method uses general-purpose computation on graphics processing units (GPGPU), implemented in parallel with the Compute Unified Device Architecture (CUDA) development environment. CUDA contains a library to […]
Sep, 28

Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer

In this paper we present the programming of the Linpack benchmark on the TianHe-1 system, the first petascale supercomputer system of China and the largest GPU-accelerated heterogeneous system ever attempted. A hybrid programming model consisting of MPI, OpenMP and streaming computing is described to explore the task, thread and data parallelism of the […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors