Posts

Oct, 8

Stressing the BER simulation of LDPC codes in the error floor region using GPU clusters

Low-Density Parity-Check (LDPC) codes are known for having excellent Bit Error Rate (BER) performance, even in the presence of quite low Signal-to-Noise Ratios (SNR). But the development of this type of error-correcting code poses severe challenges, since the design of new codes is based on heuristics such as girth and sparsity that do not always provide […]
Oct, 8

Parallel and Distributed Implementations of Multiple and Two-Dimensional Pattern Matching Algorithms

String matching is a fundamental problem in the area of scientific computing. When two different one-dimensional strings are taken as input, the so-called "input string" and the so-called "pattern", the string matching problem involves locating all the positions in the input string where the pattern appears. As there has been […]
Oct, 8

libcloudph++ 0.1: single-moment bulk, double-moment bulk, and particle-based warm-rain microphysics library in C++

This paper introduces a library of algorithms for representing cloud microphysics in numerical models written in C++, hence the name libcloudph++. In the initial release, the library covers three warm-rain schemes: the single- and double-moment bulk schemes, and the particle-based scheme with Monte-Carlo coalescence. The three schemes are intended for modelling frameworks of different dimensionality […]
Oct, 8

Porting Large HPC Applications to GPU Clusters: The Codes GENE and VERTEX

We have developed GPU versions of two major high-performance-computing (HPC) applications originating from two different scientific domains. GENE is a plasma microturbulence code which is employed for simulations of nuclear fusion plasmas. VERTEX is a neutrino-radiation hydrodynamics code for "first-principles" simulations of core-collapse supernova explosions. The codes are considered state of the art in their […]
Oct, 7

Advanced 2D Rasterization on Modern CPUs

The graphics processing unit (GPU) has become part of our everyday life through desktop computers and portable devices (tablets, mobile phones, etc.). Because of the dedicated hardware, visualization has been significantly accelerated, and today’s software uses only the GPU for rasterization. Besides the graphical devices, the central processing unit (CPU) has also made remarkable progress. […]
Oct, 7

Performance evaluation of CUDA programming for machining simulation

5-axis milling simulations in CAM software are mainly used to detect collisions between the tool and the part. They are very limited in terms of surface topography investigations to validate machining strategies as well as machining parameters such as chordal deviation, scallop height and tool feed. Z-buffer or N-Buffer machining simulations provide more precise simulations […]
Oct, 7

GPU Accelerated Conjunction Assessment with Applications to Formation Flight and Space Debris Tracking

The primary purpose of conjunction assessment (CA) is to prevent the collision of objects in space. Typical collision scenarios involve satellites with space debris or a formation of satellites with each other. Users performing orbit propagation and CA on very large scales must judiciously moderate force model fidelity and/or acutely limit the number of objects […]
Oct, 7

Vectorized OpenCL implementation of numerical integration for higher order finite elements

In our work we analyze computational aspects of the problem of numerical integration in finite element calculations and consider an OpenCL implementation of related algorithms for processors with wide vector registers. As a platform for testing the implementation we choose the PowerXCell processor, being an example of the Cell Broadband Engine (CellBE) architecture. Although the […]
Oct, 7

Numerical integration on GPUs for higher order finite elements

The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose integration procedures, as well as particular example applications. The most important characteristic of the problem investigated is the large variation of […]
Oct, 5

Measurements of performance of hardware and general purpose classical molecular dynamics simulation software

This note presents different measurements of hardware and software performance in classical molecular dynamics (CMD) simulations from 2001 through 2010, obtained from published literature and the internet. Opinion articles by CMD researchers point out that tools developed during that decade to set up CMD simulations barely increased human productivity. Massively parallel hardware and CMD software running […]
Oct, 5

Speculative Execution of Parallel Programs with Precise Exception Semantics on GPUs

General purpose computing on GPUs (GPGPU) can enable significant performance and energy improvements for certain classes of applications. However, current GPGPU programming models, such as CUDA and OpenCL, are only accessible by systems experts through low-level C/C++ APIs. In contrast, large numbers of programmers use high-level languages, such as Java, due to their productivity advantages […]
Oct, 5

Parametric GPU Code Generation for Affine Loop Programs

Partitioning a parallel computation into finitely sized chunks for effective mapping onto a parallel machine is a critical concern for source-to-source compilation. In the context of OpenCL and CUDA, this translates to the definition of a uniform hyper-rectangular partitioning of the parallel execution space where each partition is subject to a fine-grained distribution of resources […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org