Posts

May, 25

GEMTC: GPU Enabled Many-Task Computing

Current software and hardware limitations prevent Many-Task Computing (MTC) workloads from leveraging hardware accelerators (NVIDIA GPUs, Intel Xeon Phi) boasting Many-Core Computing architectures. Some broad application classes that fit the MTC paradigm are workflows, MapReduce, high-throughput computing, and a subset of high-performance computing. MTC emphasizes using many computing resources over short periods of time to […]
May, 25

JPEG-GPU: a GPGPU Implementation of JPEG Core Coding Systems

JPEG is a commonly used method of lossy compression for digital photographs (images). This work targets accelerating JPEG’s compressor and decompressor on the GPU. Though the final results are not promising, I would like to introduce the lessons I have learned in accelerating a general system with GPGPU.
May, 23

Graphics Processing Unit (GPU) Implementation Methodology of AERMOD Model

Air pollution is one of the major problems the world is facing today. It is caused by the release of dangerous chemical substances such as carbon monoxide, CFCs (chlorofluorocarbons), carbon dioxide, hydrocarbons, sulfur dioxide, etc. into the atmosphere. These substances are produced by various anthropogenic activities such as usage of vehicles, factory […]
May, 23

Composing multiple StarPU applications over heterogeneous machines: a supervised approach

Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a single runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources introduces many issues, such as resource oversubscription, undesirable cache flushes or memory bus contention. This […]
May, 23

Sequential Consistency for Heterogeneous-Race-Free: Programmer-centric Memory Models for Heterogeneous Platforms

Hardware vendors now provide heterogeneous platforms in commodity markets (e.g., integrated CPUs and GPUs), and are promising an integrated, shared memory address space for such platforms in future iterations. Because not all threads in a heterogeneous platform can communicate with the same latency, vendors are proposing synchronization mechanisms that allow threads to communicate with a […]
May, 23

Surface Reconstruction from Scattered Point via RBF Interpolation on GPU

In this paper we describe a parallel implicit method based on radial basis functions (RBF) for surface reconstruction. The applicability of RBF methods is hindered by their computational demand, which requires the solution of linear systems of size equal to the number of data points. Our reconstruction implementation relies on parallel scientific libraries and is […]
May, 23

GPU Enhancement of the Trigger to Extend Physics Reach at the LHC

Significant new challenges are continuously confronting the High Energy Physics (HEP) experiments, in particular the two detectors at the Large Hadron Collider (LHC) at CERN, where nominal conditions deliver proton-proton collisions to the detectors at a rate of 40 MHz. This rate must be significantly reduced to comply with both the performance limitations of the […]
May, 21

Evaluating the Performance of Legacy Applications on Emerging Parallel Architectures

The gap between a supercomputer’s theoretical maximum ("peak") floating-point performance and that actually achieved by applications has grown wider over time. Today, a typical scientific application achieves only 5-20% of any given machine’s peak processing capability, and this gap leaves room for significant improvements in execution times. This problem is most pronounced for modern "accelerator" […]
May, 21

Implementing Continuous Integration Software in an Established Computational Chemistry Software Package

Continuous integration is the software engineering principle of rapid and automated development and testing. We identify several key points of continuous integration and demonstrate how they relate to the needs of computational science projects by discussing the implementation and relevance of these principles to AMBER, a large and widely used molecular dynamics software package. The […]
May, 21

An Investigation of the Performance Portability of OpenCL

This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level benchmark from the NAS Parallel Benchmark Suite. An account of the design decisions addressed during the development of this code is presented, demonstrating the importance of memory arrangement and work-item/work-group distribution strategies when applications are deployed on different device types. The […]
May, 21

Super Earths and Dynamical Stability of Planetary Systems: First Parallel GPU Simulations Using GENGA

We report on the stability of hypothetical Super-Earths in the habitable zone of known multi-planetary systems. Most of them have not yet been studied in detail concerning the existence of additional low-mass planets. The new N-body code GENGA developed at the UZH allows us to perform numerous N-body simulations in parallel on GPUs. With this […]
May, 21

3DES ECB Optimized for Massively Parallel CUDA GPU Architecture

Modern computers have graphics cards with much higher theoretical efficiency than conventional CPUs. The paper presents the possibilities of applying GPU CUDA acceleration to data encryption using a new architecture tailored to the 3DES algorithm, which is characterized by increased security compared to normal DES. The algorithm used in ECB mode (Electronic Codebook), in which 64-bit data […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org