high performance computing on graphics processing units: hgpu.org

Posts

May, 17

Using NVIDIA GPUs for Real-time Data Processing in a Holographic Radar System, webinar

In this webinar, Peter Wurmsdobler, Lead Software Architect, Aveillant, will give a short introduction to Aveillant’s Holographic Radar systems, covering the principles of holographic radar as opposed to scanning radar, as well as their computational requirements. Peter will go on to explore the technical challenges faced in the implementation of the mathematical algorithms needed, how […]

May, 16

The Next Steps for Folding@home, webinar

Folding@home is a large-scale volunteer distributed computing project, started on October 1, 2000. For over a decade, new types of hardware (such as GPUs, multi-core CPUs, and PS3) and algorithms have been pioneered in order to make significant advances in our ability to simulate diseases at the molecular scale. Join Professor Vijay Pande from Stanford […]

May, 16

An Introduction to CUDA Programming, webinar

Join Chris Mason, Product Manager, Acceleware, for an informative introduction to CUDA programming. The webinar will begin with a brief overview of CUDA and data-parallelism before focusing on the GPU programming model. Chris will explore the fundamentals of GPU kernels, host and device responsibilities, CUDA syntax and thread hierarchy. A programming demonstration of a simple […]

May, 16

C++ on GPUs Using OpenACC and the PGI Accelerator Compilers, webinar

The fastest supercomputers and clusters use a 64-bit host processor with one or more accelerators per node, most commonly GPUs. These compute accelerators exploit a high degree of parallelism to maximize performance and power efficiency. There are several challenges to effective and productive use of accelerators, the most important of which are managing data movement […]

May, 16

Using GPUs to Accelerate Orthorectification, Atmospheric Correction, and Transformations for Big Data, webinar

Significant speed improvements for imagery orthorectification, atmospheric correction, and image transformations like Independent Components Analysis (ICA) have been achieved using GPU-based implementations. Additional optimizations, when factored in with GPU processing capabilities, can provide a 50x to 100x reduction in the time required to process large imagery. Exelis Visual Information Solutions (VIS) has implemented a CUDA-based […]

May, 16

Scaling Coupled Climate Models to Exascale: OpenACC-enabled ECEarth3 Earth System Model

Climate change due to increasing anthropogenic greenhouse gases and land surface change is currently one of the most relevant environmental concerns. It threatens ecosystems and human societies. However, its impact on the economy and our living standards depends largely on our ability to anticipate its effects and take appropriate action. Earth System Models (ESMs), such […]

CUDA

May, 16

Porting NAHUJ to CUDA

This white-paper reports on an enabling effort that involves porting a legacy 2D fluid dynamics Fortran code to NVIDIA GPUs. Given the complexity of both code and underlying (custom) numerical method, the natural choice was to use NVIDIA CUDA C to achieve the best possible performance. We achieved over 4.5x speed-up on a single K20 […]

CUDA

May, 16

Enabling CP2K Application for Exascale Computing with Accelerators using OpenACC and OpenCL

CP2K is an application for atomistic and molecular simulation and, with its excellent scalability, is particularly important with regard to use on future exascale systems. The code is well parallelized using MPI and hybrid MPI/OpenMP, typically scaling well to ~1 core per atom in the system. The research on CP2K done within PRACE-1IP stated that […]

CUDA • OpenCL

May, 16

Hybrid Use of OmpSs for a Shock Hydrodynamics Proxy Application

The LULESH proxy application models the behavior of the ALE3D multi-physics code with an explicit shock hydrodynamics problem. It is designed to evaluate interactions between programming models and architectures, using a representative code significantly less complex than the application it models. As identified in the PRACE deliverable D7.2.1 [1], the OmpSs programming model […]

CUDA

May, 16

A Straightforward Preprocessing Approach for Accelerating Convex Hull Computations on the GPU

An effective strategy for accelerating the calculation of convex hulls for point sets is to filter the input points by discarding interior points. In this paper, we present such a straightforward and efficient preprocessing approach by exploiting the GPU. The basic idea behind our approach is to discard the points that lie inside a convex […]

CUDA
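
The filtering idea in the abstract above (discard interior points before computing the hull) can be illustrated with a small sketch. The abstract is truncated and does not say which convex region the authors use, so this example assumes the classic quadrilateral spanned by the four extreme points in x and y (the Akl-Toussaint heuristic). The function names `cross` and `filter_interior` are hypothetical, and the code runs sequentially on the CPU; it is not the paper's GPU implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(a: Point, b: Point, p: Point) -> float:
    """Z-component of (b - a) x (p - a); positive when p lies left of a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def filter_interior(points: List[Point]) -> List[Point]:
    """Drop points strictly inside the quadrilateral of the extreme points.

    Assumption: the convex region is the quadrilateral through the
    leftmost, bottommost, rightmost and topmost input points.
    """
    if len(points) < 4:
        return list(points)

    left = min(points, key=lambda p: (p[0], p[1]))
    bottom = min(points, key=lambda p: (p[1], p[0]))
    right = max(points, key=lambda p: (p[0], p[1]))
    top = max(points, key=lambda p: (p[1], p[0]))

    # left -> bottom -> right -> top is a counter-clockwise traversal.
    quad = [left, bottom, right, top]
    edges = list(zip(quad, quad[1:] + quad[:1]))

    def strictly_inside(p: Point) -> bool:
        # Strictly left of every edge. A degenerate edge (two equal
        # vertices) yields 0, so nothing is discarded in that case.
        return all(cross(a, b, p) > 0 for a, b in edges)

    return [p for p in points if not strictly_inside(p)]


points = [(-2.0, 0.0), (0.0, -2.0), (2.0, 0.0), (0.0, 2.0),
          (0.0, 0.0), (0.5, 0.5), (1.5, 1.5)]
survivors = filter_interior(points)
print(survivors)
```

Each point is tested independently of the others, which is why a filter of this kind maps naturally onto one GPU thread per point. The strict inequality keeps points on the quadrilateral's boundary, so the filter never removes a hull vertex.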

May, 15

Multi-GPGPU Cellular Automata Simulations using OpenACC

The Frisch-Hasslacher-Pomeau (FHP) model is a lattice gas cellular automaton designed to simulate fluid flows using exact, purely Boolean arithmetic, without any round-off error. Here we investigate the problem of its efficient porting to clusters of Fermi-class graphics processing units. To this end, two multi-GPU implementations were developed and examined: one using the NVIDIA […]

CUDA

May, 15

Real-time Image Processing on Low Cost Embedded Computers

In 2012 a federal mandate was imposed that required the FAA to integrate unmanned aerial systems (UAS) into the national airspace (NAS) by 2015 for civilian and commercial use. A significant driver for the increasing popularity of these systems is the rise in open hardware and open software solutions which allow hobbyists to build small […]

Recent source codes

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Ouroboros: Virtualized Queues for dynamic memory management

MSCCL++: A GPU-driven communication stack for scalable AI applications

Benchmark compute shader of Unity against InteropUnityCUDA
