high performance computing on graphics processing units: hgpu.org

Posts

May, 27

Scaling Radio Astronomy Signal Correlation on Heterogeneous Supercomputers Using Various Data Distribution Methodologies

Next generation radio telescopes will require orders of magnitude more computing power to provide a view of the universe with greater sensitivity. In the initial stages of the signal processing flow of a radio telescope, signal correlation is one of the largest challenges in terms of handling huge data throughput and intensive computations. We implemented […]

OpenCL

May, 26

GPU Accelerated XenDesktop for Designers and Engineers (webinar)

If you’ve ever wanted to virtualize your CAD or professional video graphics application and have the exact same local experience on a secure central platform, then this webinar provides insight on how to get there. Join Technology Evangelist, Thomas Poppelgaard, and learn how Citrix XenDesktop®, XenApp® and XenServer®, in combination with NVIDIA GRID VGX, makes […]

May, 26

Easily Accelerating Existing Monte Carlo Code: CVA and CCR Examples (webinar)

In this webinar, Hicham Lahlou, CEO & Co-founder, Xcelerit, will discuss real-world applications of GPUs in risk management. He will show, using the Xcelerit SDK, how the complexity of GPU programming can be overcome, allowing existing models and applications to be easily accelerated and extended while cutting software development and maintenance costs. The risks associated […]

May, 25

Enabling OS Research by Inferring Interactions in the Black-Box GPU Stack

General-purpose GPUs now account for substantial computing power on many platforms, but the management of GPU resources – cycles, memory, bandwidth – is frequently hidden in black-box libraries, drivers, and devices, outside the control of mainstream OS kernels. We believe that this situation is untenable, and that vendors will eventually expose sufficient information about cross-black-box […]

CUDA

May, 25

Use of Multi-GPU Systems for Larger Than Device FFTs: With Applications in Ultrasound Simulations

Ultrasound simulations are a type of application that is both computation-intensive and communication-intensive. With better performance, implementations of these can be used in designing new ultrasound probes, developing better signal processing techniques, training new ultrasonographers, treatment planning and many other uses [11]. The pseudo-spectral technique can be used effectively to express the wave-propagation […]

CUDA

May, 25

GPU-accelerated protein family identification for metagenomics

The clustering of putative protein/Open Reading Frame (ORF) sequences available from large-scale metagenomics survey projects is a core analytical function that has led to the identification and characterization of novel protein families of environmental microbial communities. The implementation of this function, however, is currently challenged not only by data size but also by data complexity. […]

CUDA

May, 25

Stencil and Lattice Structures for Field Equation Model Simulations on GPUs

Field equations can be numerically simulated by approximating a continuous space field by a discrete lattice. Several different lattice geometries can be used to approximate continuous space, and each may cause structural artefacts in the simulation. These different lattice structures require the use of different stencil operators to approximate the spatial […]

CUDA
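
To illustrate the idea of a stencil operator on a lattice mentioned in the excerpt above, here is a minimal sketch in plain Python. It assumes a square lattice, periodic boundaries, and the standard five-point Laplacian stencil; none of these choices come from the paper itself, and the function name `laplacian_5pt` is hypothetical.

```python
def laplacian_5pt(field, h=1.0):
    """Apply a five-point Laplacian stencil to a 2D square lattice.

    Assumptions (not from the source): periodic boundaries and a
    uniform lattice spacing h.
    """
    rows, cols = len(field), len(field[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Four nearest neighbours, wrapping around at the edges.
            up = field[(i - 1) % rows][j]
            down = field[(i + 1) % rows][j]
            left = field[i][(j - 1) % cols]
            right = field[i][(j + 1) % cols]
            out[i][j] = (up + down + left + right - 4.0 * field[i][j]) / (h * h)
    return out


# A single spike in the centre of a 3x3 lattice.
spike = [[0.0] * 3 for _ in range(3)]
spike[1][1] = 1.0
result = laplacian_5pt(spike)
```

A different lattice geometry (hexagonal, for example) would change which neighbours are read and how they are weighted, which is the dependence the excerpt describes.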

May, 25

A Tuned, Concurrent-Kernel Approach to Speed Up the APSP Problem

The All-Pair Shortest-Path (APSP) problem is a well-known problem in graph theory whose objective is to find the shortest paths between any pair of nodes. Computing the distances from one source node to the rest and repeating this process for every node of the graph is an adequate solution for sparse graphs. During the last […]

CUDA
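
The approach described in the excerpt above, computing distances from one source node and repeating for every node, can be sketched as follows. This is a sequential illustration only: it assumes non-negative edge weights and uses Dijkstra's algorithm as the single-source step, which the truncated excerpt does not confirm. The names `dijkstra`, `apsp`, and the sample `graph` are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Single-source shortest distances.

    adj maps every node to a list of (neighbour, weight) pairs.
    Assumption: all weights are non-negative.
    """
    dist = {node: float("inf") for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def apsp(adj):
    """All-pairs distances by repeating the single-source search per node."""
    return {source: dijkstra(adj, source) for source in adj}


# Small directed example graph.
graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0)],
    "c": [("a", 1.0)],
}
all_dist = apsp(graph)
```

Each single-source search is independent of the others, which is what makes this formulation a natural candidate for running many searches concurrently.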

May, 25

OpenCV – Accelerated Computer Vision using GPUs (webinar)

OpenCV (Open Source Computer Vision Library: http://opencv.willowgarage.com/wiki/) is an open-source BSD-licensed library that includes several hundred computer vision algorithms. In this webinar, learn how this powerful library has been accelerated using CUDA on NVIDIA GPUs. Find out more about OpenCV GPU computational capabilities at: http://docs.opencv.org/modules/gpu/doc/introduction.html Presented by Shalini Gupta, Senior Mobile Computer Vision Engineer, […]

May, 25

On-Demand Generating and Scheduling Optimised Parallel Applications on Heterogeneous Platforms

Scheduling application tasks across heterogeneous clusters is a growing problem, particularly when new upgraded components are added to a parallel computing system that may have originally been homogeneous. We describe how automatic and just-in-time source code generation techniques can be used to make the best parallel decomposition for whatever resource is available in a heterogeneous […]

CUDA

May, 25

OpenCL Performance Evaluation on Modern Multi Core CPUs

Utilizing heterogeneous platforms for computation has become a general trend, making portability an important issue. OpenCL (Open Computing Language) serves the purpose by enabling portable execution on heterogeneous architectures. However, unpredictable performance variation on different platforms has become a burden for programmers who write OpenCL programs. This is especially true for conventional multicore CPUs, since […]

OpenCL

May, 25

Quantifying the Energy Efficiency of FFT on Heterogeneous Platforms

Heterogeneous computing using Graphics Processing Units (GPUs) has become an attractive computing model given the available scale of data-parallel performance and programming standards such as OpenCL. However, given the energy issues present with GPUs, some devices can exhaust power budgets quickly. Better solutions are needed to effectively exploit the power efficiency available on heterogeneous systems. […]

OpenCL

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)