high performance computing on graphics processing units: hgpu.org

Posts

May, 31

Geometric Optimisation using Karva for Graphical Processing Units

Population-based evolutionary algorithms continue to play an important role in artificially intelligent systems, but cannot always easily exploit parallel computation. We have combined a geometric (any-space) particle swarm optimisation algorithm with Ferreira’s Karva language of gene expression programming to produce a hybrid that can accelerate the genetic operators and which can rapidly […]

CUDA

May, 31

GPU Multiple Sequence Alignment Fourier-Space Cross-Correlation Alignment

The aim of this project is to explore the possible application of Graphics Processors (GPUs) to accelerate sequence alignment by Fourier-space cross-correlation. Aligning signals using cross-correlations is a well-studied approach in the world of signal processing, but has found relatively little reception in the realm of computational genomics. As long as […]

CUDA

May, 31

Improved Performance of CaFE and IRIS Model Fitting Using CUDA

Label-free optical biosensors are known to be accurate and reliable tools for measuring and monitoring certain biomolecular interactions. In recent years, new techniques and technologies have emerged that enable high-throughput biosensing at lower system size, cost, and complexity. In particular, the LED-based Interferometric Reflectance Imaging Sensor (IRIS) has been demonstrated as a viable alternative to […]

CUDA

May, 31

Tiling optimizations for stencil computations

This thesis studies the techniques of tiling optimizations for stencil programs. Traditionally, research on tiling optimizations mainly focuses on tessellating tiling, atomic tiles and regular tile shapes. This thesis studies several novel tiling techniques which are out of the scope of traditional research. In order to represent a general tiling scheme uniformly, a unified tiling […]

May, 31

CUDA 5.5 Features and Release Candidate (RC) program (webinar)

The CUDA 5.5 RC is now available to CUDA Registered Developers on https://developer.nvidia.com/registered-developer-programs. Ujval Kapasi, NVIDIA CUDA Product Manager, will provide an overview of the new features of CUDA 5.5. CUDA Registered Developers already have access to the toolkit downloads – but we’ll provide brief instructions on how to download and install the binaries and […]

May, 30

Performance Tradeoff Spectrum of Integer and Floating Point Applications

Floating point precision and performance and the ratio of floating point units to integer processing elements on a graphics processing unit accelerator all continue to present complex tradeoffs for optimising core utilisation on modern devices. We investigate various hybrid CPU and GPU combinations using a range of different GPU models occupying different points in this […]

CUDA

May, 30

A Comparison of Statistical Techniques for Detecting Side-Channel Information Leakage in Cryptographic Devices

The development of a standardised testing methodology for side-channel resistance of cryptographic devices is an issue that has received recent focus from standardisation bodies such as NIST. Statistical techniques such as hypothesis and significance testing appear to be ideally suited for this purpose. In this work we evaluate the candidacy of three such tests: a […]

OpenCL

May, 30

Using GPU Simulation to Accurately Fit to the Power-Law Distribution

This article describes a methodology for fitting experimental data to the discrete power-law distribution and provides the results of a detailed simulation exercise used to calculate accurate cutoff values used to assess the fit to a power-law distribution when using the maximum likelihood estimation for the exponent of the distribution. Using massively parallel programming computing, […]

CUDA

May, 30

From Experiment to Design – Fault Characterization and Detection in Parallel Computer Systems Using Computational Accelerators

This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure […]

CUDA

May, 30

Parallelization of the Ant Colony Optimization for the Shortest Path Problem using OpenMP and CUDA

Finding efficient vehicle routes is an important logistics problem which has been studied for several decades. Metaheuristic algorithms offer some solutions for that problem. This paper deals with GPU implementation of the ant colony optimization algorithm (ACO), which can be used to find the best vehicle route between designated points. The algorithm is applied on […]

CUDA

May, 29

Job Parallelism using Graphical Processing Unit individual Multi-Processors and Highly Localised Memory

Graphical Processing Units (GPUs) are usually programmed to provide data-parallel acceleration to a host processor. Modern GPUs typically have an internal multi-processor (MP) structure that can be exploited in an unusual way to offer semi-independent task parallelism, provided the MPs can operate within their own localised memory and apply data-parallelism to their own problem subset. We […]

CUDA

May, 29

On Migration and Consolidation of VMs in Hybrid CPU-GPU Environments

In this research, we investigate a dynamic energy-aware management framework for executing independent workloads (e.g., bag-of-tasks) on hybrid CPU-GPU PARA-computing platforms, aiming to optimize the concurrent execution of workloads on appropriate computing resources while balancing the use of solely virtual resources, solely physical resources, or a hybrid selection of both, to achieve […]

CUDA


Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?
