19253

Posts

Dec, 29

Porting tree-based hash table compression to GPGPU model checking

To reduce the costs of faulty software, methods to improve software quality are very popular nowadays. One of these methods is model checking: verifying the functional correctness of the model of a hardware or software system. The model implies a state space, which consists of all possible states of the system and all possible transitions […]
Dec, 29

Automatic Performance Optimisation of Parallel Programs for GPUs via Rewrite Rules

Graphics Processing Units (GPUs) are now commonplace in computing systems and are the most successful parallel accelerators. Their performance is orders of magnitude higher than traditional Central Processing Units (CPUs) making them attractive for many application domains with high computational demands. However, achieving their full performance potential is extremely hard, even for experienced programmers, as […]
Dec, 29

Accelerating Molecular Docking by Parallelized Heterogeneous Computing – A Case Study of Performance, Quality of Results, and Energy-Efficiency using CPUs, GPUs, and FPGAs

Molecular Docking (MD) is a key tool in computer-aided drug design that aims to predict the binding pose between a small molecule and a macromolecular target. At its core, MD calculates the strength of possible binding poses, and searches for the energetically-stronger ones among those generated during simulation. Automatic Docking (AutoDock) is a widely-used MD […]
Dec, 29

Atmospheric turbulence removal using convolutional neural network

This paper describes a novel deep learning-based method for mitigating the effects of atmospheric distortion. We have built an end-to-end supervised convolutional neural network (CNN) to reconstruct turbulence-corrupted video sequence. Our framework has been developed on the residual learning concept, where the spatio-temporal distortions are learnt and predicted. Our experiments demonstrate that the proposed method […]
Dec, 15

JAX, M.D.: End-to-End Differentiable, Hardware Accelerated, Molecular Dynamics in Pure Python

A large fraction of computational science involves simulating the dynamics of particles that interact via pairwise or many-body interactions. These simulations, called Molecular Dynamics (MD), span a vast range of subjects from physics and materials science to biochemistry and drug discovery. Most MD software involves significant use of handwritten derivatives and code reuse across C++, […]
Dec, 15

libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications

There are many ways to represent a molecule as input to a machine learning model and each is associated with loss and retention of certain kinds of information. In the interest of preserving three-dimensional spatial information, including bond angles and torsions, we have developed libmolgrid, a general-purpose library for representing three-dimensional molecules using multidimensional arrays. […]
Dec, 15

Array Languages Make Neural Networks Fast

Modern machine learning frameworks are complex: they are typically organised in multiple layers each of which is written in a different language and they depend on a number of external libraries, but at their core they mainly consist of tensor operations. As array-oriented languages provide perfect abstractions to implement tensor operations, we consider a minimalistic […]
Dec, 8

The BondMachine toolkit: Enabling Machine Learning on FPGA

The BondMachine (BM) is an innovative prototype software ecosystem aimed at creating facilities where both hardware and software are co-designed, guaranteeing a full exploitation of fabric capabilities (both in terms of concurrency and heterogeneity) with the smallest possible power dissipation. In the present paper we will provide a technical overview of the key aspects of […]
Dec, 8

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing […]
Dec, 8

Masivo: Parallel Simulation Model Based on OpenCL for Massive Public Transportation Systems’ Routes

There is a large number of tools for the simulation of traffic and routes in public transport systems. These use different simulation models (macroscopic, microscopic, and mesoscopic). Unfortunately, these simulation tools are limited when simulating a complete public transport system, which includes all its buses and routes (up to 270 for the London Underground). The […]
Dec, 8

GGNN: Graph-based GPU Nearest Neighbor Search

Approximate nearest neighbor (ANN) search in high dimensions is an integral part of several computer vision systems and gains importance in deep learning with explicit memory representations. Since PQT and FAISS started to leverage the massive parallelism offered by GPUs, GPU-based implementations are a crucial resource for today’s state-of-the-art ANN methods. While most of these […]
Dec, 8

GPU Computing with Python: Performance, Energy Efficiency and Usability

In this work, we examine the performance, energy efficiency and usability when using Python for developing HPC codes running on the GPU. We investigate the portability of performance and energy efficiency between CUDA and OpenCL; between GPU generations; and between low-end, mid-range and high-end GPUs. Our findings show that the impact of using Python is […]

Recent source codes

* * *

* * *

HGPU group © 2010-2020 hgpu.org

All rights belong to the respective authors

Contact us: