24994

Posts

May, 16

NPBench: A Benchmarking Suite for High-Performance NumPy

Python, already one of the most popular languages for scientific computing, has made significant inroads in High Performance Computing (HPC). At the center of Python’s ecosystem is NumPy, an efficient implementation of the multi-dimensional array (tensor) structure, together with basic arithmetic and linear algebra. Compared to traditional HPC languages, the relatively low performance of Python […]
May, 16

Performance Assessment of using OpenCL on FPGA Systems for ODE Solvers

Parameter optimization is a common task in various fields such as computational biology. In these scientific fields, optimization can be, e.g. based on ordinary differential equations with the computational task getting increasingly computation-intensive for increasing complexity of ODE and the parameters to determine. Hence, this raises requirements for an efficient treatment on high-performance computing architectures. […]
May, 16

Winograd Algorithm for AdderNet

Adder neural network (AdderNet) is a new kind of deep model that replaces the original massive multiplications in convolutions by additions while preserving the high performance. Since the hardware complexity of additions is much lower than that of multiplications, the overall energy consumption is thus reduced significantly. To further optimize the hardware overhead of using […]
May, 16

Raster Time Series: Learning and Processing

As the amount of remote sensing data is increasing at a high rate, due to great improvements in sensor technology, efficient processing capabilities are of utmost importance. Remote sensing data from satellites is crucial in many scientific domains, like biodiversity and climate research. Because weather and climate are of particular interest for almost all living […]
May, 16

PeriPy – A High Performance OpenCL Peridynamics Package

This paper presents a lightweight, open-source and high-performance python package for solving peridynamics problems in solid mechanics. The development of this solver is motivated by the need for fast analysis tools to achieve the large number of simulations required for `outer-loop’ applications, including sensitivity analysis, uncertainty quantification and optimisation. Our python software toolbox utilises the […]
May, 9

Performance Evaluation and Improvements of the PoCL Open-Source OpenCL Implementation on Intel CPUs

The Portable Computing Language (PoCL) is a vendor independent open-source OpenCL implementation that aims to support a variety of compute devices in a single platform. Evaluating PoCL versus the Intel OpenCL implementation reveals significant performance drawbacks of PoCL on Intel CPUs – which run 92 % of the TOP500 list. Using a selection of benchmarks, […]
May, 9

Sylkan: Towards a Vulkan Compute Target Platform for SYCL

SYCL is a modern high-level C++ programming interface which excels at expressing data parallelism for heterogeneous hardware platforms in a programmer-friendly way, and is standardized by the Khronos Group. The latest version of the standard, SYCL 2020, removes the previous dependence of the specification and its implementations on an underlying OpenCL target, opening the door […]
May, 9

A fluid simulation system based on the MPS method

Fluid flow simulation is a highly active area with applications in a wide range of engineering problems and interactive systems. Meshless methods like the Moving Particle Semi-implicit (MPS) are a great alternative to deal efficiently with large deformations and free-surface flow. However, mesh-based approaches can achieve higher numerical precision than particle-based techniques with a performance […]
May, 9

Irregularity Mitigation and Portability Abstractions for Accelerated Sparse Matrix Factorization

In this thesis, we investigate new ways to mitigate the inherent irregularity in sparse matrix factorizations and decompose the resulting computation into simple kernels which are portable across a diverse set of compute accelerator architectures through our novel compiler borG. Be it weather prediction, climate models, personalized medicine, genetic analysis and autonomous driving: some of […]
May, 9

Efficacy of Images Versus Data Buffers: Optimizing Interactive Applications Utilizing OpenCL for Scientific Visualization

This paper examines an algorithm using dual OpenCL image buffers to optimize data streaming for ensemble processing and visualization. Image buffers were utilized because they allow cached memory access, unlike simple data buffers, which are more commonly used. OpenCL image object performance was improved by allowing upload and mapping into one buffer to occur concurrently […]
May, 2

DeepfakeUCL: Deepfake Detection via Unsupervised Contrastive Learning

Face deepfake detection has seen impressive results recently. Nearly all existing deep learning techniques for face deepfake detection are fully supervised and require labels during training. In this paper, we design a novel deepfake detection method via unsupervised contrastive learning. We first generate two different transformed versions of an image and feed them into two […]
May, 2

Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators

DNN training consumes orders of magnitude more energy than inference and requires innovative use of accelerators to improve energy-efficiency. However, despite having complementary features, GPUs and FPGAs have been mostly used independently for the entire training process, thus neglecting the opportunity in assigning individual but distinct operations to the most suitable hardware. In this paper, […]

* * *

* * *

HGPU group © 2010-2021 hgpu.org

All rights belong to the respective authors

Contact us: