
Posts

Dec, 12

GPU Programming in a High Level Language: Compiling X10 to CUDA

GPU architectures have emerged as a viable way of considerably improving performance for appropriate applications. Program fragments (kernels) appropriate for GPU execution can be implemented in CUDA or OpenCL and glued into an application via an API. While there is plenty of evidence of performance improvements using this approach, there are many issues with productivity. […]
Dec, 12

A Common GPU n-Dimensional Array for Python and C

Currently there are multiple incompatible array/matrix/n-dimensional base object implementations for GPUs. This hinders the sharing of GPU code and causes duplicate development work. This paper proposes and presents a first version of a common GPU n-dimensional array (tensor) named GpuNdArray that works with both CUDA and OpenCL. It will be usable from Python, C and possibly […]
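The core of such a common array object is back-end-independent metadata. As a minimal sketch (field names and defaults are assumptions, not the paper's actual GpuNdArray layout), a descriptor carries shape, element size, and byte strides, from which contiguity and buffer size follow:

```python
# Hypothetical sketch of the metadata a shared GPU n-d array object must
# carry independently of the CUDA or OpenCL back end: shape, strides, and
# element size. (Illustrative only; not the actual GpuNdArray structure.)
from dataclasses import dataclass

def c_contiguous_strides(shape, itemsize):
    """Row-major (C-order) strides in bytes for a given shape."""
    strides = []
    acc = itemsize
    for dim in reversed(shape):
        strides.append(acc)
        acc *= dim
    return tuple(reversed(strides))

@dataclass
class NdArrayDescriptor:
    shape: tuple
    itemsize: int
    strides: tuple = None

    def __post_init__(self):
        if self.strides is None:  # default to C-contiguous layout
            self.strides = c_contiguous_strides(self.shape, self.itemsize)

    @property
    def nbytes(self):
        n = self.itemsize
        for d in self.shape:
            n *= d
        return n

desc = NdArrayDescriptor(shape=(4, 3), itemsize=4)  # e.g. float32
# desc.strides == (12, 4), desc.nbytes == 48
```

Because both CUDA and OpenCL kernels can index raw buffers through such shape/stride metadata, the same descriptor can front either back end.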
Dec, 10

Multi-Science Applications with Single Codebase – GAMER – for Massively Parallel Architectures

The growing need for power-efficient extreme-scale high-performance computing (HPC) coupled with plateauing clock speeds is driving the emergence of massively parallel compute architectures. Tens to many hundreds of cores are increasingly made available as compute units, either as an integral part of the main processor or as coprocessors designed for handling massively parallel workloads. In […]
Dec, 6

Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results

GPUs offer several times the floating-point performance and memory bandwidth of current standard two-socket CPU servers, e.g. an NVIDIA C2070 vs. an Intel Xeon Westmere X5650. The lattice Boltzmann method has been established as a flow solver in recent years and was one of the first flow solvers to be successfully ported, and one that performs […]
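The lattice Boltzmann method is typically memory-bandwidth-bound, so a simple bandwidth model predicts attainable performance. The sketch below uses assumed numbers (not figures from the paper): a D3Q19 update streams 19 distribution values in and 19 out per cell, i.e. 2·19·8 = 304 bytes in double precision.

```python
# Back-of-envelope bandwidth-bound estimate with assumed numbers:
# a D3Q19 lattice Boltzmann update reads and writes 19 PDFs per cell,
# so a double-precision update moves 2 * 19 * 8 = 304 bytes.
def lbm_mlups(bandwidth_gbs, q=19, bytes_per_value=8):
    """Million lattice updates per second achievable at a given GB/s."""
    bytes_per_update = 2 * q * bytes_per_value  # read + write every PDF once
    return bandwidth_gbs * 1e9 / bytes_per_update / 1e6

# Illustrative peak-bandwidth figures (assumed): ~144 GB/s for a C2070,
# ~32 GB/s for a Westmere-class socket.
gpu = lbm_mlups(144.0)
cpu = lbm_mlups(32.0)
```

On this model the GPU's bandwidth advantage translates directly into a several-fold advantage in lattice updates per second, which is why performance engineering for LBM centers on memory-access patterns.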
Dec, 5

A dynamic scheduling runtime and tuning system for heterogeneous multi and many-core desktop platforms

A modern personal computer can now be considered a one-node heterogeneous cluster that simultaneously processes several applications’ tasks. It can be composed of asymmetric Processing Units (PUs), such as the multi-core Central Processing Unit (CPU) and the many-core Graphics Processing Units (GPUs), which have become one of the main co-processors that contributed towards high performance […]
Nov, 22

Dynamic Heterogeneous Scheduling Decisions Using Historical Runtime Data

Heterogeneous systems often employ processing units with a wide spectrum of performance capabilities. Allowing individual applications to make greedy local scheduling decisions leads to imbalance, with underutilization of some devices and excessive contention for others. If we instead allow the system to make global scheduling decisions and assign some applications to a slower device, we […]
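The global-scheduling idea can be sketched in a few lines (class and method names here are assumptions for illustration, not the paper's interface): predict each device's runtime for an application from its recorded history, then assign the job to the device with the earliest predicted finish time, counting work already queued there.

```python
# Minimal sketch of history-based global scheduling (names assumed):
# a slower but idle device can win once the fast device is contended.
from collections import defaultdict

class HistoryScheduler:
    def __init__(self, devices):
        self.history = defaultdict(list)       # (app, device) -> past runtimes
        self.load = {d: 0.0 for d in devices}  # seconds of work already queued

    def record(self, app, device, runtime):
        self.history[(app, device)].append(runtime)

    def predict(self, app, device):
        runs = self.history[(app, device)]
        return sum(runs) / len(runs) if runs else 1.0  # default prior

    def assign(self, app):
        # Earliest predicted *finish* time, not fastest device.
        best = min(self.load, key=lambda d: self.load[d] + self.predict(app, d))
        self.load[best] += self.predict(app, best)
        return best

s = HistoryScheduler(["gpu0", "cpu0"])
s.record("fft", "gpu0", 1.0)   # historically 2x faster on the GPU
s.record("fft", "cpu0", 2.0)
picks = [s.assign("fft") for _ in range(3)]
# the third job spills onto the idle CPU despite the GPU being faster
```

This is exactly the trade-off the abstract describes: a greedy per-application policy would queue everything on the fast GPU, while the global view balances finish times across devices.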
Nov, 19

StreamMR: An Optimized MapReduce Framework for AMD GPUs

MapReduce is a programming model from Google that facilitates parallel processing on a cluster of thousands of commodity computers. The success of MapReduce in cluster environments has motivated several studies of implementing MapReduce on a graphics processing unit (GPU), though generally focusing on NVIDIA GPUs. Our investigation reveals that the design and mapping of […]
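For reference, the programming model itself is tiny: a map phase emits (key, value) pairs, a shuffle groups them by key, and a reduce phase folds each group. A pure-Python word-count toy (not StreamMR's GPU implementation) shows the three phases:

```python
# Toy illustration of the MapReduce model: map -> shuffle -> reduce.
from collections import defaultdict

def map_phase(docs):
    for doc in docs:
        for word in doc.split():
            yield word, 1          # emit (key, value) pairs

def shuffle(pairs):
    groups = defaultdict(list)     # group values by key
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(groups):
    return {k: sum(vs) for k, vs in groups.items()}

counts = reduce_phase(shuffle(map_phase(["a b a", "b c"])))
# counts == {"a": 2, "b": 2, "c": 1}
```

The GPU-specific difficulty the paper addresses is precisely the shuffle/output stage: thousands of threads emitting variable-length pairs concurrently, which maps differently onto AMD and NVIDIA memory systems.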
Nov, 18

Auto-tunable GPU BLAS

OpenCL is fast becoming the preferred framework for developing programs for heterogeneous platforms consisting of at least one CPU and one or more accelerators. Since the GPU is readily available in almost all computers, it is the most common accelerator in use. Good libraries are important to reduce development time and to make particular development environments, […]
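Empirical auto-tuning, the technique behind such libraries, reduces to benchmarking candidate parameter values on the target device and keeping the fastest. A minimal sketch (the tuned "kernel" here is a placeholder, not a BLAS routine):

```python
# Sketch of empirical auto-tuning: time each candidate parameter value
# and keep the fastest. The "kernel" is a stand-in, not a BLAS routine.
import time

def blocked_sum(data, block):
    """Stand-in kernel: sum a list in chunks of `block` elements."""
    total = 0
    for i in range(0, len(data), block):
        total += sum(data[i:i + block])
    return total

def autotune(kernel, data, candidates, repeats=3):
    best, best_t = None, float("inf")
    for block in candidates:
        t0 = time.perf_counter()
        for _ in range(repeats):
            kernel(data, block)
        t = time.perf_counter() - t0
        if t < best_t:
            best, best_t = block, t
    return best

data = list(range(100_000))
best_block = autotune(blocked_sum, data, [16, 64, 256, 1024])
```

A real GPU BLAS tuner searches a much larger space (work-group sizes, tile shapes, unrolling) per device, but the search loop has this same shape.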
Nov, 17

Characterization and Transformation of Unstructured Control Flow in GPU Applications

Hardware and compiler techniques for mapping data-parallel programs with divergent control flow to SIMD architectures have recently enabled the emergence of new GPGPU programming models such as CUDA and OpenCL. Although this technology is widely used, commodity GPUs use different schemes to implement it, and the performance limitations of these different schemes under real workloads […]
Nov, 17

Massive Image Editing on the Cloud

Processing massive imagery in a distributed environment currently requires the effort of a skilled team to efficiently handle communication, synchronization, faults, and data/process distribution. Moreover, these implementations are highly optimized for a specific system or cluster, therefore portability or improved performance due to system improvements is rarely considered. Much like early GPU computing, cluster computing […]
Nov, 17

Adaboost GPU-based Classifier for Direct Volume Rendering

In volume visualization, voxel visibility and materials are determined through interactive editing of a transfer function. In this paper, we present a two-level GPU-based labeling method that computes, at rendering time, a set of labeled structures using the Adaboost machine-learning classifier. In a pre-processing step, Adaboost trains a binary classifier from […]
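AdaBoost itself is compact enough to sketch: repeatedly fit a weak learner (here a decision stump) on weighted samples, weight it by its error, and boost the weights of misclassified samples. This pure-Python version illustrates the algorithm in general, not the paper's specific voxel features or GPU mapping:

```python
# Compact AdaBoost with decision stumps; labels y are in {-1, +1}.
import math

def stump_predict(x, feature, threshold, polarity):
    return polarity if x[feature] >= threshold else -polarity

def train_adaboost(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n                       # uniform sample weights
    model = []
    for _ in range(rounds):
        best = None                         # (err, feature, threshold, polarity)
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for p in (1, -1):
                    err = sum(w[i] for i in range(n)
                              if stump_predict(X[i], f, t, p) != y[i])
                    if best is None or err < best[0]:
                        best = (err, f, t, p)
        err, f, t, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # stump weight
        model.append((alpha, f, t, p))
        # boost weights of misclassified samples, then renormalize
        w = [wi * math.exp(-alpha * y[i] * stump_predict(X[i], f, t, p))
             for i, wi in enumerate(w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return model

def predict(model, x):
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1

X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
model = train_adaboost(X, y, rounds=3)
```

The weighted sum of stump votes is what the paper evaluates per voxel on the GPU at rendering time.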
Nov, 16

Simulations of Large Particle Systems in Real Time

Simulation of interacting particle systems has been a well-established method for many years. Such systems can span different scales, from microscopic (where particles represent atoms, as in Molecular Dynamics simulations) to macroscopic. In the latter case, growing interest has been directed toward the Smoothed Particle Hydrodynamics approach. Traditionally, over many years, simulation of […]
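In SPH, field quantities are interpolated from neighboring particles through a smoothing kernel of finite support. A common choice is the poly6 kernel of Müller et al. (an assumption here, not necessarily the one this work uses):

```python
# SPH smoothing-kernel sketch: the poly6 kernel and a density estimate.
# rho_i = sum_j m_j * W(|p_i - p_j|, h), with W of finite support radius h.
import math

def poly6(r, h):
    """W(r, h) = 315/(64*pi*h^9) * (h^2 - r^2)^3 for 0 <= r <= h, else 0."""
    if r > h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h**9) * (h * h - r * r) ** 3

def density_at(p, particles, mass, h):
    return sum(mass * poly6(math.dist(p, q), h) for q in particles)

h = 1.0
rho = density_at((0.0, 0.0), [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)], 1.0, h)
# the particle at distance 2.0 lies outside the support and contributes 0
```

The finite support radius is what makes SPH parallelize well: each particle only interacts with neighbors within h, which maps naturally onto GPU spatial-binning schemes.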

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
