high performance computing on graphics processing units: hgpu.org

Posts

Feb, 15

Graph Coarsening and Clustering on the GPU

Agglomerative clustering is an effective greedy way to generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained shared-memory parallel graph coarsening algorithm and use this to implement a […]

CUDA

Feb, 14

Robust Real-Time Multiprocessor Interrupt Handling Motivated by GPUs

Architectures in which multicore chips are augmented with graphics processing units (GPUs) have great potential in many domains in which computationally intensive real-time workloads must be supported. However, unlike standard CPUs, GPUs are treated as I/O devices and require the use of interrupts to facilitate communication with CPUs. Given their disruptive nature, interrupts must be […]

CUDA

Feb, 14

Spatial interpolation of scattered geoscientific data

Most data for environmental variables (e.g., meteorological variables, soil properties, etc.) are collected from point sources. For modeling and visualization purposes, the data often needs to be available on a regular grid, which requires spatial interpolation of the scattered point measurements. A variety of interpolation methods for these purposes is available; examples are […]

OpenCL

Feb, 14

Cross Teaching Parallelism and Ray Tracing: A Project-based Approach to Teaching Applied Parallel Computing

Massively parallel Graphics Processing Unit (GPU) hardware has become increasingly powerful, available and affordable. Software tools have also advanced to the point that programmers can write general purpose parallel programs that take advantage of the large number of compute cores available in the hardware. With literally hundreds of compute cores available on a single device, […]

CUDA

Feb, 14

Acceleration of information-theoretic data analysis with graphics processing units

Information-theoretic measures are frequently employed to assess the degree of feature interactions when mining attribute-value data sets. For large data sets, obtaining these measures quickly poses an unmanageable computational burden. In this work we examine the applicability of consumer graphics processing units supporting the CUDA architecture to speed up the computation of information-theoretic measures. Our implementation was […]

CUDA

Feb, 14

Accelerated People Tracking Using Texture in a Camera Network

We present an approach to tracking multiple human subjects within a camera network. A particle filter framework is used in which we combine foreground-background subtraction with a novel approach to texture learning and likelihood computation based on an ellipsoid model. As there are inevitable problems with multiple subjects due to occlusion and crossing, we include […]

CUDA

Feb, 13

A Scalable GPU-based Approach to Accelerate the Multiple-Choice Knapsack Problem

Variants of the 0-1 knapsack problem manifest themselves at the core of several system-level optimization problems. The running times of such system-level optimization techniques are adversely affected because the knapsack problem is NP-hard. In this paper, we propose a new GPU-based approach to accelerate the multiple-choice knapsack problem, which is a general version of the […]

CUDA

Feb, 13

Using Graphical Processing Units in Scheduling Problems

Scheduling problems exist everywhere in the so-called "real world": in manufacturing, in transportation, and in logistics as well. The main objective of these problems is to find an optimal sequence of tasks that fulfils predefined objectives. There are efficient methods to solve complex scheduling problems in science and industry, which methods […]

CUDA

Feb, 13

Work Stealing Inside GPUs

Graphics Processing Units have become a valuable support for High Performance Computing (HPC) applications. However, despite the many improvements on the General Purpose GPU, there is still a need for a generic programming model adaptable to the many forms of parallelism that an application can express. The CUDA programming model is widely used on the […]

CUDA

Feb, 13

LAMMPScuda – a new GPU accelerated Molecular Dynamics Simulations Package and its Application to Ion-Conducting Glasses

Today, computer simulations form an integral part of many research and development efforts. The scope of what can be modeled has increased dramatically, as computing performance improved over the last two decades. But with serial-execution performance of CPUs leveling off, future performance increases for computational physics, material design, and biology must come from higher parallelization. […]

CUDA

Feb, 13

Analytic Anti-Aliasing of Linear Functions on Polytopes

This paper presents an analytic formulation for anti-aliased sampling of 2D polygons and 3D polyhedra. Our framework allows the exact evaluation of the convolution integral with a linear function defined on the polytopes. The filter is a spherically symmetric polynomial of any order, supporting approximations to refined variants such as the Mitchell-Netravali filter family. This […]

CUDA

Feb, 12

Recursive MIS Computation for Streaming BDPT on the GPU

Bidirectional Path Tracing (BDPT) is a robust unbiased rendering algorithm that samples paths by connecting eye and light paths. By optimally combining different sampling strategies using Multiple Importance Sampling (MIS), BDPT efficiently renders scenes with complex light effects. However, BDPT does not map well onto a streaming architecture such as the GPU; stochastic path lengths […]

CUDA

* * *

Recent source codes

XaaS containers

microSYCL: SYCL micro-benchmarks repository

SYCL Container

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Ouroboros: Virtualized Queues for dynamic memory management
