high performance computing on graphics processing units: hgpu.org

Posts

Jul, 14

A Survey of Cloud Lighting and Rendering Techniques

The rendering of participating media still forms a big challenge for computer graphics. This remark is particularly true for real-world clouds with their inhomogeneous density distributions, large range of spatial scales and different forms of appearance. We survey techniques for cloud visualization and classify them relative to the type of volume representation, lighting and rendering […]

Jul, 14

GPU Based Computation of the Structural Tensor for Real-Time Figure Detection

In this paper we present a real-time realization of the method of detection of local structures in images of predefined orientation. The method is based on an analysis of the structural tensor computed in monochrome and color images. Thanks to the GPU implementation of the low-level feature detection an order-of-magnitude speed-up was achieved compared to […]

CUDA

Jul, 13

International Conference on Parallel Computing 2013, ParCo2013

ParCo2013 continues the tradition of the international conferences on parallel computing started in Berlin, Germany in 1983. This makes it one of the longest running international conferences on parallel computing. Over the years the conference established itself as the foremost platform for exchanging know-how on the newest parallel computing strategies, technologies, methods and tools. The […]

Jul, 13

International Conference on Computational Physics, ICCP 2013

The XXXIV International Conference on Computational Physics is the premier forum for the presentation of new advances and research results in the fields of Computational Physics. The conference will bring together leading academic scientists, researchers and scholars in the domain of interest from around the world. Topics of interest for submission include, but are not […]

Jul, 13

GPU Technology Conference 2013, GTC 2013

GTC advances awareness of high performance computing, and connects the scientists, developers, graphic artists, designers, researchers, engineers, and IT managers who use GPUs to tackle enormous computational challenges. GTC 2013 will feature the latest breakthroughs and the most amazing content in GPU-enabled computing. Spanning 4 full days of world-class education delivered by some of the […]

Jul, 13

Linearised inversion with GPUs

Graphical Processing Units (GPUs) can provide considerable computational advantages over multi-core CPU nodes or distributed networks by locally accelerating certain types of floating point operations. However, when processing and inverting exploration scale seismic datasets we encounter two key problems – compounded disk IO (explicit routing through the host is necessary) and the relatively small memory […]

CUDA

Jul, 13

Real-Time Implementation of the Vertex Component Analysis Algorithm on GPUs

In this letter, we present a new parallel implementation of the vertex component analysis (VCA) algorithm for spectral unmixing of remotely sensed hyperspectral data on commodity graphics processing units. We first developed a C serial version of the VCA algorithm and three parallel versions: one using NVIDIA’s Compute Unified Device Architecture (CUDA), another using CUDA […]

CUDA

Jul, 13

Development of Parallel Computation Tools

In this project, boundary value problems of the electric field governed by the Laplace equation were formulated using different numerical methods such as FEM and BEM. The resulting systems of linear equations were then solved using different solving algorithms. The accuracy and complexity of FEM and BEM were compared. The space and time complexity of […]

CUDA

Jul, 13

A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory

Numerical solution models to Maxwell’s equations, which describe electromagnetic wave propagation phenomenon with complete clarity, are of atmost importance in pre-fabrication simulation analyses of the photonic and optoelectronic devices. The Finite Difference Time Domain (FDTD) method, which is based on modeling the differential equations as difference equations in a discretized domain in both space and […]

CUDA

Jul, 13

Fusion of Morphological Images for Airborne Target Detection

Several track-before-detection approaches for image based aircraft detection have recently been examined in an important automated aircraft collision detection application. A particularly popular approach is a two stage processing paradigm which involves: a morphological spatial filter stage (which aims to emphasize the visual characteristics of targets) followed by a temporal or track filter stage (which […]

CUDA

Jul, 12

Towards Parallel Programming Models for Predictability

Future embedded systems for performance-demanding applications will be massively parallel. High performance tasks will be parallel programs, running on several cores, rather than single threads running on single cores. For hard real-time applications, WCETs for such tasks must be bounded. Low-level parallel programming models, based on concurrent threads, are notoriously hard to use due to […]

CUDA

•

OpenCL

Jul, 12

CUDA implementation of the algorithm for simulating the epidemic spreading over large networks

For some years now, there has been an increasing interest in modeling and analyzing the spread of epidemics in both human and computer networks. The obvious advantage a computer simulation of the epidemic spread offers is that the answer is delivered in short time and the number of hosts included in simulation can approach their […]

CUDA

* * *

high performance computing on graphics processing units: hgpu.org

Posts

A Survey of Cloud Lighting and Rendering Techniques

GPU Based Computation of the Structural Tensor for Real-Time Figure Detection

International Conference on Parallel Computing 2013, ParCo2013

International Conference on Computational Physics, ICCP 2013

GPU Technology Conference 2013, GTC 2013

Linearised inversion with GPUs

Real-Time Implementation of the Vertex Component Analysis Algorithm on GPUs

Development of Parallel Computation Tools

A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory

Fusion of Morphological Images for Airborne Target Detection

Towards Parallel Programming Models for Predictability

CUDA implementation of the algorithm for simulating the epidemic spreading over large networks

Recent source codes

Kernel Library for LLM Serving

Adaptivity in AdaptiveCpp: Optimizing Performance by Leveraging Runtime Information During JIT-Compilation

Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs

Genten: Software for Generalized Tensor Decompositions by Sandia National Laboratories

Interleaved Learning and Exploration: A Self-Adaptive Fuzz Testing Framework for MLIR

Pinocchio: PINpointing Orbit Crossing Collapsed Hierarchical Objects

KernelCoder: trained on a curated dataset of reasoning traces and CUDA kernel pairs

VibeCodeHPC - Multi Agentic Vibe Coding for HPC

Compile-Time Resource Safety for GPU APIs: A Low-Overhead Typestate Framework

exa-AMD: Exascale Accelerated Materials Discovery

Most viewed papers (last 30 days)