Posts

Jan, 14

Odyssey: A Public GPU-Based Code for General-Relativistic Radiative Transfer in Kerr Spacetime

General-relativistic radiative transfer (GRRT) calculations coupled with the calculation of geodesics in the Kerr spacetime are an essential tool for determining the images, spectra and light curves from matter in the vicinity of black holes. Such studies are especially important for ongoing and upcoming millimeter/submillimeter (mm/sub-mm) Very Long Baseline Interferometry (VLBI) observations of the supermassive […]
Jan, 14

Aging in the three-dimensional Random Field Ising Model

We studied the nonequilibrium aging behavior of the Random Field Ising Model in three dimensions for various values of the disorder strength. This allowed us to investigate how the aging behavior changes across the ferromagnetic-paramagnetic phase transition. We investigated a large system size of $N=256^3$ spins and up to $10^8$ Monte Carlo sweeps. To reach […]
Jan, 14

A Case for Work-stealing on FPGAs with OpenCL Atomics

We provide a case study of work-stealing, a popular method for run-time load balancing, on FPGAs. Following the Cederman-Tsigas implementation for GPUs, we synchronize work-items not with locks, mutexes or critical sections, but with the atomic operations provided by Altera’s OpenCL SDK. We evaluate work-stealing for FPGAs by synthesizing a K-means clustering algorithm on […]
Jan, 14

Classification of Higgs Boson Tau-Tau decays using GPU accelerated Neural Networks

In particle physics, Higgs Boson to tau-tau decay signals are notoriously difficult to identify due to the presence of severe background noise generated by other decaying particles. Our approach uses neural networks to classify events as signals or background noise.
Jan, 14

A Survey Of Techniques for Approximate Computing

Approximate computing trades off computation quality against the effort expended. As rising performance demands confront plateauing resource budgets, approximate computing has become not merely attractive but imperative. In this paper, we present a survey of techniques for approximate computing (AC). We discuss strategies for finding approximable program portions and monitoring output quality, […]
Jan, 12

Real-Time Dedispersion for Fast Radio Transient Surveys, using Auto Tuning on Many-Core Accelerators

Dedispersion, the removal of deleterious smearing of impulsive signals by the interstellar matter, is one of the most intensive processing steps in any radio survey for pulsars and fast transients. Here we present a study of the parallelization of this algorithm on many-core accelerators, including GPUs from AMD and NVIDIA, and the Intel Xeon Phi. […]
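
As a rough illustration of what dedispersion computes, the sketch below applies the standard cold-plasma dispersion delay to each frequency channel and sums the shifted channels. It is a plain brute-force, incoherent version written for readability, not the auto-tuned many-core kernels the paper studies. The function names, the `data[channel][sample]` layout and the choice of the highest frequency as reference are assumptions made for this example.

```python
# Dispersion constant in s * MHz^2 * cm^3 / pc (standard value used in pulsar astronomy).
K_DM = 4.148808e3


def delay_seconds(dm, freq_mhz, ref_mhz):
    """Arrival delay at freq_mhz relative to ref_mhz for dispersion measure dm (pc cm^-3)."""
    return K_DM * dm * (freq_mhz ** -2 - ref_mhz ** -2)


def dedisperse(data, freqs_mhz, dm, tsamp):
    """Brute-force incoherent dedispersion (illustrative sketch).

    data[c][t] is the intensity of channel c at sample t, freqs_mhz[c] is the
    channel frequency in MHz, and tsamp is the sampling time in seconds.
    Returns the channel-summed time series after removing the delays.
    """
    ref = max(freqs_mhz)  # assumption: align everything to the highest frequency
    shifts = [round(delay_seconds(dm, f, ref) / tsamp) for f in freqs_mhz]
    n_out = len(data[0]) - max(shifts)
    return [
        sum(chan[t + s] for chan, s in zip(data, shifts))
        for t in range(n_out)
    ]
```

Each output sample depends only on its own shifted inputs, so the outer loop over `t` (and over trial dispersion measures, in a real survey) is the natural axis to parallelize on an accelerator.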
Jan, 12

Study of low density nuclear matter with quantum molecular dynamics: the role of the symmetry energy

We study the effect of isospin-dependent nuclear forces on the pasta phase in the inner crust of neutron stars. To this end we model the crust within the framework of quantum molecular dynamics (QMD). To maximize numerical performance, the newly developed code has been implemented on GPU processors. As a first application of the […]
Jan, 12

GPU Remote Memory Access Programming

High performance computing studies the construction and programming of computing systems with tremendous computational power, which play a key role in scientific computing and research across disciplines. The graphics processing unit (GPU), developed for fast 2D and 3D visualizations, has turned into a programmable general purpose accelerator device boosting today’s high performance clusters. Leveraging these computational […]
Jan, 12

A Workload Balanced MapReduce Framework on GPU Platforms

The MapReduce framework is a programming model proposed by Google to process large datasets. It is an efficient framework that can be used in many areas, such as social networks, scientific research and electronic business. Hence, more and more MapReduce frameworks are implemented on different platforms, including Phoenix (based on multicore CPU), MapCG (based on […]
Jan, 7

GPU-Based Fuzzy C-Means Clustering Algorithm for Image Segmentation

In this paper, a fast and practical GPU-based implementation of the Fuzzy C-Means (FCM) clustering algorithm for image segmentation is proposed. First, an extensive analysis is conducted to study the dependency among the image pixels in the algorithm for parallelization. The proposed GPU-based FCM has been tested on a simulated digital brain dataset to segment white matter (WM), […]
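
For readers unfamiliar with FCM, the following is a minimal CPU sketch of the textbook algorithm on a flat list of pixel intensities. It is not the GPU implementation the paper proposes. The function name `fcm_1d`, the evenly spaced initial centres and the fixed iteration count are assumptions chosen to keep the example short.

```python
def fcm_1d(pixels, n_clusters=2, m=2.0, n_iter=50):
    """Textbook Fuzzy C-Means on scalar intensities (illustrative sketch).

    Returns (centers, u), where u[j][i] is the membership of pixel j in
    cluster i, computed from the centres of the final iteration before
    their last update.
    """
    lo, hi = min(pixels), max(pixels)
    # Assumption: initial centres spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    expo = 2.0 / (m - 1.0)
    u = []
    for _ in range(n_iter):
        u = []
        for x in pixels:
            d = [abs(x - c) for c in centers]
            if 0.0 in d:
                # Pixel coincides with one or more centres: split membership among them.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
                row = [1.0 / sum((di / dk) ** expo for dk in d) for di in d]
            u.append(row)
        # Centre update: mean of the pixels weighted by membership^m.
        centers = [
            sum(u[j][i] ** m * pixels[j] for j in range(len(pixels)))
            / sum(u[j][i] ** m for j in range(len(pixels)))
            for i in range(n_clusters)
        ]
    return centers, u
```

The membership row for each pixel depends only on that pixel and the current centres, which is why the per-pixel loop maps naturally onto GPU threads; the centre update is a weighted reduction over all pixels.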
Jan, 7

Computationally Efficient Tsunami Modelling on Graphics Processing Units (GPU)

Tsunamis generated by earthquakes commonly propagate as long waves in the deep ocean and develop into sharp-fronted surges moving rapidly towards the coast in shallow water, which may be effectively simulated by hydrodynamic models solving the nonlinear shallow water equations (SWEs). However, most of the existing tsunami models suffer from long simulation times for large-scale […]
Jan, 7

Verifying CUDA Programs using SMT-Based Context-Bounded Model Checking

We present ESBMC-GPU, an extension to the ESBMC model checker that is aimed at verifying GPU programs written for the CUDA framework. ESBMC-GPU uses an operational model for the verification, i.e., an abstract representation of the standard CUDA libraries that conservatively approximates their semantics. ESBMC-GPU verifies CUDA programs by explicitly exploring the possible interleavings (up […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
