Posts
Sep, 9
Phase Transition in 3d Heisenberg Spin Glasses with Strong Random Anisotropies, through a Multi-GPU Parallelization
We characterize the phase diagram of anisotropic Heisenberg spin glasses, finding both the spin and the chiral glass transition. We note the presence of strong finite-size effects in the chiral sector. We find a unique phase transition for the chiral and spin glass sectors, in the universality class of Ising spin glasses. We focus on […]
Sep, 9
Implementation of PDE models of cardiac dynamics on GPUs using OpenCL
Graphical processing units (GPUs) promise to revolutionize scientific computing in the near future. Already, they allow almost real-time integration of simplified numerical models of cardiac tissue dynamics. However, the integration methods that have been developed so far are typically of low order and use single-precision arithmetic. In this work, we describe numerical implementation of […]
Sep, 9
A GPU Implementation for Two-Dimensional Shallow Water Modeling
In this paper, we present a GPU implementation of a two-dimensional shallow water model. Water simulations are useful for modeling floods, river/reservoir behavior, and dam break scenarios. Our GPU implementation shows vast performance improvements over the original Fortran implementation. By taking advantage of the GPU, researchers and engineers will be able to study water systems […]
Sep, 9
GPU Accelerated Particle Visualization with Splotch
Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are the production of high-quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in […]
Sep, 9
Fast Detection of Overlapping Communities via Online Tensor Methods on GPUs
We present a scalable tensor-based approach for detecting hidden overlapping communities under the mixed membership stochastic block model. We employ stochastic gradient descent for performing tensor decompositions, which provides flexibility to trade off node sub-sampling against accuracy. Our GPU implementation of the tensor-based approach is extremely fast and scalable, and involves a careful optimization of GPU-CPU […]
Sep, 9
Acceleration of iterative Navier-Stokes solvers on graphics processing units
While new power-efficient computer architectures exhibit spectacular theoretical peak performance, they require specific conditions to operate efficiently, which makes porting complex algorithms a challenge. Here, we report results of the semi-implicit method for pressure linked equations (SIMPLE) and the pressure implicit with operator splitting (PISO) methods implemented on the graphics processing unit (GPU). We examine […]
Sep, 7
A Bi-objective Optimization Framework for Query Plans
Graphics Processing Units (GPUs) have significantly more applications than just rendering images. They are also used in general-purpose computing to solve problems that can benefit from massively parallel processing. However, there are tasks that are either poorly suited to GPUs or suit them only partially. The latter class is the focus of this paper. We elaborate on […]
Sep, 7
GPU-based simulation of brain neuron models
The human brain is an incredible system which can process, store, and transfer information with high speed and volume. Inspired by such a system, engineers and scientists are cooperating to construct a digital brain with these characteristics. The brain is composed of billions of neurons which can be modeled by mathematical equations. The first step to […]
Sep, 7
Comparison and Analysis of GPU Energy Efficiency For CUDA and OpenCL
The use of GPUs for processing large sets of parallelizable data has increased sharply in recent years. As the concept of GPU computing is still relatively young, parameters other than computation time, such as energy efficiency, are being overlooked. Two parallel computing platforms, CUDA and OpenCL, provide developers with an interface that they can use […]
Sep, 7
D5.5.3 – Design and implementation of the SIMD-MIMD GPU architecture
To develop a new SIMD-MIMD architecture we first characterized GPGPU workloads using simple and well-known workload metrics to identify the performance bottlenecks. We found that the benchmarks with branch divergence do not utilize the SIMD width optimally on conventional GPUs. We also studied the performance bottlenecks of the motion compensation kernel developed in Task 3.2 […]
Sep, 7
Combining recent HPC techniques for 3D geophysics acceleration
The Reverse Time Migration technique produces underground images using wave propagation. A discretization based on the Discontinuous Galerkin Method unleashes a massively parallel elastodynamics simulation, an interesting feature for current and future architectures. In this work, we propose to combine two recent HPC techniques to achieve a high level of efficiency: the use of runtimes (StarPU […]
Sep, 6
D5.5.2 – Architectural Techniques to exploit SLACK & ACCURACY trade-offs
In this work we (a) explore memory slack for state-of-the-art many-core CPUs and GPUs, (b) present techniques to eliminate slack, and (c) explore the architectural parameters to improve power efficiency. Dynamic Voltage-Frequency Scaling (DVFS) is one of the most beneficial techniques for CPUs to improve power efficiency. The end of Dennard scaling, however, […]