## Posts

Nov, 11

### Simulation of reaction-diffusion processes in three dimensions using CUDA

Numerical solution of reaction-diffusion equations in three dimensions is one of the most challenging applied mathematical problems. Since these simulations are very time-consuming, any ideas and strategies aimed at reducing CPU time are important topics of research. A general and robust idea is the parallelization of source codes/programs. Recently, the technological development […]

Nov, 11

### High-performance astrophysical visualization using Splotch

The scientific community is presently witnessing an unprecedented growth in the quality and quantity of data sets coming from simulations and real-world experiments. To effectively access and extract the scientific content of such large-scale data sets (sizes are often measured in hundreds or even millions of gigabytes), appropriate tools are needed. Visual data exploration and […]

Nov, 11

### Magnetohydrodynamics on Heterogeneous architectures: a performance comparison

We present magneto-hydrodynamic simulation results for heterogeneous systems. Heterogeneous architectures feature many-core units with high floating-point performance, hosted in conventional server nodes. Examples include Graphics Processing Units (GPUs) and Cell. They offer potentially large gains in performance at modest power and monetary cost. We implemented a magneto-hydrodynamic (MHD) simulation code on a variety of heterogeneous […]

Nov, 11

### CUDA simulations of active dumbbell suspensions

We describe and analyze CUDA simulations of hydrodynamic interactions in active dumbbell suspensions. GPU-based parallel computing enables us not only to study the time-resolved collective dynamics of up to several hundred active dumbbell swimmers but also to test the accuracy of effective time-averaged models. Our numerical results suggest that the stroke-averaged model yields a […]

Nov, 11

### Reionization simulations powered by GPUs I: the structure of the Ultraviolet radiation field

We present a set of cosmological simulations with radiative transfer in order to model the reionization history of the Universe. Galaxy formation and the associated star formation are followed self-consistently with gas and dark matter dynamics using the RAMSES code, while radiative transfer is performed as a post-processing step using a moment-based method with M1 […]

Nov, 11

### Exact Sparse Matrix-Vector Multiplication on GPU’s and Multicore Architectures

We propose different implementations of the sparse matrix–dense vector multiplication (spmv) for finite fields and rings $\mathbb{Z}/m\mathbb{Z}$. We take advantage of graphics card processors (GPUs) and multi-core architectures. Our aim is to improve the speed of spmv in the LinBox library, and hence the speed of its black box algorithms. Besides, we use this and […]

Nov, 11

### Ultra-fast treatment plan optimization for volumetric modulated arc therapy (VMAT)

Purpose: To develop a novel aperture-based algorithm for volumetric modulated arc therapy (VMAT) treatment plan optimization with high quality and high efficiency. Methods: The VMAT optimization problem is formulated as a large-scale convex programming problem solved by a column generation approach. We consider a cost function consisting of two terms, the first of which enforces a desired […]

Nov, 11

### MYRIAD: A new N-body code for simulations of Star Clusters

We present a new C++ code for collisional N-body simulations of star clusters. The code uses the Hermite fourth-order scheme with block time steps for advancing the particles in time, while the forces and neighboring particles are computed using the GRAPE-6 board. Special treatment is used for close encounters and for binary and multiple sub-systems that either […]

Nov, 11

### A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters

Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, […]

Nov, 11

### Toward large-scale Hybrid Monte Carlo simulations of the Hubbard model on graphics processing units

The performance of the Hybrid Monte Carlo algorithm is determined by the speed of sparse matrix-vector multiplication within the context of preconditioned conjugate gradient iteration. We study these operations as implemented for the fermion matrix of the Hubbard model in d+1 space-time dimensions, and report a performance comparison between a 2.66 GHz Intel Xeon E5430 […]

Nov, 10

### QYMSYM: A GPU-Accelerated Hybrid Symplectic Integrator That Permits Close Encounters

We describe a parallel hybrid symplectic integrator for planetary system integration that runs on a graphics processing unit (GPU). The integrator identifies close approaches between particles and switches from symplectic to Hermite algorithms for particles that require higher-resolution integrations. The integrator is approximately as accurate as other hybrid symplectic integrators but is GPU-accelerated.

Nov, 10

### Interactive Visualization of the Largest Radioastronomy Cubes

3D visualization is an important data analysis and knowledge discovery tool; however, interactive visualization of large 3D astronomical datasets poses a challenge for many existing data visualization packages. We present a solution to interactively visualize larger-than-memory 3D astronomical data cubes by utilizing a heterogeneous cluster of CPUs and GPUs. The system partitions the data volume […]