Posts
Aug, 3
AQUAgpusph, a free 3D SPH solver accelerated with OpenCL
In this paper, AQUAgpusph, a new free SPH software package licensed under GPLv3 and accelerated using OpenCL, will be described. Its main differences with respect to other GPU-based SPH implementations will be discussed, focusing first on the fact that it is accelerated with OpenCL, second on the wide range of solid boundary condition enforcing methods have […]
Aug, 3
Strategies for Optimization of Parallel Programs
Multi-core processors are present in most forms of computing, from pocket-size smartphones to supercomputers. Consequently, parallel and concurrent programming has reemerged as a pressing concern for everyone interested in exploring all the potential computational power in these machines. Writing parallel, and especially concurrent, programs is not a trivial task as it requires a different […]
Aug, 2
Real-Time Electroholography Using a Multi-GPU Environmental PC
We report real-time electroholography using a compact system composed of a multi-GPU environmental PC with four Kepler-architecture GPUs. Our system can calculate a 1,920×1,024-pixel computer-generated hologram (CGH) from a 3D object composed of 10,240 points in 40.3 ms.
Aug, 2
DRiVE: An Example of Distributed Rendering in Virtual Environments
Most Virtual Reality (VR) applications use rendering methods which implement local illumination models, simulating only direct interaction of light with 3D objects. They do not take into account the energy exchange between the objects themselves, making the resulting images look non-optimal. The main reason for this is the simulation of global illumination having a high […]
Aug, 2
Large-Scale Sound Field Rendering in Rectangular Room with Specular Reflection
Sound field rendering is a technique for computing the sound field from three-dimensional numerical models constructed in a computer; it is the same concept as graphics rendering in computer graphics. In this paper, a GPU (Graphics Processing Unit) cluster system is applied to sound field rendering for a large […]
Aug, 2
Comparing CUDA, OpenCL and OpenGL Implementations of the Cardiac Monodomain Equations
Computer simulations of cardiac electrophysiology are a helpful tool in the study of bioelectric activity of the heart. The cardiac monodomain model comprises a nonlinear system of partial differential equations and its numerical solution represents a very intensive computational task due to the required fine spatial and temporal resolution. Recent studies have shown that the […]
Aug, 1
NOVA: A Functional Language for Data Parallelism
Functional languages provide a solid foundation on which complex optimization passes can be designed to exploit available parallelism in the underlying system. Their mathematical foundations enable high-level optimizations that would be impossible in traditional imperative languages. This makes them uniquely suited for generation of efficient target code for parallel systems, such as multiple Central Processing […]
Aug, 1
Accelerating BIRCH for Clustering Large Scale Streaming Data Using CUDA Dynamic Parallelism
In this big data era, the capability of mining and analyzing large-scale datasets is imperative. As data are becoming more abundant than ever before, data-driven methods are playing a critical role in areas such as decision support and business intelligence. In this paper, we demonstrate how state-of-the-art GPUs and the Dynamic Parallelism feature […]
Aug, 1
A note on the GPU acceleration of eigenvalue computations
Eigenvalue computations for large sparse matrices, such as the Lanczos method, are commonly based on Krylov subspace techniques. One of the dominant operations in such algorithms is the iterated computation of inner products with the same vector in order to preserve orthogonality of the Krylov basis. These operations can be accelerated by existing BLAS functionality using […]
Aug, 1
Matrix Convolution using Parallel Programming
The convolution theorem is used to multiply matrices of two different sizes, i.e., matrices in which the number of rows in the first matrix is not equal to the number of columns in the second matrix. In this study, the multiplication of 3×3 and 4×4 matrices was done using MPI. A 3×3 matrix was taken […]
Aug, 1
GPU peer-to-peer techniques applied to a cluster interconnect
Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, mainly by avoiding staging to host memory, they require specific hardware features which are not available on current-generation network adapters. In this paper we describe the architectural modifications […]
Jul, 31
Iterative CT Reconstruction on the GPU
The computing power of modern GPUs makes them very suitable for Computed Tomography (CT) image reconstruction. Apart from accelerating the reconstruction, their extra computing performance compared to conventional CPUs can be used to increase image quality in several ways. In this paper we present our upgraded GPU-based iterative reconstruction algorithm, including ML-TR (Maximum Likelihood […]