Posts
Jun, 18
An Out-of-core GPU Approach for Accelerating Geostatistical Interpolation
Geostatistical methods provide a powerful tool for understanding the complexity of data arising from the Earth sciences. Since the mid-1970s, this numerical approach has been widely used to characterize the spatial variation of natural phenomena in domains such as the oil and gas, mining, and environmental industries. Considering the huge amount of data available, standard implementations of […]
Jun, 18
A Case Against Small Data Types on GPGPUs
In this paper, we study application behavior on GPGPUs. We investigate how data type impacts performance in different applications. As expected, we show that some applications can take significant advantage of small data types. Such applications benefit from small data types as a result of increased effective cache capacity and reduced memory pressure, access latency, and memory […]
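As a concrete illustration of that effect, below is a minimal CUDA sketch (not code from the paper; kernel and variable names are hypothetical) of the same streaming kernel instantiated for 32-bit and 16-bit element types. The 16-bit version moves half as many bytes per element, so more elements fit in cache and memory pressure drops.

    // Minimal sketch, not from the paper: one streaming kernel, two element widths.
    // Requires compute capability 5.3+ for __half arithmetic in device code.
    #include <cuda_fp16.h>

    template <typename T>
    __global__ void scale(T* data, T factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] = data[i] * factor;   // same work, different memory footprint
    }

    // Hypothetical launches for comparison:
    //   scale<float><<<blocks, 256>>>(d_f32, 2.0f, n);                  // 4 bytes/element
    //   scale<__half><<<blocks, 256>>>(d_f16, __float2half(2.0f), n);   // 2 bytes/element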
Jun, 18
Computing on Knights and Kepler Architectures
A recent trend in scientific computing is the increasingly important role of co-processors, originally built to accelerate graphics rendering and now used for general high-performance computing. The INFN Computing On Knights and Kepler Architectures (COKA) project focuses on assessing the suitability of co-processor boards for scientific computing in a wide range of physics applications, and […]
Jun, 18
Expansion Techniques for Collisionless Stellar Dynamical Simulations
We present GPU implementations of two fast force calculation methods, based on series expansions of the Poisson equation. One is the Self-Consistent Field (SCF) method, which is a Fourier-like expansion of the density field in some basis set; the other is the Multipole Expansion (MEX) method, which is a Taylor-like expansion of the Green’s function. […]
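For readers unfamiliar with the two schemes, a schematic form of the SCF expansion (standard notation, assumed here rather than quoted from the paper) is:

    \[
    \rho(\mathbf{r}) = \sum_{n,l,m} A_{nlm}\, \rho_{nlm}(\mathbf{r}),
    \qquad
    \Phi(\mathbf{r}) = \sum_{n,l,m} A_{nlm}\, \Phi_{nlm}(\mathbf{r}),
    \qquad
    \nabla^{2} \Phi_{nlm} = 4\pi G\, \rho_{nlm},
    \]

where the coefficients A_{nlm} follow from the biorthogonality of the density-potential basis pairs; MEX, by contrast, expands the Green’s function of the Poisson equation in multipoles about a chosen centre.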
Jun, 17
On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures
With the advent of many-core computer architectures such as GPGPUs from NVIDIA and AMD, and more recently Intel’s Xeon Phi, ensuring performance portability of HPC codes is potentially becoming more complex. In this work we have focused on one important application area — structured grid codes — and investigated techniques for ensuring performance portability across […]
Jun, 17
A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures
The architecture of high performance computing systems is becoming more and more heterogeneous, as accelerators play an increasingly important role alongside traditional CPUs. Programming heterogeneous systems efficiently is a complex task that often requires the use of specific programming environments. Programming frameworks supporting codes portable across different high performance architectures have recently appeared, but one […]
Jun, 17
An Improved Monte Carlo Ray Tracing for Large-Scale Rendering in Hadoop
Improving the performance of large-scale rendering requires not only a well-designed data structure, but also reduced disk and network access, especially when realistic visual effects are the goal. This paper presents an optimization method for global illumination rendering of large datasets. We improved the previous rendering algorithm based on Monte Carlo ray […]
Jun, 17
A CUDA based Solution to the Multidimensional Knapsack Problem Using the Ant Colony Optimization
The Multidimensional Knapsack Problem (MKP) is a generalization of the basic Knapsack Problem, with two or more constraints. It is an important optimization problem with many real-life applications. To solve this NP-hard problem, we use a metaheuristic algorithm based on ant colony optimization (ACO). Since several steps of the algorithm can be carried out concurrently, […]
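For reference, the MKP can be written as the 0-1 integer program below (textbook notation, not taken from the paper):

    \[
    \max \sum_{j=1}^{n} p_j x_j
    \quad \text{subject to} \quad
    \sum_{j=1}^{n} w_{ij} x_j \le c_i \;\; (i = 1, \dots, m),
    \qquad x_j \in \{0, 1\},
    \]

where item j yields profit p_j and consumes w_{ij} units of resource i with capacity c_i; setting m = 1 recovers the basic Knapsack Problem.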
Jun, 17
HAM – Heterogenous Active Messages for Efficient Offloading on the Intel Xeon Phi
The applicability of accelerators is limited by the attainable speed-up for the offloaded computations and by the offloading overheads. While GPU programming models like CUDA and OpenCL only allow the application code and its speed-up to be optimised, the available low-level APIs for the Intel Xeon Phi also provide an opportunity to address the overheads. This work […]
Jun, 17
GPU Implementation of Bayesian Neural Network Construction for Data-Intensive Applications
We describe a graphics processing unit (GPU) implementation of the Hybrid Markov Chain Monte Carlo (HMC) method for training Bayesian Neural Networks (BNN). Our implementation uses NVIDIA’s parallel computing architecture, CUDA. We briefly review BNNs and the HMC method, describe our implementations, and give preliminary results.
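As a reminder of the method (standard HMC notation, not specific to this implementation), each leapfrog step over the network weights w, with auxiliary momenta p, step size \epsilon, and potential U(w) equal to the negative log posterior, reads:

    \[
    p_{t+\frac{1}{2}} = p_t - \tfrac{\epsilon}{2}\, \nabla U(w_t),
    \qquad
    w_{t+1} = w_t + \epsilon\, p_{t+\frac{1}{2}},
    \qquad
    p_{t+1} = p_{t+\frac{1}{2}} - \tfrac{\epsilon}{2}\, \nabla U(w_{t+1}),
    \]

followed by a Metropolis accept/reject test on the total energy H(w, p) = U(w) + \tfrac{1}{2}\lVert p \rVert^2. The gradient \nabla U summed over all training points is the data-parallel, GPU-friendly part of the computation.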
Jun, 17
Synergia CUDA: GPU-accelerated accelerator modeling package
Synergia is a parallel, 3-dimensional space-charge particle-in-cell accelerator modeling code. We present our work porting the purely MPI-based version of the code to a hybrid of CPU and GPU computing kernels. The hybrid code uses the CUDA platform in the same framework as the pure MPI solution. We have implemented a lock-free collaborative charge-deposition algorithm […]
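To make the data hazard in charge deposition concrete, here is a naive baseline CUDA sketch of 1-D linear-weighting deposition using atomics (kernel and variable names are assumptions for illustration; this is not the lock-free collaborative algorithm described above):

    // Sketch only: naive 1-D cloud-in-cell charge deposition with atomics.
    // It shows the write conflict every deposition scheme must handle; the names
    // and 1-D layout are hypothetical, not Synergia's actual kernels.
    // atomicAdd on double requires compute capability 6.0+.
    __global__ void deposit_charge(const double* x, double q, int n_particles,
                                   double* rho, int n_cells, double dx) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n_particles) return;
        double s = x[i] / dx;         // particle position in cell units
        int    c = (int)floor(s);     // index of the left grid point
        double w = s - c;             // fractional distance to that point
        if (c >= 0 && c + 1 < n_cells) {
            // Many particles can target the same cell, hence the atomics.
            atomicAdd(&rho[c],     q * (1.0 - w));
            atomicAdd(&rho[c + 1], q * w);
        }
    }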
Jun, 16
Divide and Conquer G-Buffer Ray Tracing
Many real-time computer graphics applications strive for realism, though they have difficulty achieving reflections that are fast, respond to scene changes, and work on a variety of surfaces. This thesis explores an alternative to existing techniques for real-time reflections. Ray tracing, a slow technique that does well at physically modelling light, is combined […]