Posts
Jun, 24
Provably Efficient GPU Algorithms
In this paper we present an abstract model for algorithm design on GPUs by extending the parallel external memory (PEM) model with computations in internal memory (commonly known as shared memory in GPU literature) defined in the presence of memory banks and bank conflicts. We also present a framework for designing bank-conflict-free algorithms […]
Jun, 23
The 22nd High Performance Computing Symposium, HPC 2014
The 2014 Spring Simulation Multiconference will feature the 22nd High Performance Computing Symposium (HPC 2014), devoted to the impact of high performance computing and communications on computer simulations. Advances in multicore and many-core architectures, networking, high-end computers, large data stores, and middleware capabilities are ushering in a new era of high performance parallel and […]
Jun, 23
Workshop on GPU Programming for Molecular Modeling
The GPU Programming for Molecular Modeling workshop will extend GPU programming techniques to the field of molecular modeling, including subjects such as particle-grid algorithms (electrostatics, molecular surfaces, density maps, and molecular orbitals), particle-particle algorithms with an emphasis on non-bonded force calculations, radial distribution functions in GPU histogramming, single-node multi-GPU algorithms, and GPU clusters. Specific examples […]
Jun, 23
Non-Uniformly Partitioned Block Convolution on Graphics Processing Units
Real-time convolution has many applications, among them simulating room reverberation in audio processing. Non-uniformly partitioned filters can satisfy both desired features of an efficient convolution: low latency and lower computational complexity. However, distributing the computation so that it places a uniform demand on the Central Processing Unit (CPU) is still challenging. Moreover, computational […]
Jun, 23
GPU Implementation of the DP code
The main goal of this PRACE project was to evaluate how GPUs could speed up the DP code, a linear response TDDFT code. The code was profiled to identify computational bottlenecks to be delegated to the GPU. In order to speed up this code using GPUs, two different strategies have been […]
Jun, 22
CUDA Enhanced Simulated Annealing for Chip Layout Problem
This paper introduces an implementation of a parallel solution for the chip layout problem on the NVIDIA CUDA framework. The experiment allows for varying chip sizes, interconnecting signals, and three chip transformations: rotate, swap, and translate. Total signal distance is minimized as the system converges toward an optimal solution using simulated annealing. Lee’s maze routing […]
Jun, 22
Exploring GPGPUs Workload Characteristics and Power Consumption
While general-purpose computing on GPUs continues to enjoy higher computing performance with every new generation, the high power consumption of GPUs is an increasingly important concern. To create power-efficient GPUs, it is important to thoroughly study their power consumption. The power consumption of GPUs varies significantly with workloads. Therefore, in this work we study […]
Jun, 22
Virtualization and Migration with GPGPUs
Recently, cloud computing providers have started to offer virtual machines specifically for high performance computing as a service (HPCaaS). The cloud computing providers usually employ virtualization as an abstraction layer between the application software and the underlying hardware. Virtualization allows flexible migration between physical systems, which is a requirement for many load balancing techniques. In […]
Jun, 21
GPU Optimized Code for Long Term Simulations of Beam-beam Effects in Colliders
We report on the development of a new code for long-term simulation of beam-beam effects in particle colliders. The underlying physical model relies on a matrix-based arbitrary-order symplectic particle tracking for beam transport and the Bassetti-Erskine approximation for the beam-beam interaction. The computations are accelerated through a parallel implementation on a hybrid GPU/CPU platform. With […]
Jun, 21
Parallel Language Programming In Different Platforms
The need to speed up computing has spurred interest in exploring parallelism in algorithms and parallel programming. Technology is evolving fast, but computing power in sequential execution is no longer increasing as quickly as it once did; instead, CPUs contain more and more parallel computing resources. However, parallel algorithms may not be able to exploit all the parallelism […]
Jun, 21
Beam Dynamics Simulations with a GPU-accelerated Version of ELEGANT
Large-scale beam dynamics simulations can derive significant benefit from efficient implementation of general-purpose particle tracking on GPUs. We present the latest results of our work on accelerating Argonne National Lab’s accelerator simulation code ELEGANT, using CUDA-enabled GPUs. We summarize the performance of beamline elements ported to GPU, and discuss optimization techniques for some core […]
Jun, 21
Applying the “Simple Accelerator Modelling in MATLAB” (SAMM) Code to High Luminosity LHC Upgrade
The “Simple Accelerator Modelling in Matlab” (SAMM) code is a set of Matlab routines for modelling beam dynamics in high energy particle accelerators. It includes a set of CUDA codes that can be run on a graphics processing unit. These can be called from SAMM and can potentially give a significant increase in tracking speed. […]