Posts
Jun, 23
Non-Uniformly Partitioned Block Convolution on Graphics Processing Units
Real-time convolution has many applications, among them simulating room reverberation in audio processing. Non-uniformly partitioned filters can satisfy both desired features, low latency and low computational complexity, for efficient convolution. However, distributing the computation to place a uniform demand on the Central Processing Unit (CPU) is still challenging. Moreover, computational […]
Jun, 23
GPU Implementation of the DP code
The main goal of this PRACE project was to evaluate how GPUs could speed up the DP code, a linear response TDDFT code. The code was profiled to identify computational bottlenecks to be delegated to the GPU. In order to speed up this code using GPUs, two different strategies have been […]
Jun, 22
CUDA Enhanced Simulated Annealing for Chip Layout Problem
This paper introduces an implementation of a parallel solution for the chip layout problem on the NVIDIA CUDA framework. The experiment allows for varying chip sizes, interconnecting signals, and three chip transformations: rotate, swap, and translate. Total signal distance is minimized as the system converges toward an optimal solution using simulated annealing. Lee’s maze routing […]
Jun, 22
Exploring GPGPUs Workload Characteristics and Power Consumption
While general-purpose computing on GPUs continues to enjoy higher computing performance with every new generation, the high power consumption of GPUs is an increasingly important concern. To create power-efficient GPUs, it is important to thoroughly study their power consumption. The power consumption of GPUs varies significantly with workloads. Therefore, in this work we study […]
Jun, 22
Virtualization and Migration with GPGPUs
Recently, cloud computing providers have started to offer virtual machines specifically for high performance computing as a service (HPCaaS). The cloud computing providers usually employ virtualization as an abstraction layer between the application software and the underlying hardware. Virtualization allows flexible migration between physical systems, which is a requirement for many load balancing techniques. In […]
Jun, 21
GPU Optimized Code for Long Term Simulations of Beam-beam Effects in Colliders
We report on the development of a new code for long-term simulation of beam-beam effects in particle colliders. The underlying physical model relies on matrix-based, arbitrary-order symplectic particle tracking for beam transport and the Bassetti-Erskine approximation for the beam-beam interaction. The computations are accelerated through a parallel implementation on a hybrid GPU/CPU platform. With […]
Jun, 21
Parallel Language Programming In Different Platforms
The need to speed up computing has spurred interest in exploring parallelism in algorithms and parallel programming. Technology is evolving fast, but computing power in sequential execution is no longer increasing as quickly as it once did; instead, CPUs contain more and more parallel computing resources. However, parallel algorithms may not be able to exploit all the parallelism […]
Jun, 21
Beam Dynamics Simulations with a GPU-accelerated Version of ELEGANT
Large scale beam dynamics simulations can derive significant benefit from efficient implementation of general-purpose particle tracking on GPUs. We present the latest results of our work on accelerating Argonne National Lab’s accelerator simulation code ELEGANT, using CUDA-enabled GPUs. We summarize the performance of beamline elements ported to GPU, and discuss optimization techniques for some core […]
Jun, 21
Applying the “Simple Accelerator Modelling in MATLAB” (SAMM) Code to High Luminosity LHC Upgrade
The “Simple Accelerator Modelling in Matlab” (SAMM) code is a set of Matlab routines for modelling beam dynamics in high energy particle accelerators. It includes a set of CUDA codes that can be run on a graphics processing unit. These can be called from SAMM and can potentially give a significant increase in tracking speed. […]
Jun, 21
A Numerical Study of Continuous Data Assimilation for the 2D-NS Equations Using Nodal Points
This thesis conducts a number of numerical experiments using massively parallel GPU computations to study a new continuous data assimilation algorithm. We test the algorithm on two-dimensional incompressible fluid flows given by the Navier-Stokes equations. In this context, observations of the Eulerian velocity field given at a finite resolution of nodal points in space may […]
Jun, 21
libCudaOptimize: an Open Source Library of GPU-based Metaheuristics
Evolutionary Computation techniques and other metaheuristics have been increasingly used in recent years for solving many real-world tasks that can be formulated as optimization problems. Among their numerous strengths, a major one is their natural predisposition to parallelization. In this paper, we introduce libCudaOptimize, an open source library which implements some metaheuristics for continuous […]
Jun, 21
CFMDS: CUDA-based fast multidimensional scaling for genome-scale data
BACKGROUND: Multidimensional scaling (MDS) is a widely used approach to dimensionality reduction. It has been applied to feature selection and visualization in various areas. Among diverse MDS methods, the classical MDS is a simple and theoretically sound solution for projecting data objects onto a low dimensional space while preserving the original distances among them as […]