Posts
Dec, 1
GPU Computing for Particle Tracking
This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performance, issues, and challenges arising from introducing the GPU are also discussed.
Dec, 1
Optimal similarity registration of volumetric images
This paper proposes a novel approach to optimally solve volumetric registration problems. The proposed framework exploits parametric dictionaries for sparse volumetric representations, l1 dissimilarities and DC (Difference of Convex functions) decomposition. The SAD (sum of absolute differences) criterion is applied to the sparse representation of the reference volume and a DC decomposition of this criterion […]
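As a rough illustration of the SAD criterion named in the abstract above, here is a minimal sketch that computes the sum of absolute differences between two small volumes stored as nested lists. It works on raw voxel values only; the sparse dictionary representation and the DC decomposition described in the paper are not reproduced, and the function name `sad` and the toy volumes are assumptions made for this example.

```python
def sad(volume_a, volume_b):
    """Sum of absolute differences between two equally shaped 3-D volumes.

    Volumes are nested lists indexed as volume[z][y][x] (an assumption of
    this sketch, not a layout taken from the paper).
    """
    if len(volume_a) != len(volume_b):
        raise ValueError("volumes must have the same number of slices")
    total = 0.0
    for slice_a, slice_b in zip(volume_a, volume_b):
        if len(slice_a) != len(slice_b):
            raise ValueError("slices must have the same number of rows")
        for row_a, row_b in zip(slice_a, slice_b):
            if len(row_a) != len(row_b):
                raise ValueError("rows must have the same length")
            for voxel_a, voxel_b in zip(row_a, row_b):
                total += abs(voxel_a - voxel_b)
    return total


# Hypothetical 2x2x2 toy volumes: the second differs from the first in two voxels.
reference = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
moving = [[[0, 1], [2, 5]], [[4, 5], [3, 7]]]

sad_same = sad(reference, reference)
sad_diff = sad(reference, moving)
```

A registration method would search for the transform of the moving volume that minimizes this value; the sketch only evaluates the criterion for one fixed pair.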
Dec, 1
Image and Video Processing on CUDA: State of the Art and Future Directions
In the last few years a myriad of computer graphics applications have been developed using standard programming techniques, which are mainly based on multicore general-purpose processor (CPU) architectures. Due to the rapid shift toward high definition multimedia, more and more research has been done that needs both computational resources and memory space to achieve high […]
Dec, 1
Numerical investigations on nonlinear nonparaxial beam propagation using graphics processing units
We study the performance of a nonparaxial beam propagation method accelerated using massively parallel computation on graphics processing units. The implementation is tested on two different NVIDIA hardware architectures, Tesla and Fermi, and the results are compared with a CPU-based parallel implementation using Open MPI.
Nov, 30
Architecture-Aware Algorithms and Software for Peta and Exascale Computing
Summary form only given. In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. Some of the software and algorithm challenges have already been encountered, such […]
Nov, 30
Support Operator Rupture Dynamics on GPU
The Support Operator Method (SOM) is a numerical method to simulate seismic wave propagation by solving the three-dimensional viscoelastic equations. Its implementation, the Support Operator Rupture Dynamics (SORD), has been proven to be highly scalable in large-scale multiprocessor calculations. This paper discusses accelerating SORD on GPUs using NVIDIA CUDA C. Compared to […]
Nov, 30
Using CUDA for Exhaustive Password Recovery
In the practical usage of cryptography, if one wishes to decrypt some data without knowing the secret key that has been used for the encryption, one usually does not try to break the underlying cryptographic construction, nor does one try all possible keys. What is mostly done is to try to find the password that […]
Nov, 30
GPU Accelerated Dissipative Particle Dynamics with Parallel Cell-list Updating
A general-purpose DPD simulation implemented entirely on the GPU is presented in this paper, including cell-list updating, force calculation, and forward integration. The algorithm and the optimizations needed to obtain the best GPU performance are discussed. The performance benchmarks show that our implementation running on a single GPU can be more than 20x faster than a conventional implementation […]
Nov, 30
Acceleration of computational quantum chemistry by heterogeneous computer architectures
Computational quantum chemistry methods such as Hartree-Fock (HF), density functional theory (DFT), or the fragment molecular orbital (FMO) method require heavy computational resources. In this study they are accelerated by using graphics processing units (GPUs) and the vector instruction set (AVX) of the latest CPUs. The PRISM algorithm for evaluating the electron repulsion integrals was vectorized […]
Nov, 30
Optimization of the Particle-based Volume Rendering for GPUs with Hiding Data Transfer Latency
In this paper, we present the optimization of particle-based volume rendering for GPU platforms. In general, data transfer between CPU and GPU incurs long latency. Using page-locked memory from the CUDA runtime API, the data area is selected so that data transfer between CPU and GPU becomes faster, reducing the execution time. […]
Nov, 30
Performance and numerical accuracy evaluation of heterogeneous multicore systems for Krylov orthogonal basis computation
We study the numerical behavior of heterogeneous systems such as CPUs with GPUs or IBM Cell processors for some orthogonalization processes. We focus on the influence of the different floating-point arithmetic handling of these accelerators with Gram-Schmidt orthogonalization using single and double precision. We observe for dense matrices a loss of at worst 1 digit […]
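To make the single versus double precision comparison in the abstract above concrete, here is a minimal sketch that runs Gram-Schmidt on a small ill-conditioned matrix twice: once in Python's native double precision and once with every arithmetic result rounded to IEEE single precision. The modified Gram-Schmidt variant, the 4x4 Hilbert test matrix, and the helper names (`to_f32`, `modified_gram_schmidt`, `orthogonality_error`) are all assumptions of this example. It emulates standard IEEE rounding on a CPU and does not model the arithmetic of any particular GPU or Cell processor studied in the paper.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def keep_f64(x):
    """Identity rounding: leave the value in double precision."""
    return x


def modified_gram_schmidt(columns, rnd):
    """Orthonormalize column vectors, applying rnd after every operation."""
    basis = []
    for col in columns:
        v = [rnd(x) for x in col]
        for u in basis:
            # Project the current (already updated) v onto u, then subtract.
            dot = 0.0
            for a, b in zip(u, v):
                dot = rnd(dot + rnd(a * b))
            v = [rnd(b - rnd(dot * a)) for a, b in zip(u, v)]
        norm_sq = 0.0
        for x in v:
            norm_sq = rnd(norm_sq + rnd(x * x))
        norm = rnd(math.sqrt(norm_sq))
        basis.append([rnd(x / norm) for x in v])
    return basis


def orthogonality_error(basis):
    """Largest entry of |Q^T Q - I|, evaluated in double precision."""
    worst = 0.0
    for i, u in enumerate(basis):
        for j, w in enumerate(basis):
            dot = sum(a * b for a, b in zip(u, w))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


# Columns of the 4x4 Hilbert matrix, a standard ill-conditioned test case.
hilbert = [[1.0 / (i + j + 1) for i in range(4)] for j in range(4)]

err_single = orthogonality_error(modified_gram_schmidt(hilbert, to_f32))
err_double = orthogonality_error(modified_gram_schmidt(hilbert, keep_f64))
print("single:", err_single, "double:", err_double)
```

The printed errors give a rough sense of how much orthogonality is lost at each precision for this one matrix; they are not a reproduction of the paper's measurements.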
Nov, 30
GPGPU Accelerated Cardiac Arrhythmia Simulations
Computational modeling of cardiac electrophysiology is a powerful tool for studying arrhythmia mechanisms. In particular, cardiac models are useful for gaining insights into experimental studies, and in the foreseeable future they will be used by clinicians to improve therapy for patients suffering from complex arrhythmias. Such models are highly intricate, both in their geometric […]