Posts

Dec, 1

Iterative optimization methods for efficient image restoration on multicore architectures

This paper explores effective algorithms for the solution of numerical nonlinear optimization problems in image restoration. The technology of modern acquisition techniques and devices most often returns data of increasing size, so we focus on the Scaled Gradient Projection algorithm, which is well suited to large-scale applications. We present its parallel implementations on different hardware, […]
Dec, 1

Evaluation iterative solver for pCDR on GPU accelerator

In the past few years, graphics processing units (GPUs) have become a trend in high performance computing (HPC). The newest Top500 list, from Nov. 2010, showed that three supercomputers in the Top10 contain GPU accelerators. The role of the GPU accelerator has become more and more important for scientific computing and computational fluid dynamics (CFD) to […]
Dec, 1

GPU Computing for Particle Tracking

This is a feasibility study of using a modern Graphics Processing Unit (GPU) to parallelize the accelerator particle tracking code. To demonstrate the massive parallelization features provided by GPU computing, a simplified TracyGPU program is developed for dynamic aperture calculation. Performance, issues, and challenges of introducing the GPU are also discussed.
Dec, 1

Optimal similarity registration of volumetric images

This paper proposes a novel approach to optimally solve volumetric registration problems. The proposed framework exploits parametric dictionaries for sparse volumetric representations, l1 dissimilarities and DC (Difference of Convex functions) decomposition. The SAD (sum of absolute differences) criterion is applied to the sparse representation of the reference volume and a DC decomposition of this criterion […]
Dec, 1

Image and Video Processing on CUDA: State of the Art and Future Directions

In the last few years a myriad of computer graphics applications have been developed using standard programming techniques, which are mainly based on multicore general-purpose processor (CPU) architectures. Due to the rapid shift towards high definition multimedia, more and more research has been done that needs both computational resources and memory space to achieve high […]
Dec, 1

Numerical investigations on nonlinear nonparaxial beam propagation using graphics processing units

We study the performance of a nonparaxial beam propagation method accelerated using massively parallel computation on graphics processing units. The implementation is tested on two different NVIDIA hardware architectures, Tesla and Fermi, and the results are compared with a CPU-based parallel implementation using Open MPI.
Nov, 30

Architecture-Aware Algorithms and Software for Peta and Exascale Computing

Summary form only given. In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. Some of the software and algorithm challenges have already been encountered, such […]
Nov, 30

Support Operator Rupture Dynamics on GPU

The Support Operator Method (SOM) is a numerical method to simulate seismic wave propagation by solving the three-dimensional viscoelastic equations. Its implementation, Support Operator Rupture Dynamics (SORD), has been proven to be highly scalable in large-scale multiprocessor calculations. This paper discusses accelerating SORD on the GPU using NVIDIA CUDA C. Compared to […]
Nov, 30

Using CUDA for Exhaustive Password Recovery

In the practical usage of cryptography, if one wishes to decrypt some data without knowing the secret key that has been used for the encryption, one usually does not try to break the underlying cryptographic construction, nor does one try all possible keys. What is mostly done is to try to find the password that […]
Nov, 30

GPU Accelerated Dissipative Particle Dynamics with Parallel Cell-list Updating

A general purpose DPD simulation entirely implemented on the GPU is presented in this paper, including cell-list updating, force calculation and forward integration. The algorithm and optimization needed to obtain the best GPU performance are discussed. The performance benchmarks show that our implementation running on a single GPU can be more than 20x faster than a conventional implementation […]
Nov, 30

Acceleration of computational quantum chemistry by heterogeneous computer architectures

Computational quantum chemistry methods such as Hartree-Fock (HF), density functional theory (DFT) or the fragment molecular orbital (FMO) method require heavy computational resources. In this study they are accelerated by using graphics processing units (GPUs) and the vector instruction set (AVX) of the latest CPUs. The PRISM algorithm to evaluate the electron repulsion integrals was vectorized […]
Nov, 30

Optimization of the Particle-based Volume Rendering for GPUs with Hiding Data Transfer Latency

In this paper, we present the optimization of particle-based volume rendering for GPU platforms. In general, data transfer between the CPU and GPU incurs long latency. Using page-locked memory from the CUDA runtime API, the data area is selected so that data transfer between the CPU and GPU becomes faster, reducing the execution time. […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
