Posts

Sep, 5

Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models

Radiative transfer (RT) calculations are among the most computationally expensive components of global and regional weather and climate models, and radiation codes are therefore ideal candidates for applying techniques to improve the overall efficiency of such models. In many general circulation models (GCMs), a physically based radiation calculation can require as much as 30-50 percent […]
Sep, 5

Acquisition Method of Spread Spectrum Signals Based on GPU Acceleration

To meet the strict acquisition-time requirements under high-dynamic, low-CNR conditions, an acquisition method for spread spectrum signals based on GPU acceleration is put forward, combining spread spectrum signal acquisition with GPU parallel computing. Taking the frequency domain parallel acquisition algorithm based on FFT as the […]
Sep, 5

Dymaxion++: A Directive-based API to Optimize Data Layout and Memory Mapping for Heterogeneous Systems

There has been a growing trend in using heterogeneous systems with CPUs and GPUs to solve diverse compute problems. However, high application performance on these platforms relies on efficient memory accesses. For many applications, CPUs and GPUs prefer different memory mappings and data-structure layouts. This in turn requires developers to use device-specific strategies for memory […]
Sep, 5

Research on double negative materials by using FDTD method based on GPUs

In recent years, the finite difference time domain (FDTD) method has been widely used in the simulation of metamaterials. Because the FDTD method is well suited to parallel computing, we apply it on Fermi-architecture Graphics Processing Units (GPUs) to carry out the electromagnetic simulation of double negative materials in this paper. Finally, both […]
Sep, 5

HISQ inverter on Intel Xeon Phi and NVIDIA GPUs

The runtime of a Lattice QCD simulation is dominated by a small kernel, which calculates the product of a vector by a sparse matrix known as the "Dslash" operator. Therefore, this kernel is frequently optimized for various HPC architectures. In this contribution we compare the performance of the Intel Xeon Phi to current Kepler-based NVIDIA […]
Sep, 4

Analysis of KECCAK Tree Hashing on GPU Architectures

In an effort to provide security and data integrity, hashing algorithms have been designed to consume an input of any length to produce a fixed length output. KECCAK was selected by NIST to become the next Secure Hash Algorithm (SHA-3) after nearly five years of competition. In addition to providing a sequential operating mode, there […]
Sep, 4

Acceleration of stereo-matching on multi-core CPU and GPU

This paper presents an accelerated version of a dense stereo-correspondence algorithm for two different parallelism-enabled architectures, multi-core CPU and GPU. The algorithm is part of the vision system developed for a binocular robot-head in the context of the CloPeMa research project. This research project focuses on the conception of a new clothes folding […]
Sep, 4

A uniform approach for programming distributed heterogeneous computing systems

Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous […]
Sep, 4

Solving 3D incompressible Navier-Stokes equations on hybrid CPU/GPU systems

This paper describes a hybrid multicore/GPU solver for the incompressible Navier-Stokes equations with constant coefficients, discretized by the finite difference method. By applying the prediction-projection method, the Navier-Stokes equations are transformed into a combination of Helmholtz-like and Poisson equations for which we describe efficient solvers. As an extension of our previous paper [1], this paper […]
Sep, 4

CUDA method for the FDTD simulation by GPU

Computational devices have been developed over several decades, especially graphics processors, which now handle not only graphics workloads but also scientific computation. These processors are suitable for parallel computation and can be used instead of expensive high-end devices. Many research groups have implemented parallel computations using MPI with multiple CPUs to solve […]
Sep, 3

New efficient integral algorithms for quantum chemistry

The contents of this thesis are centered on the development of new efficient algorithms for molecular integral evaluation in quantum chemistry, as well as new design and implementation strategies for such algorithms aimed at maximizing their performance and the utilization of modern hardware. This thesis introduces the K4+MIRROR algorithm for 2-electron repulsion integrals, a new […]
Sep, 3

GPU-accelerated Database Systems: Survey and Open Challenges

The vast amount of processing power and memory bandwidth provided by modern graphics cards makes them an interesting platform for data-intensive applications. Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago. In the past years, there were many approaches to make use of GPUs at different levels of […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
