Posts

Sep, 8

Document Classification Using KNN on GPU

Real-time and archival document data are now growing as fast as or faster than computing power. Document classification using the k-NN classification algorithm spends more time searching for nearest neighbors in a large training dataset, because it involves a large number of computations. The time for classification increases in proportion to the number of documents. Therefore it […]
Sep, 8

Sparse array representations and some selected array operations on GPUs

A multi-dimensional data model provides a good conceptual view of the data in data warehousing and On-Line Analytical Processing (OLAP). A typical representation of such a data model is a multi-dimensional array, which is well suited when the array is dense. If the array is sparse, i.e., has only a small number of non-zero elements […]
Sep, 8

A CUDA Back-End for the Equelle Compiler

As parallel and heterogeneous computing becomes more and more a necessity for implementing high performance simulators, it becomes increasingly harder for scientists and engineers without experience in high performance computing to achieve good performance. Even for those who know how to write efficient code, the process of doing so is time-consuming and error-prone, […]
Sep, 8

Accelerated Combinatorial Optimization using Graphics Processing Units and C++ AMP

In the course of less than a decade, Graphics Processing Units (GPUs) have evolved from narrowly scoped application specific accelerators to general-purpose parallel machines capable of accommodating an ever-growing set of algorithms. At the same time, programming GPUs appears to have become trapped around an attractor characterised by ad-hoc practices, non-portable implementations and inexact, uninformative […]
Sep, 5

Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models

Radiative transfer (RT) calculations are among the most computationally expensive components of global and regional weather and climate models, and radiation codes are therefore ideal candidates for applying techniques to improve the overall efficiency of such models. In many general circulation models (GCMs), a physically based radiation calculation can require as much as 30-50 percent […]
Sep, 5

Acquisition Method of Spread Spectrum Signals Based on GPU Acceleration

To meet the strict requirements on acquisition time under conditions of high dynamics and low CNR, an acquisition method for spread spectrum signals based on GPU acceleration is put forward by combining spread spectrum signal acquisition with GPU parallel computing. Taking the frequency domain parallel acquisition algorithm based on FFT as the […]
Sep, 5

Dymaxion++: A Directive-based API to Optimize Data Layout and Memory Mapping for Heterogeneous Systems

There has been a growing trend in using heterogeneous systems with CPUs and GPUs to solve diverse compute problems. However, high application performance on these platforms relies on efficient memory accesses. For many applications, CPUs and GPUs prefer different memory mappings and data-structure layouts. This in turn requires developers to use device-specific strategies for memory […]
Sep, 5

Research on double negative materials by using FDTD method based on GPUs

In recent years, the finite difference time domain (FDTD) method has been widely used in the simulation of metamaterials. As the FDTD method is well suited to parallel computing, in this paper we apply it on Fermi-architecture Graphics Processing Units (GPUs) to perform the electromagnetic simulation of double negative materials. Finally, both […]
Sep, 5

HISQ inverter on Intel Xeon Phi and NVIDIA GPUs

The runtime of a Lattice QCD simulation is dominated by a small kernel, which calculates the product of a vector by a sparse matrix known as the "Dslash" operator. Therefore, this kernel is frequently optimized for various HPC architectures. In this contribution we compare the performance of the Intel Xeon Phi to current Kepler-based NVIDIA […]
Sep, 4

Analysis of KECCAK Tree Hashing on GPU Architectures

In an effort to provide security and data integrity, hashing algorithms have been designed to consume an input of any length and produce a fixed-length output. KECCAK was selected by NIST to become the next Secure Hashing Algorithm (SHA-3) after nearly five years of competition. In addition to providing a sequential operating mode, there […]
Sep, 4

Acceleration of stereo-matching on multi-core CPU and GPU

This paper presents an accelerated version of a dense stereo-correspondence algorithm for two different parallelism-enabled architectures, multi-core CPU and GPU. The algorithm is part of the vision system developed for a binocular robot-head in the context of the CloPeMa research project. This research project focuses on the conception of a new clothes folding […]
Sep, 4

A uniform approach for programming distributed heterogeneous computing systems

Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
