Posts

Jan, 9

Gauge Fixing in Lattice QCD on GPUs

Quantum Chromodynamics (QCD) [1, 2] is the theory of the strong interaction, which is responsible for the hadron spectrum and therefore for all matter in our everyday life. QCD, being a quantum field theory and part of the standard model of elementary particles, describes the interactions between color-charged quarks and gluons. Hadrons, e.g., protons, neutrons […]
Jan, 9

A new parallel tool for classification of remotely sensed imagery

In this paper, we describe a new tool for classification of remotely sensed images. Our processing chain is based on three main parts: (1) pre-processing, performed using morphological profiles which model both the spatial (high resolution) and the spectral (color) information available from the scenes; (2) classification, which can be performed in unsupervised fashion using […]
Jan, 9

Top-k Queries Processing With Uncertain Data on Graphics Processing Units

Top-k query processing in complex uncertain databases is semantically and computationally different from classical top-k processing. Score is not the only factor to consider: the interplay between score and membership uncertainty makes computation complex. The powerful computing capability of Graphics Processing Units (GPUs) is needed in the processing of this kind of […]
Jan, 9

Designing Numerical Solvers for Next Generation High Performance Computing

High Performance Computing (HPC) is moving towards massive scales of parallelism. The changes in hardware towards large-scale on-chip parallelism require the rewriting of existing solvers for various Computational Fluid Dynamics (CFD) problems. The aim of the project is to write and optimise novel solvers for various common CFD numerical problems that can take […]
Jan, 9

LU Factorization for Accelerator-based Systems

Multicore architectures enhanced with multiple GPUs are likely to become mainstream High Performance Computing (HPC) platforms in the near future. In this paper, we present the design and implementation of an LU factorization using a tile algorithm that can fully exploit the potential of such platforms in spite of their complexity. We use a methodology derived […]
Jan, 9

Neural Network Simulation: The recognition application

This paper presents the GPU mapping of the recognition algorithm of a Convolutional Neural Network (CNN). This work is based on a C implementation of the application. The mapping to GPU was performed through different approaches which are explained in detail. The improvements achieved by each approach are presented as well as the overall speedup […]
Jan, 9

Spatial Sorting Algorithms for Parallel Computing in Networks

Many basic techniques in computer science have been founded on the assumption that physical computing resources are scarce but orderly, and that the cost of effective direct communication between physically distant parts of a computer system is affordable. In large scale cluster computing installations, fine-grained parallel computing hardware, or wireless mesh networks, these familiar assumptions […]
Jan, 9

High Performance and Scalable GPU Graph Traversal

Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to […]
Jan, 9

Fast GPU-based Locality Sensitive Hashing for K-Nearest Neighbor Computation

We present an efficient GPU-based parallel LSH algorithm to perform approximate k-nearest neighbor computation in high-dimensional spaces. We use the Bi-level LSH algorithm, which can compute k-nearest neighbors with higher accuracy and is amenable to parallelization. During the first level, we use the parallel RP-tree algorithm to partition datasets into several groups so that items […]
Jan, 8

A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines

We study several solvers for the solution of general linear systems where the main objective is to reduce the communication overhead due to pivoting. We first describe two existing algorithms for the LU factorization on hybrid CPU/GPU architectures. The first one is based on partial pivoting and the second uses a random preconditioning of the […]
Jan, 8

Makespan computation for GPU threads running on a single streaming multiprocessor

Graphics processors were originally developed for rendering graphics but have recently evolved towards being an architecture for general-purpose computations. They are also expected to become important parts of embedded systems hardware – not just for graphics. However, this necessitates the development of appropriate timing analysis techniques which would be required because techniques developed for CPU […]
Jan, 8

Hybrid Algorithms for List Ranking and Graph Connected Components

The advent of multicore and many-core architectures saw them being deployed to speed up computations across several disciplines and application areas. Prominent examples include semi-numerical algorithms such as sorting, graph algorithms, image processing, scientific computations, and the like. In particular, using GPUs for general purpose computations has attracted a lot of attention given that GPUs can […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org