Posts

Dec, 8

A Computationally Efficient Parallel Kernel Regression for Image Reconstruction

Image reconstruction is a method by which underlying images, hidden in blurry and noisy data, can be retrieved. It is used in applications such as computed tomography (CT), magnetic resonance and radio astronomy. Recently, a non-parametric adaptive regression method called steering kernel regression was proposed and proved to be effective. This method […]
Dec, 8

A computationally efficient and scalable approach for privacy preserving kNN classification

In the modern age, there is a great desire to mine users’ personal data from varied sources to discover their behaviours. However, due to growing awareness among organizations of user-data privacy and strict government privacy regulations, there is growing resistance to sharing data directly with others. Encryption […]
Dec, 8

GPGPU-Aided 3D Staggered-grid Finite-difference Seismic Wave Modeling

Finite difference is a simple, fast and effective numerical method for seismic wave modeling, and has been widely used in forward waveform inversion and reverse time migration. However, intensive calculation of three-dimensional seismic forward modeling has been restricting the industrial application of 3D pre-stack reverse time migration and inversion. Aiming at this problem, in this […]
Dec, 8

GPU accelerating the FEniCS Project

In recent years, the graphics processing unit (GPU) has emerged as a popular platform for performing general purpose calculations. The high computational power of the GPU has made it an attractive accelerator for achieving increased performance of computer programs. Although GPU programming has become more tangible over the years, it is still challenging to […]
Dec, 8

Lightweight Modular Staging and Embedded Compilers: Abstraction Without Regret for High-Level High-Performance Programming

Programs expressed in a high-level programming language need to be translated to a low-level machine dialect for execution. This translation is usually accomplished by a compiler, which is able to translate any legal program to equivalent low-level code. But for individual source programs, automatic translation does not always deliver good results: Software engineering practice demands […]
Dec, 5

A Data-Parallel Graphics Pipeline Implemented in OpenCL

This report documents implementation details, results, benchmarks and technical discussions for the work carried out within a master’s thesis at Linköping University. Within the master’s thesis, the field of software rendering is explored in the age of parallel computing. Using the Open Computing Language, a complete graphics pipeline was implemented for use on general processing […]
Dec, 5

Mapping Streaming Applications to OpenCL

Graphics processing units (GPUs) have been gaining popularity in general purpose and high performance computing. A GPU is made up of a number of streaming multiprocessors (SMs), each of which consists of many processing cores. A large number of general-purpose applications have been mapped onto GPUs efficiently. Stream processing applications, however, exhibit properties such as […]
Dec, 5

Parallel Cosegmentation via Submodular Optimization on Anisotropic Diffusion

With a large number of related images being used for applications such as MR spectroscopy imaging, object-of-interest 3D modelling and photo collages, there is a pressing need to accelerate image cosegmentation algorithms. Cosegmentation refers to the process of segmenting common regions from multiple related images. A novel distributed algorithm, CoSand [1], for cosegmentation […]
Dec, 5

Gauge Field Generation on Large-Scale GPU-Enabled Systems

Over the past years, GPUs have been successfully applied to the task of inverting the fermion matrix in lattice QCD calculations. Even strong scaling to capability-level supercomputers, corresponding to O(100) GPUs or more, has been achieved. However, strong scaling a whole gauge field generation algorithm to this regime requires significantly more functionality than just having […]
Dec, 5

Usage of GPU in LS-DYNA

The increasing computing power of GPUs can be used to improve the performance of CAE systems [1]. Within LS-DYNA, an improved direct equation solver can be used, which accelerates the performance of implicit applications by use of a CUDA-based solver [2], [3], [4]. In this paper the performance improvements for different customer input decks for metal […]
Dec, 4

Fast Parallel Sorting Algorithms on GPUs

This paper presents a comparative analysis of three widely used parallel sorting algorithms (OddEven sort, Rank sort and Bitonic sort) in terms of sorting rate, sorting time and speed-up on CPU and different GPU architectures. Alongside, we have implemented a novel parallel algorithm, the min-max butterfly network, for finding the minimum and maximum in large data sets. […]
Dec, 4

gR: A GPU-based Router

With growing internet traffic and the increasing complexity of packet processing tasks, the throughput of routers is affected. Modern routers also need to provide additional services such as security and QoS, which further add to the complexity. These issues can be addressed with the massive parallel computing capability of graphics processors. In this paper, we offload two of […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
