Posts

Nov, 10

Performance Analysis of Sobel Edge Detection Filter on GPU using CUDA & OpenGL

CUDA (Compute Unified Device Architecture) is a novel technology for general-purpose computing on the GPU (Graphics Processing Unit) that makes it easy for users to develop general GPU programs. GPUs are emerging as the platform of choice for parallel high-performance computing. GPUs are good at data-intensive parallel processing, with the availability of software development platforms such as CUDA (developed […]
Nov, 10

Implementation of Spectral Angle Mapper (SAM) Algorithm on a Graphic processing unit (GPU)

The need for hyperspectral images in the exploration of oil and other minerals is massive. We can tap the high computational power now available to track those underground minerals faster. In this paper, we implement an algorithm called the Spectral Angle Mapper (SAM) using the Compute Unified Device Architecture (CUDA) framework on a GPU. The SAM algorithm […]
Nov, 10

Towards a Portable and Future-proof Particle-in-Cell Plasma Physics Code

We present the first reported OpenCL implementation of EPOCH3D, an extensible particle-in-cell plasma physics code developed at the University of Warwick. We document the challenges and successes of this porting effort, and compare the performance of our implementation executing on a wide variety of hardware from multiple vendors. The focus of our work is on […]
Nov, 8

Tiled QR Decomposition and Its Optimization on CPU and GPU Computing System

There can be many types of heterogeneous computing systems, and the most useful one is the CPU and GPU computing system. On this system, we try to run QR decomposition, which expresses a standard real matrix as a product of two matrices. For a tiled QR decomposition algorithm, which is a parallelized version of QR […]
Nov, 8

GPU-Based Space-Time Adaptive Processing (STAP) for Radar

Space-time adaptive processing (STAP) utilizes a two-dimensional adaptive filter to detect targets within a radar data set with speeds similar to the background clutter. While adaptively optimal solutions exist, they are prohibitively computationally intensive. Thus, researchers have developed alternative algorithms with nearly optimal filtering performance and greatly reduced computational intensity. While such alternatives reduce the […]
Nov, 8

Accelerating a Novel Particle-based Fluid Simulation on the GPU

Stochastic Rotation Dynamics (SRD) is a novel particle-based simulation method that can be used to model complex fluids [1], [2], such as binary and ternary mixtures [3], and polymer solutions [4]-[6], in either two or three dimensions. Although SRD is efficient compared to traditional methods, it is still computationally expensive for large system sizes, e.g. […]
Nov, 8

Architectural improvements and 28 nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems

Modern Graphics Processing Units (GPUs) are now considered accelerators for general purpose computation. A tight interaction between the GPU and the interconnection network is the strategy to express the full potential on capability computing of a multi-GPU system on large HPC clusters; that is the reason why an efficient and scalable interconnect is a key […]
Nov, 8

GooFit: A library for massively parallelising maximum-likelihood fits

Fitting complicated models to large datasets is a bottleneck of many analyses. We present GooFit, a library and tool for constructing arbitrarily-complex probability density functions (PDFs) to be evaluated on nVidia GPUs or on multicore CPUs using OpenMP. The massive parallelisation of dividing up event calculations between hundreds of processors can achieve speedups of factors […]
Nov, 8

Moving Least-Squares Reconstruction of Large Models with GPUs

Modern laser range scanning campaigns produce extremely large point clouds, and reconstructing a triangulated surface thus requires both out-of-core techniques and significant computational power. We present a GPU-accelerated implementation of the Moving Least Squares (MLS) surface reconstruction technique. While several previous out-of-core approaches use a sweep-plane approach, we subdivide the space into cubic regions that […]
Nov, 8

Computational kinetics of a large scale biological process on GPU workstations: DNA bending

It has only recently become possible to study the dynamics of large time scale biological processes computationally in explicit solvent and atomic detail. This required a combination of advances in computer hardware, utilization of parallel and special purpose hardware as well as numerical and theoretical approaches. In this work we report advances in these areas […]
Nov, 8

Discrete Shearlet Transform on GPU with Applications in Anomaly Detection and Denoising

Shearlets have emerged in recent years as one of the most successful methods for the multiscale analysis of multidimensional signals. Unlike wavelets, shearlets form a pyramid of well-localized functions defined not only over a range of scales and locations, but also over a range of orientations and with highly anisotropic supports. As a result, […]
Nov, 8

Automatic Synthesis of Heterogeneous CPU-GPU Embedded Applications from a UML Profile

Modern embedded systems present an ever increasing complexity and model-driven engineering has been shown to be helpful in mitigating it. In our previous works we exploited the power of model-driven engineering to develop a round-trip approach for aiding the evaluation and assessment of extra-functional properties preservation from models to code. In addition, we showed how […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
