Posts

Nov, 8

Reusable OpenCL FPGA Infrastructure

OpenCL has emerged as a standard programming model for heterogeneous systems. Recent work combining OpenCL and FPGAs has focused on high-level synthesis. Building a complete OpenCL FPGA system requires more than just high-level synthesis. This work introduces a reusable OpenCL infrastructure for FPGAs that complements previous work and specifically targets a key architectural element – […]
Nov, 8

SPH Based Fluid Animation Using CUDA Enabled GPU

Realistic fluid animation is an inherent part of special effects in the film and gaming industries. These animations are created by simulating highly compute-intensive fluid models. The computations involved in executing a fluid model call for a high-performance parallel system to achieve real-time animation. This paper is primarily devoted to […]
Nov, 8

Massively parallel Monte Carlo for many-particle simulations on GPUs

Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance […]
Nov, 7

Efficient implementation of data flow graphs on multi-gpu clusters

Nowadays, it is possible to build a multi-GPU supercomputer, well suited to implementing digital signal processing algorithms, for a few thousand dollars. However, to achieve the highest performance with this kind of architecture, the programmer has to focus on inter-processor communication and task synchronization. In this paper, we propose a high-level programming model based […]
Nov, 7

Study of Convolution Algorithms using CPU and Graphics Hardware

In this thesis we evaluate different two-dimensional image convolution algorithms using Fast Fourier Transform (FFT) libraries on the CPU and on the graphics hardware, using Compute Unified Device Architecture (CUDA). The final product is used in VISSLA (VISualisation tool for Simulation of Light scattering and Aberrations), a software written in Matlab. VISSLA is used to […]
Nov, 7

A GPU-Based Transient Stability Simulation Using Runge-Kutta Integration Algorithm

Graphics processing units (GPUs) have been investigated as a way to unlock computational capability in various scientific applications. Recent research shows that careful consideration is needed to exploit the advantages of GPUs while avoiding their deficiencies. In this paper, the impact of GPU acceleration on implicit and explicit integrators in transient stability is investigated. […]
Nov, 7

GPU Virtualization

In modern computing, the Graphical Processing Unit (GPU) has proven its worth beyond that of graphics rendering. Its usage is extended into the field of general purpose computing, where applications exploit the GPU’s massive parallelism to accelerate their tasks. Meanwhile, Virtual Machines (VM) continue to provide utility and security by emulating entire computer hardware platforms […]
Nov, 7

Connectivity-Based Segmentation for GPU-Accelerated Mesh Decompression

We present a novel algorithm to partition large 3D meshes for GPU-accelerated decompression. Our formulation focuses on minimizing the replicated vertices between patches, and balancing the numbers of faces of patches for efficient parallel computing. First we generate a topology model of the original mesh and remove vertex positions. Then we assign the centers of […]
Nov, 6

Adapting Irregular Computations to Large CPU-GPU Clusters in the MADNESS Framework

Graphics Processing Units (GPUs) are becoming the workhorse of scalable computations. MADNESS is a scientific framework used especially for computational chemistry. Most MADNESS applications use operators that involve many small tensor computations, resulting in a less regular organization of computations on GPUs. A single GPU kernel may have to multiply by hundreds of small square […]
Nov, 6

A Framework for Automated Generation of Specialized Function Variants

Efficient large-scale scientific computing requires efficient code, yet optimizing code to render it efficient simultaneously renders the code less readable, less maintainable, less portable, and requires detailed knowledge of low-level computer architecture, which the developers of scientific applications may lack. The necessary knowledge is subject to change over time as new architectures, such as GPGPU […]
Nov, 6

All-Pairs Shortest Path Algorithms Using CUDA

Utilising graph theory is a common activity in computer science. Algorithms that perform computations on large graphs are not always cost effective, requiring supercomputers to achieve results in a practical amount of time. Graphics Processing Units provide a cost effective alternative to supercomputers, allowing parallel algorithms to be executed directly on the Graphics Processing Unit. […]
Nov, 6

Design and Development of an Efficient H.264 Video Encoder for CPU/GPU using OpenCL

Video codecs have undergone dramatic improvements and increased in complexity over the years, owing to various commercial products such as mobile phones and tablet PCs. With the emergence of standards such as H.264, which has become the de facto standard for video, uniformity in the delivery of video is observed. With constraints of memory and transmission bandwidth, […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
