Posts
Apr, 5
A GPGPU-Based Collision Detection Algorithm
A GPGPU-based collision detection algorithm is proposed. First, the OBB hierarchy tree and the triangles of the tested objects are mapped into data textures designed for GPGPU-based calculation, such as triangle vertex textures, a bounding box size texture, and a tree node relationship texture; these textures are then downloaded to the GPU to complete the data preparation. […]
Apr, 5
GPGPU supported cooperative acceleration in molecular dynamics
Molecular dynamics simulations have become a significant computational approach for studying complicated physical phenomena at the atomic level. Nevertheless, accurate simulations are limited in size and timescale by the available computing resources, which makes them very time-consuming and leads to tremendous computational requirements. Speeding up this process is therefore crucial. […]
Apr, 5
Parallelizing Simulated Annealing-Based Placement Using GPGPU
Simulated annealing has become the de facto standard for FPGA placement engines since it provides high-quality solutions and is robust under a wide range of objective functions. However, this method will soon become prohibitive because of its sequential nature and because single-core processor performance has stagnated. General purpose computing on graphics processing […]
Apr, 5
GPGPU-FDTD method for 2-dimensional electromagnetic field simulation and its estimation
For signal/power integrity analysis of high-density packages and printed circuit boards, the FDTD (Finite-Difference Time-Domain) method has been widely used. To apply it to large-scale problems, a variety of acceleration techniques are required. This paper describes a GPGPU-FDTD (General Purpose computing on GPU (Graphics Processing Unit)-Finite-Difference Time-Domain) method for massively parallel electromagnetic […]
Apr, 4
A Case Study of SWIM: Optimization of Memory Intensive Application on GPGPU
Recently, GPGPU has been widely adopted in the High Performance Computing (HPC) field. The limited global memory bandwidth poses a great challenge to many GPGPU programmers trying to exploit parallelism within the CPU-GPU heterogeneous platform. In this paper, we choose SWIM, a typical memory-intensive application from the SPEC OMP 2001 benchmark suite, for case […]
Apr, 4
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
We consider the problem of how to improve memory latency tolerance in massively multithreaded GPGPUs when the thread-level parallelism of an application is not sufficient to hide memory latency. One solution used in conventional CPU systems is prefetching, both in hardware and software. However, we show that straightforwardly applying such mechanisms to GPGPU systems does […]
Apr, 4
GPGPU implementation of a synaptically optimized, anatomically accurate spiking network simulator
Simulation of biological spiking networks is becoming more relevant to understanding neuronal processes. An increasing proportion of these simulations focuses on large-scale modeling efforts. Unfortunately, the size of large networks is often limited by both computational power and memory. Computational power constrains both the maximum number of differential equations and the maximum number of […]
Apr, 4
GPGPU-based Latency Insertion Method: Application to PDN simulations
With the progress of high-density circuit integration technology, a variety of signal and power integrity problems have become serious and important for electronic design. This paper describes fast circuit simulation by GPGPU-LIM (GPGPU-based Latency Insertion Method). First, LIM, which is a fast algorithm, is reviewed. Next, implementation of LIM on the […]
Apr, 4
Migrating real-time depth image-based rendering from traditional to next-gen GPGPU
This paper focuses on the current revolution in using the GPU for general-purpose computations (GPGPU), and how to maximally exploit its powerful resources. Recently, the advent of next-generation GPGPU replaced the traditional way of exploiting the graphics hardware. We have migrated real-time depth image-based rendering – for use in contemporary 3DTV technology – and noticed […]
Apr, 4
The optimization of parallel Smith-Waterman sequence alignment using on-chip memory of GPGPU
Memory optimization is an important strategy for achieving high performance in sequence alignment implemented with CUDA on GPGPU. The Smith-Waterman (SW) algorithm is the most sensitive algorithm widely used for local sequence alignment, but it is very time-consuming. Although several parallel methods have been used in some studies and have shown good performance, advantages of GPGPU memory hierarchy […]
Apr, 4
Parallel connected-component labeling algorithm for GPGPU applications
This paper proposes a new connected-component labeling algorithm for GPGPU applications based on NVIDIA’s CUDA. Various approaches and algorithms for connected-component labeling with minimal execution time have been designed, but most of them have focused on optimizing CPU algorithms. Therefore, it is hard to apply these approaches to GPGPU programming models such […]
Apr, 4
Exploring GPGPU workloads: Characterization methodology, analysis and microarchitecture evaluation implications
GPUs are emerging as general-purpose high-performance computing devices. Growing GPGPU research has made numerous GPGPU workloads available. However, a systematic approach to characterizing these benchmarks and analyzing their implications for GPU microarchitecture design evaluation is still lacking. In this research, we propose a set of microarchitecture-agnostic GPGPU workload characteristics to represent them […]