Posts

Jul, 14

On Development, Feasibility, and Limits of Highly Efficient CPU and GPU Programs in Several Fields

With processor clock speeds having stagnated, parallel computing architectures have achieved a breakthrough in recent years. Emerging many-core processors like graphics cards run hundreds of threads in parallel, and vector instructions are experiencing a revival. Parallel processors with many independent but simple arithmetic logic units fail to execute serial tasks efficiently. However, their sheer parallel processing […]
Jul, 14

A Fine Grained Cycle Sharing System with Cooperative Multitasking on GPUs

The emergence of compute unified device architecture (CUDA), which has relieved application developers from having to understand complex graphics pipelines, has made the graphics processing unit (GPU) useful not only for graphics applications but also for general applications. In this paper, we present a cycle sharing system named GPU grid, which exploits idle GPU cycles […]
Jul, 14

GPU Techniques Applied to Euler Flow Simulations and Comparison to CPU Performance

With the decrease in cost of computing, and the increasingly friendly programming environments, the demand for computer generated models of real world problems has surged. Each generation of computer hardware becomes marginally faster than its predecessor, allowing for decreases in required computation time. However, the progression is slowing and will soon reach a barrier as […]
Jul, 14

Infiniband-Verbs on GPU: A case study of controlling an Infiniband network device from the GPU

Due to their massive parallelism and high performance per watt, GPUs have gained high popularity in high performance computing and are a strong candidate for future exascale systems. But communication and data transfer in GPU-accelerated systems remain a challenging problem. Since the GPU normally is not able to control a network device, today a hybrid-programming […]
Jul, 14

Benchmarking the Memory Hierarchy of Modern GPUs

Memory access efficiency is a key factor for fully exploiting the computational power of Graphics Processing Units (GPUs). However, many details of the GPU memory hierarchy are not released by the vendors. We propose a novel fine-grained benchmarking approach and apply it to two popular GPUs, namely Fermi and Kepler, to expose the previously unknown […]
Jul, 12

Parallel Implementations for Solving Shortest Path Problem using Bellman-Ford

In this paper, different parallel implementations of the Bellman-Ford algorithm on GPU using OpenCL are presented. These variants include Bellman-Ford for solving single source shortest path (SSSP) having two variants and Bellman-Ford for all pair shortest path (APSP) problems. Also, a comparative analysis of their performances on CPU and GPU is discussed in this paper. Write-write consistency […]
Jul, 12

SPH Fluids for Viscous Jet Buckling

We present a novel meshfree technique for animating free surface viscous liquids with jet buckling effects, such as coiling and folding. Our technique is based on Smoothed Particle Hydrodynamics (SPH) fluids and allows more realistic and complex viscous behaviors than the preceding SPH frameworks in computer animation literature. The viscous liquid is modeled by a […]
Jul, 12

Collision Detection: Broad Phase Adaptation from Multi-Core to Multi-GPU Architecture

We have presented several contributions on the collision detection optimization centered on hardware performance. We focus on the first step (Broad-phase) and propose three new ways of parallelization of the well-known Sweep and Prune algorithm. We first developed a multi-core model that takes into account the number of available cores. Multi-core architecture enables us to distribute […]
Jul, 12

Parallelized Hierarchical Expected Matching Probability for Multiple Sequence Alignment

The alignment of two or more biological sequences, such as protein, DNA (deoxyribonucleic acid), or RNA (ribonucleic acid), is called MSA (Multiple Sequence Alignment). Sequence homology can be inferred from the resulting MSA. The existing system uses a dynamic programming technique, which suffers from exponential growth of time as the sequence grows. A Hierarchical Expected […]
Jul, 12

Using the GPU for Fast Symmetry-Based Dense Stereo Matching in High Resolution Images

SymStereo is a new algorithm used for stereo estimation. Instead of measuring photo-similarity, it proposes novel cost functions that measure symmetry for evaluating the likelihood of two pixels being a match. In this work we propose a parallel approach of the LogN matching cost variant of SymStereo capable of processing pairs of images in real-time […]
Jul, 11

Combining Data Parallelism and Task Parallelism for Efficient Performance on Hybrid CPU and GPU Systems

In earlier times, computer systems had only a single core or processor. In these computers, the number of transistors on-chip (i.e. on the processor) doubled every two years and all applications enjoyed free speedup. Subsequently, with more and more transistors being packed on-chip, power consumption became an issue, frequency scaling reached its limits and industry […]
Jul, 11

Programming-Model Centric Debugging for Multicore Embedded Systems

In this thesis, we propose to study interactive debugging of applications running on embedded Multi-Processor System on Chip (MPSoC) systems. A literature study showed that nowadays, the design and development of these applications rely more and more on programming models and development frameworks. These environments gather established algorithmic and programming good practices, and hence speed up […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
