
Posts

Feb, 27

A Method for Accelerating Bronchoscope Tracking Based on Image Registration by GPGPU

This paper presents an acceleration method for tracking a bronchoscope based on image registration. This method tracks a bronchoscope by image registration between real bronchoscopic images and virtual ones derived from CT images. However, since the computational cost of image registration, especially generating virtual bronchoscopic (VB) images, is quite high, it is difficult to track […]
Feb, 27

CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences

Biological sequence comparison is a very important operation in Bioinformatics. Even though there do exist exact methods to compare biological sequences, these methods are often neglected due to their quadratic time and space complexity. In order to accelerate these methods, many GPU algorithms were proposed in the literature. Nevertheless, all of them restrict the size […]
Feb, 27

Size Matters: Space/Time Tradeoffs to Improve GPGPU Applications Performance

GPUs offer drastically different performance characteristics compared to traditional multicore architectures. To explore the tradeoffs exposed by this difference, we refactor MUMmer, a widely-used, highly-engineered bioinformatics application which has both CPU- and GPU-based implementations. We synthesize our experience as three high-level guidelines to design efficient GPU-based applications. First, minimizing the communication overheads is as important […]
Feb, 27

Accelerating CUDA Graph Algorithms at Maximum Warp

Graphs are powerful data representations favored in many computational domains. Modern GPUs have recently shown promising results in accelerating computationally challenging graph problems but their performance suffers heavily when the graph structure is highly irregular, as most real-world graphs tend to be. In this study, we first observe that the poor performance is caused by […]
Feb, 27

A Domain-Specific Approach To Heterogeneous Parallelism

Exploiting heterogeneous parallel hardware currently requires mapping application code to multiple disparate programming models. Unfortunately, general-purpose programming models available today can yield high performance but are too low-level to be accessible to the average programmer. We propose leveraging domain-specific languages (DSLs) to map high-level application code to heterogeneous devices. To demonstrate the potential of this […]
Feb, 26

StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines

Multicore machines equipped with accelerators are becoming increasingly popular. The TOP500-leading RoadRunner machine is probably the most famous example of a parallel computer mixing IBM Cell Broadband Engines and AMD Opteron processors. Other architectures, featuring GPU accelerators, are expected to appear in the near future. To fully tap into the potential of these hybrid machines, […]
Feb, 26

A Tuning Framework for Software-Managed Memory Hierarchies

New architectures are emerging at a rapid pace: architectures with multiple processing units on a chip and with deep memory hierarchies have become pervasive, while architectures with software-managed memory hierarchies (such as the Sony/Toshiba/IBM Cell processor) have gained popularity. Due to the increased complexity of architectures, re-targeting a legacy application to a new architecture requires […]
Feb, 26

Real-Time Approaches to Computer Vision

Perhaps the extensive reliance on our visual sensory inputs makes the use of artificial visual sensors seem like an intuitive choice. Thus, Machine Vision or Computer Vision has become an exciting field of research, finding its way into many industrial applications. The results from Computer Vision research can be incorporated in autonomous machine navigation, industrial […]
Feb, 26

Believe it or Not! Multi-core CPUs Can Match GPU Performance for FLOP-intensive Application!

In this work, we evaluate the performance of a real-world image processing application that uses a cross-correlation algorithm to compare a given image with a reference one. The algorithm processes individual images represented as 2-dimensional matrices of single-precision floating-point values using O(n^4) operations involving dot-products and additions. We implement this algorithm on an nVidia GTX 285 […]
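
To make the O(n^4) cost mentioned above concrete, here is a minimal sketch of a naive 2D cross-correlation over all integer shifts. The truncated abstract does not show the paper's actual algorithm, so the function name `cross_correlate`, the unnormalized formulation, and the zero-padding at the borders are all assumptions made for illustration.

```python
def cross_correlate(image, reference):
    """Naive, unnormalized 2D cross-correlation (hypothetical helper).

    Both inputs are n x n lists of floats. For each of the (2n-1)^2
    shifts, up to n^2 products are summed, hence O(n^4) work overall.
    """
    n = len(image)
    size = 2 * n - 1
    out = [[0.0] * size for _ in range(size)]
    for dy in range(-(n - 1), n):          # vertical shift
        for dx in range(-(n - 1), n):      # horizontal shift
            total = 0.0
            for y in range(n):
                yy = y + dy
                if yy < 0 or yy >= n:
                    continue               # shifted row falls outside the reference
                for x in range(n):
                    xx = x + dx
                    if 0 <= xx < n:
                        total += image[y][x] * reference[yy][xx]
            out[dy + n - 1][dx + n - 1] = total
    return out
```

Correlating an image with itself puts the largest value at the zero-shift entry (the center of the output), which is a quick sanity check for a sketch like this. Each output entry is independent of the others, which is why this kind of computation parallelizes well.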
Feb, 26

GPU-Based Foreground-Background Segmentation Using an Extended Colinearity Criterion

We present a GPU-based foreground-background segmentation that processes image sequences in less than 4 ms per frame. Change detection with respect to the background is based on a color similarity test in a small pixel neighbourhood, and is integrated into a Bayesian estimation framework. An iterative MRF-based model is applied, exploiting parallelism on modern graphics hardware. Resulting segmentation […]
Feb, 26

Concurrent GPU Programming

Monte Carlo algorithms use repeated random sampling to find solutions to problems. One common example uses points randomly selected from the unit box to approximate the value of pi. Another example is a simulation called a virtual spectrophotometer which measures the reflectance of a modeled material [1]. The repetitive nature of Monte Carlo algorithms usually […]
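
The pi example mentioned above can be sketched in a few lines. This is a minimal CPU-only illustration, not the GPU implementation the post refers to; the function name `estimate_pi` and the `seed` parameter are assumptions introduced here. Points are drawn uniformly from the unit square, and the fraction landing inside the quarter circle of radius 1 approximates pi/4.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi by sampling points in the unit square (hypothetical helper)."""
    rng = random.Random(seed)              # private generator for reproducibility
    inside = 0
    for _ in range(num_samples):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:           # point lies inside the quarter circle
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4
    return 4.0 * inside / num_samples
```

Each sample is independent of the others, so the loop can be split across many threads with only a final sum to combine the partial counts.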
Feb, 26

Efficient GPU-Accelerated Elastic Image Registration

Elastic intra-patient registration can be used to correct for local motion within biomedical images. The application of elastic registration during interventional treatment is seriously hampered by its considerable computation time. Graphics Processing Units (GPUs) can be used to accelerate the calculation of such elastic registrations, without changing the basic registration algorithm. This article discusses […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
