Posts

Dec, 25

Design and Development of Optical Flow Based Obstacle Avoidance Using CUDA

Autonomous vehicles and robots that navigate based on camera input need techniques for obstacle avoidance for successful navigation. Optical Flow based obstacle avoidance is one of the most popular techniques. Real-time motion estimation remains a challenge because of its high computational expense. The traditional CPU-based schemes satisfy the power, size and computation requirements. Graphical Processing […]
Dec, 25

GPU-Parallel Implementation of Color based Medical Image Retrieval in Compressed Domain

In huge databases, image processing takes longer to execute on a single-core processor because of slow single-threaded algorithms. Graphics Processing Units (GPUs) have become popular due to their speed, programmability, low cost and large number of built-in execution cores. Most researchers have started working to use GPUs as a processing […]
Dec, 25

Parallel Implementation of Shape based Image Retrieval Approach on CUDA in Compressed Domain

Fast and accurate algorithms are necessary for content-based image retrieval (CBIR) systems to perform operations on compressed image databases, such as JPEG, or through compressive sensing. Feature extraction and feature matching are two important steps in any CBIR system. Incorrect matches can reduce the accuracy of CBIR systems. The matching of query image […]
Dec, 25

A GPU-supported High-Level Programming Language for Image Processing

Real-time image/video processing applications are now in demand with the advance of general-purpose computers and mobile devices. However, programmers have to handle the digital images directly and be aware of resolutions and pixels. This makes image processing programming unintuitive. On the other hand, image/video processing typically exhibits data parallelism, and the performance gains are […]
Dec, 24

Toolchain for programming, simulating and studying the XMT many-core architecture

Explicit Multi-Threading (XMT) is a general-purpose many-core computing platform, with the vision of a 1000-core chip that is easy to program but does not compromise on performance. This paper presents a publicly available tool chain for XMT, complete with a highly configurable cycle-accurate simulator and an optimizing compiler. The XMT tool chain has matured […]
Dec, 24

High Performance and Scalable Radix Sorting: A case study of implementing dynamic parallelism for GPU computing

The need to rank and order data is pervasive, and many algorithms are fundamentally dependent upon sorting and partitioning operations. Prior to this work, GPU stream processors have been perceived as challenging targets for problems with dynamic and global data-dependences such as sorting. This paper presents: (1) a family of very efficient parallel algorithms for […]
Dec, 24

Efficient Use of In-Game Ray-Tracing Techniques

Ray tracing is a computationally demanding image generation technique capable of creating photo-realistic images. Due to its high demand for computational power, ray tracing is not used for real-time applications; however, with the massively parallel capabilities of current Graphics Processing Units, and the fact that ray tracing is a suitable application for parallel processing, the use of […]
Dec, 24

Sorting on GPUs for large scale datasets: A thorough comparison

Although sorting has been extensively studied in many research works, it remains a challenge, particularly when we consider the implications of novel processor technologies such as manycores (e.g. GPUs, Cell/BE, multicore). In this paper, we compare different algorithms for sorting integers on stream multiprocessors and we discuss their viability on large datasets […]
Dec, 24

An Experiment in Parallelizing the Fast Fourier Transform

We present the parallel implementation of two new algorithms developed for the discrete cosine transform. These algorithms support the new interleaved fast Fourier transform method. Our techniques were realized using the MPI standard library and executed on a variety of equipment for comparison. The results indicate a promising fresh direction in the search for efficient […]
Dec, 24

Quasi-maximum Accuracy Floating-point Computations with GPGPU for Applications in Digital Signal Processing

This paper presents the idea of using two accumulators to improve the precision of floating-point computations on graphics processing units (GPUs) for applications in digital signal processing. The increased precision does not require any increase in the length of the data words. This is particularly important […]
Dec, 24

Using Artificial Intelligence in Computational Games

This paper presents the conceptual definition of a framework to help in the construction of intelligent games, in which Artificial Intelligence techniques can be integrated into the game more easily. The requirements analysis of the main AI techniques is presented, as well as the preliminary results of this framework.
Dec, 24

Scalable multi-GPU implementation of the MAGFLOW simulator

We have developed a robust and scalable multi-GPU (Graphics Processing Unit) version of the cellular-automaton-based MAGFLOW lava simulator. The cellular automaton is partitioned into strips that are assigned to different GPUs, with minimal overlapping. For each GPU, a host thread is launched to manage allocation, deallocation, data transfer and kernel launches; the main host thread […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
