2501

Posts

Jan, 5

A complete modular resultant algorithm targeted for realization on graphics hardware

This paper presents a complete modular approach to computing bivariate polynomial resultants on Graphics Processing Units (GPU). Given two polynomials, the algorithm first maps them to a prime field for sufficiently many primes, and then processes each modular image individually. We evaluate each polynomial at several points and compute a set of univariate resultants for […]
Jan, 5

Modular Resultant Algorithm for Graphics Processors

In this paper we report on the recent progress in computing bivariate polynomial resultants on Graphics Processing Units (GPU). Given two polynomials in Z[x,y], our algorithm first maps the polynomials to a prime field. Then, each modular image is processed individually. The GPU evaluates the polynomials at a number of points and computes univariate modular […]
Jan, 5

Ubiquitous Parallel Computing from Berkeley, Illinois, and Stanford

The ParLab at Berkeley, UPCRC-Illinois, and the Pervasive Parallel Laboratory at Stanford are studying how to make parallel programming succeed given industry’s recent shift to multicore computing. All three centers assume that future microprocessors will have hundreds of cores and are working on applications, programming environments, and architectures that will meet this challenge. This article […]
Jan, 4

Nuclei: GPU-Accelerated Many-Core Network Coding

While it is a well known result that network coding achieves optimal flow rates in multicast sessions, its potential for practical use has remained to be a question, due to its high computational complexity. Our previous work has attempted to design a hardware-accelerated and multi-threaded implementation of network coding to fully utilize multi-core CPUs, as […]
Jan, 4

Practical Random Linear Network Coding on GPUs

Recently, random linear network coding has been widely applied in peer-to-peer network applications. Instead of sharing the raw data with each other, peers in the network produce and send encoded data to each other. As a result, the communication protocols have been greatly simplified, and the applications experience higher end-to-end throughput and better robustness to […]
Jan, 4

A parameterisable and scalable Smith-Waterman algorithm implementation on CUDA-compatible GPUs

This paper describes a multi-threaded parallel design and implementation of the Smith-Waterman (SM) algorithm on compute unified device architecture (CUDA)-compatible graphic processing units (GPUs). A novel technique has been put forward to solve the restriction on the length of the query sequence in previous GPU implementations of the Smith-Waterman algorithm. The main reasons behind this […]
Jan, 4

Approximate Belief Propagation by Hierarchical Averaging of Outgoing Messages

This paper presents an approximate belief propagation algorithm that replaces outgoing messages from a node with the averaged outgoing message and propagates messages from a low resolution graph to the original graph hierarchically. The proposed method reduces the computational time by half or two-thirds and reduces the required amount of memory by 60% compared with […]
Jan, 4

Speeding up subset seed algorithm for intensive protein sequence comparison

Sequence similarity search is a common and repeated task in molecular biology. The rapid growth of genomic databases leads to the need of speeding up the treatment of this task. In this paper, we present a subset seed algorithm for intensive protein sequence comparison. We have accelerated this algorithm by using indexing technique and fine […]
Jan, 4

Fast GPU-based space-time correlation for activity recognition in video sequences

Action recognition is becoming an important component of many computer vision applications such as video surveillance, video indexing and browsing. However most of the space time approaches to action recognition are very computationally expensive which prevents us from using them in real-time applications. This paper describes how Graphic Processing Units (GPUs) can be used in […]
Jan, 4

GPU-Assisted Z-Field Simplification

Height fields and depth maps which we collectively refer to as z-fields, usually carry a lot of redundant information and are often used in real-time applications. This is the reason why efficient methods for their simplification are necessary. On the other hand, the computation power and programmability of commodity graphics hardware has significantly grown. We […]
Jan, 4

Local Search Algorithms on Graphics Processing Units. A Case Study: The Permutation Perceptron Problem

Optimization problems are more and more complex and their resource requirements are ever increasing. Although metaheuristics allow to significantly reduce the computational complexity of the search process, the latter remains time-consuming for many problems in diverse domains of application. As a result, the use of GPU has been recently revealed as an efficient way to […]
Jan, 4

Efficient Multiplication of Polynomials on Graphics Hardware

We present the algorithm to multiply univariate polynomials with integer coefficients efficiently using the Number Theoretic transform (NTT) on Graphics Processing Units (GPU). The same approach can be used to multiply large integers encoded as polynomials. Our algorithm exploits fused multiply-add capabilities of the graphics hardware. NTT multiplications are executed in parallel for a set […]

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: