Posts

Nov, 27

Jitter analysis of PLL-generated clock propagation using Jitter Mitigation techniques with laser voltage probing

A new Jitter Mitigation feature in the latest generation laser voltage probing (LVP) tool effectively removes PLL jitter from LVP waveforms [Ng Yin S, Lo W, Wilsher K. Next generation laser voltage probing. In: Proceeding, international symposium on testing and failure analysis; 2008. p. 249]. It facilitates the probing of phase-locked loop (PLL) driven circuitry […]
Nov, 27

Optimizing the SUSAN corner detection algorithm for a high speed FPGA implementation

In many embedded systems for video surveillance, distinctive features are used to detect objects. This contribution describes a real-time FPGA implementation of a feature detector, namely the SUSAN algorithm. Because the original SUSAN algorithm performs poorly on non-synthetic images, a significant quality improvement of this algorithm is presented. The hardware accelerator […]
Nov, 27

Simulation of Shallow-Water systems using Graphics Processing Units

This paper addresses the speedup of the numerical solution of shallow-water systems in 2D domains by using modern Graphics Processing Units (GPUs). A first order well-balanced finite volume numerical scheme for 2D shallow water systems is considered. The potential data parallelism of this method is identified and the scheme is efficiently implemented on GPUs for […]
Nov, 27

A survey of BRDF models for computer graphics

To produce photo-realistic images in computer graphics, we must effectively describe the interactions between light and surfaces. In this paper, we focus on Bidirectional Reflectance Distribution Functions (BRDFs), which characterize these interactions. We survey most BRDF representations introduced so far and investigate their usage, importance and applications. We look in detail at their […]
Nov, 27

A massively parallel adaptive fast-multipole method on heterogeneous architectures

We present new scalable algorithms and a new implementation of our kernel-independent fast multipole method (Ying et al. ACM/IEEE SC ’03), in which we employ both distributed memory parallelism (via MPI) and shared memory/streaming parallelism (via GPU acceleration) to rapidly evaluate two-body non-oscillatory potentials. On traditional CPU-only systems, our implementation scales well up to 30 […]
Nov, 27

Stream processing for fast and efficient rotated Haar-like features using rotated integral images

An extended set of Haar-like features for image sensors, beyond the standard vertically and horizontally aligned Haar-like features and the 45-degree twisted Haar-like features, is introduced. The extended rotated Haar-like features are based on the standard Haar-like features, rotated by whole-integer pixel-based rotations. These rotated feature values can also […]
Nov, 27

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/power ratio can be attained using specialized hardware accelerators. These accelerators require specific SDKs or programming languages, which are not always easy to use. Thus, the impact of the new programming paradigms on the […]
Nov, 27

Parallel smoothing of quad meshes

For use in real-time applications, we present a fast algorithm for converting a quad mesh to a smooth, piecewise polynomial surface on the Graphics Processing Unit (GPU). The surface has well-defined normals everywhere and closely mimics the shape of Catmull-Clark subdivision surfaces. It consists of bicubic splines wherever possible, and a new class of […]
Nov, 27

Quantum computer simulation using the CUDA programming model

Quantum computing is emerging as a field of great theoretical interest. Its simulation is a problem with high memory and computational requirements, which makes the use of parallel platforms advisable. In this work we deal with the simulation of an ideal quantum computer on the Compute Unified Device Architecture (CUDA), as such a problem […]
Nov, 27

Predictive Runtime Code Scheduling for Heterogeneous Architectures

Heterogeneous architectures are currently widespread. With the advent of easy-to-program general purpose GPUs, virtually every recent desktop computer is a heterogeneous system. Combining the CPU and the GPU brings great amounts of processing power. However, such architectures are often used in a restricted way for domain-specific applications like scientific applications and games, and they tend […]
Nov, 27

Exploiting graphical processing units for data-parallel scientific applications

Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms – regular mesh field […]
Nov, 27

CUDA-MEME: Accelerating Motif Discovery in Biological Sequences Using CUDA-enabled Graphics Processing Units

Motif discovery in biological sequences is of prime importance and a major challenge in computational biology. Consequently, numerous motif discovery tools have been developed to date. However, the rapid growth of both genomic sequence and gene transcription data establishes the need for scalable motif discovery tools. An approach to improve the runtime […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
