Posts
Apr, 12
Automated development of applications for graphical processing units using rewriting rules
Recently there has been active development of parallel programming methods for implementing general-purpose algorithms on graphical processing units (GPUs). Using this specialized hardware can increase performance significantly, but it requires low-level programming and an understanding of the details of the underlying hardware and software platform. Therefore, there is a need to automate the development process. This paper presents a technique […]
Apr, 12
Stream-Centric Stereo Matching and View Synthesis: A High-Speed Approach on GPUs
In this paper, we propose a real-time image-based rendering (IBR) system. It is specifically designed for photorealistic view synthesis at high speed on the graphics processing unit (GPU). We steer the proposed IBR system design with two high-level ideas. First, for cost-effective IBR, as long as the synthesized views look visually plausible, the estimated disparity and […]
Apr, 12
An Analytical Approach to the Design of Parallel Block Cipher Encryption/Decryption: A CPU/GPU Case Study
GPUs are at the forefront of a radical transformation that is taking place in software design. The ability to process multiple data streams simultaneously is delivering substantial benefits to a large collection of domains. Depending on the application, these benefits can be expanded by utilizing the not-insignificant power of traditional CPUs. Multi-core CPUs with a […]
Apr, 12
GPU Accelerated Path-Planning for Multi-agents in Virtual Environments
Many games are populated by synthetic humanoid actors that act as autonomous agents. The animation of humanoids in real-time applications is still a challenge if the problem involves attaining a precise location in a virtual world (path-planning) and moving realistically according to the agent's own personality, intentions and mood (motion planning). In this paper we present […]
Apr, 12
The fast evaluation of hidden Markov models on GPU
It is compute-intensive to evaluate the probability of an observation sequence on a hidden Markov model. Some fast algorithms exist; the forward-backward procedure is the most popular among them. The forward-backward procedure saves much computation, but its time complexity is still O(N^2 T); in other words, the algorithm remains computationally expensive. […]
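To make the N^2 T cost concrete, here is a minimal CPU sketch of the forward pass that evaluates the probability of an observation sequence. It is not the GPU method from the post. The function name `forward_probability` and the parameter names `pi`, `A`, `B` and `obs` are illustrative assumptions, and the sketch assumes N states, a non-empty sequence of T discrete observation symbols, and plain Python lists.

```python
def forward_probability(pi, A, B, obs):
    """Return P(obs | model) using the forward pass of an HMM.

    pi[i]   : initial probability of state i            (assumed name)
    A[i][j] : transition probability from state i to j  (assumed name)
    B[i][k] : probability of emitting symbol k in state i
    obs     : non-empty list of observed symbol indices
    """
    n = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: each of the remaining T-1 steps does N*N multiply-adds,
    # which is where the N^2 T cost comes from.
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    # Termination: sum over all final states.
    return sum(alpha)
```

The N values computed inside each step are independent of one another, which is the kind of structure a parallel implementation could exploit.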
Apr, 12
Accelerating System-Level Design Tasks Using Commodity Graphics Hardware: A Case Study
Many system-level design tasks (e.g. timing analysis, hardware/software partitioning and design space exploration) involve computational kernels that are intractable (usually NP-hard). As a result, they involve high running times even for mid-sized problems. In this paper we explore the possibility of using commodity graphics processing units (GPUs) to accelerate such tasks that commonly arise in […]
Apr, 12
Simulating Spiking Neural P systems without delays using GPUs
In this paper we present our work on simulating a type of P system known as a spiking neural P system (SNP system) using graphics processing units (GPUs). GPUs, because of their architectural optimization for parallel computations, are well-suited for highly parallelizable problems. Due to the advent of general purpose GPU computing in recent years, […]
Apr, 11
Interactive Simulation and Visualization of Fluids with Surface Raycasting
We present a method to couple particle-based fluid simulation methods such as Smoothed Particle Hydrodynamics (SPH) and volume rendering in order to visualize the fluid. A volume is generated from the fluid’s implicit density field so volume raycasting can be performed to render the surface on the GPU. The volume generation algorithm is also implemented […]
Apr, 11
Real-time 3-D object recognition using scale invariant feature transform and stereo vision
Scale invariant feature transform (SIFT) and stereo vision are applied together to recognize objects in real time. This work reports the performance of a real-time feature detector based on a GPU (graphics processing unit) in capturing the features of 3D objects when the objects undergo rotational and translational motions in cluttered backgrounds. We have compared the performance […]
Apr, 11
EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system
Future mainstream microprocessors will likely integrate specialized accelerators, such as GPUs, onto a single die to achieve better performance and power efficiency. However, it remains a keen challenge to program such a heterogeneous multicore platform, since these specialized accelerators feature ISAs and functionality that are significantly different from the general purpose CPU cores. In this […]
Apr, 11
CULA: hybrid GPU accelerated linear algebra routines
The modern graphics processing unit (GPU) found in many standard personal computers is a highly parallel math processor capable of nearly 1 TFLOPS peak throughput at a cost similar to a high-end CPU and an excellent FLOPS/watt ratio. High-level linear algebra operations are computationally intense, often requiring O(N^3) operations and would seem a natural fit […]
Apr, 11
Clustering coefficient queries on massive dynamic social networks
The Clustering Coefficient (CC) is a fundamental measure in social network analysis assessing the degree to which nodes tend to cluster together. While CC computation on static graphs is well studied, emerging applications have new requirements for online query of the “global” CC of a given subset of a graph. As social networks are widely […]
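As a small illustration of the measure itself, the sketch below computes a global clustering coefficient on a static, undirected graph. It assumes the common transitivity definition (closed connected triples divided by all connected triples), which may differ from the definition used in the post, and it does not address dynamic graphs or online queries. The function name `global_clustering_coefficient` and the adjacency-set input format are illustrative assumptions.

```python
from itertools import combinations


def global_clustering_coefficient(adj):
    """Return closed triples / all connected triples for an undirected graph.

    adj maps each node to the set of its neighbours (assumed input format);
    every edge must appear in both directions.
    """
    closed = 0
    triples = 0
    for centre, neighbours in adj.items():
        # Every unordered pair of neighbours forms a connected triple
        # centred on this node.
        for a, b in combinations(neighbours, 2):
            triples += 1
            # The triple is closed when the two outer nodes are adjacent.
            if b in adj[a]:
                closed += 1
    # Each triangle is counted three times in `closed`, once per centre,
    # so the ratio equals 3 * triangles / connected triples.
    return closed / triples if triples else 0.0
```

For example, a triangle with one extra node attached to a single corner has five connected triples, three of them closed, so the function returns 0.6.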