Posts
Oct, 6
Real-time Multi-view Depth Generation Using CUDA Multi-GPU
In this paper, we propose a real-time multi-view depth generation method that uses the Compute Unified Device Architecture (CUDA) on multiple graphics processing units (GPUs). The objective is to generate multi-view depth maps in real time. We employ eight color cameras and three depth cameras. After capturing multi-view color and depth data, we warp the depth information to color camera […]
Oct, 6
Accelerating NTRU Encryption with Graphics Processing Units
Lattice-based cryptography is attractive for its resistance to quantum computing attacks and its efficient encryption/decryption process. However, large data volumes remain a problem for most lattice-based cryptographic systems because overall processing becomes too slow. This paper aims to analyze one of the major lattice-based cryptographic systems, Nth-degree truncated polynomial ring (NTRU), and accelerate […]
Oct, 6
Load Balancing in Data Warehouse – Evolution and Perspectives
Load balancing is one of the crucial problems in distributed data warehouse systems. In this article, original load balancing algorithms are presented: the Adaptive Load Balancing Algorithms for Queries (ALBQ) and the algorithms that use grammars and learning machines to manage the ETL process. These two algorithms base the load balancing on […]
Oct, 6
Embedding GPU Computations in Hadoop
As the size of high-performance applications increases, four major challenges have arisen in the underlying distributed systems: heterogeneity, programmability, fault resilience, and energy efficiency. To tackle all of them without sacrificing performance, traditional approaches to resource utilization, task scheduling, and programming paradigms should be reconsidered. While Hadoop has handled data-intensive applications well […]
Oct, 4
Deep Dynamic Neural Networks for Gesture Segmentation and Recognition
The purpose of this paper is to describe a novel method called Deep Dynamic Neural Networks (DDNN) for Track 3 of the ChaLearn Looking at People 2014 challenge [1]. A generalised semi-supervised hierarchical dynamic framework is proposed for simultaneous gesture segmentation and recognition, taking both skeleton and depth images as input modules. First, Deep Belief […]
Oct, 4
GPU Accelerated Radio Wave Propagation Modeling Using Ray Tracing
Radar producers, which are mostly in the defense industry, need a radar environment simulator to test their products during development. Such a simulator helps them avoid costly field tests. To develop a radar environment simulator, radio wave propagation must be modeled. However, this is a computationally expensive and time-consuming […]
Oct, 4
Parallel Shortest Path Algorithm for Voronoi Diagrams with Generalized Distance Functions
Voronoi diagrams are fundamental data structures in computational geometry with applications in many areas. Recent soft object simulation algorithms for real-time physics engines require the computation of Voronoi diagrams over 3D images with non-Euclidean distances. In this case, the computation must be performed over a graph, where the edges encode the required distance information. […]
Oct, 4
Teaching Parallel Programming Using Java
This paper presents an overview of the "Applied Parallel Computing" course taught to final-year Software Engineering undergraduate students in Spring 2014 at NUST, Pakistan. The main objective of the course was to introduce practical parallel programming tools and techniques for shared-memory and distributed-memory concurrent systems. A unique aspect of the course was that […]
Oct, 4
A massively parallel algorithm for constructing the BWT of large string sets
We present a new scalable, lightweight algorithm to incrementally construct the BWT and FM-index of large string sets such as those produced by Next Generation Sequencing. The algorithm is designed for massive parallelism and can effectively exploit the combination of low-capacity, high-bandwidth memory and slower external system memory typical of GPU-accelerated systems. […]
Oct, 3
Fast Automatic Heuristic Construction Using Active Learning
Building effective optimization heuristics is a challenging task that often takes developers several months, if not years, to complete. Predictive modelling has recently emerged as a promising solution, automatically constructing heuristics from training data. However, obtaining this data can take months per platform. This is becoming an ever more critical problem, and if no solution […]
Oct, 3
Microarchitectural Performance Characterization of Irregular GPU Kernels
GPUs are increasingly being used to accelerate general-purpose applications, including applications with data-dependent, irregular memory access patterns and control flow. However, relatively little is known about the behavior of irregular GPU codes, and there has been minimal effort to quantify the ways in which they differ from regular GPGPU applications. We examine the behavior of […]
Oct, 3
Stable large-scale solver for Ginzburg-Landau equations for superconductors
Understanding the interaction of vortices with inclusions in type-II superconductors is a major outstanding challenge both for fundamental science and energy applications. At application-relevant scales, the long-range interactions between a dense configuration of vortices and the dependence of their behavior on external parameters, such as temperature and an applied magnetic field, are all important to […]