Posts
May, 12
Parallel Cryptanalysis
Most of today’s cryptographic primitives are based on computations that are hard to perform for a potential attacker but easy to perform for somebody who is in possession of some secret information, the key, that opens a back door in these hard computations and allows them to be solved in a small amount of time. […]
May, 12
Multi-dimensional characterization of electrostatic surface potential computation on graphics processors
BACKGROUND: Calculating the electrostatic surface potential (ESP) of a biomolecule is critical to understanding biomolecular function. Because of its quadratic computational complexity (as a function of the number of atoms in a molecule), there have been continual efforts to reduce its complexity either by improving the algorithm or the underlying hardware on which the calculations […]
May, 12
Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications
In this paper we identify important classes of program control flows in applications targeted to commercially available graphics processing units (GPUs) and characterize their presence in real workloads such as those that occur in CUDA and OpenCL. Broadly, control flow can be characterized as structured or unstructured. It is shown that most existing techniques for […]
May, 12
Enhancing GPU Parallelism in Nature-Inspired Algorithms
We present GPU implementations of two different nature-inspired optimization methods for well-known optimization problems. Ant Colony Optimization (ACO) is a two-stage population-based method modelled on the foraging behaviour of ants, while P systems provide a high-level computational modelling framework that combines the structure and dynamic aspects of biological systems (in particular, their parallel and non-deterministic […]
May, 11
Efficient Parallelization of Natural Language Applications using GPUs
As we enter the era of mobile computing, high-quality and efficient natural language applications, which greatly facilitate intelligent human-computer interaction, become more and more important. Unfortunately, most high-quality natural language applications employ large statistical models, which render them impractical for real-time use. Meanwhile, Graphics Processing Units (GPUs) have become widely available, offering the opportunity to […]
May, 11
Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System
The Uintah Computational Framework was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids on large-scale, long-running, data-intensive problems. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with a novel asynchronous task-based approach with fully automated load balancing. Uintah demonstrates excellent weak and strong scalability […]
May, 11
Data Regression with Normal Equation on GPU using CUDA
Demand in the consumer market for graphics hardware that accelerates rendering of 3D images has resulted in graphics cards that are capable of delivering astonishing levels of performance. These results were achieved by specifically tailoring the hardware for the target domain. As graphics accelerators become increasingly programmable, however, this performance has made them an attractive […]
May, 11
Large scale parallel state space search utilizing graphics processing units and solid state disks
The evolution of science is a double-track process composed of theoretical insights on the one hand and practical inventions on the other. While in most cases new theoretical insights motivate hardware developers to produce systems following the theory, in some cases the demonstrated hardware solutions force theoretical research to forecast the results to expect. […]
May, 11
CAPRI: Prediction of Compaction-Adequacy for Handling Control-Divergence in GPGPU Architectures
Wide SIMD-based GPUs have evolved into a promising platform for running general purpose workloads. Current programmable GPUs allow even code with irregular control to execute well on their SIMD pipelines. To do this, each SIMD lane is considered to execute a logical thread where hardware ensures that control flow is accurate by automatically applying masked […]
May, 10
Enhancing data parallelism for Ant Colony Optimization on GPUs
Graphics Processing Units (GPUs) have evolved into highly parallel and fully programmable architectures over the past five years, and the advent of CUDA has facilitated their application to many real-world problems. In this paper, we deal with a GPU implementation of Ant Colony Optimisation (ACO), a population-based optimisation method which comprises two major stages: Tour […]
May, 10
A GPU-Accelerated Algorithm for Self-Organizing Maps in a Distributed Environment
In this paper we introduce a MapReduce-based implementation of self-organizing maps that performs compute-bound operations on distributed GPUs. The kernels are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVIDIA Tesla M2050 GPUs attached to each, […]
May, 10
An Efficient Common Substrings Algorithm for On-the-Fly Behavior-Based Malware Detection and Analysis
It is well known that malware (worms, botnets, etc.) thrives on communication systems. The process of detecting and analyzing malware is slow and not well suited to real-time application, which is critical especially for propagating malware. For this reason, recent methods identify similarities among malware dynamic trace logs to extract malicious behavior snippets. These snippets […]