Posts
Dec, 26
Accelerating Correlation Power Analysis Using Graphics Processing Units
Correlation Power Analysis (CPA) is a power-analysis-based side-channel attack that can be used to derive the secret key of encryption algorithms including DES (Data Encryption Standard) and AES (Advanced Encryption Standard). A typical CPA attack on unprotected AES is performed by analysing a few thousand power traces, which requires about […]
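The correlation step that gives CPA its name can be sketched roughly as follows. This is a minimal illustration and not the paper's GPU implementation. It assumes a Hamming-weight leakage model, synthetic traces with one sample each, and a stand-in 4-bit S-box instead of the AES S-box. The names `TOY_SBOX`, `simulate_traces`, and `cpa_best_key` are hypothetical.

```python
import random

# Stand-in 4-bit S-box: an arbitrary permutation used for illustration,
# NOT the AES S-box (assumption made to keep the sketch small).
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    """Number of set bits in x (the assumed leakage model)."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_traces(key, plaintexts, noise=0.5, seed=0):
    """Synthetic one-sample 'power traces': leakage plus Gaussian noise."""
    rng = random.Random(seed)
    return [hamming_weight(TOY_SBOX[p ^ key]) + rng.gauss(0, noise)
            for p in plaintexts]


def cpa_best_key(plaintexts, traces):
    """Score every key guess by |correlation| between model and traces."""
    scores = {}
    for guess in range(16):
        model = [hamming_weight(TOY_SBOX[p ^ guess]) for p in plaintexts]
        scores[guess] = abs(pearson(model, traces))
    return max(scores, key=scores.get), scores
```

Each key guess is scored independently of the others, which is the kind of structure that tends to map well onto parallel hardware.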
Dec, 26
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Deep Convolutional Neural Networks (DCNNs) have recently shown state-of-the-art performance in high-level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called "semantic image segmentation"). We show that responses at the final […]
Dec, 26
Computationally Efficient Implementation of a Hamming Code Decoder using a Graphics Processing Unit
This paper presents a computationally efficient implementation of a Hamming code decoder on a graphics processing unit (GPU) to support real-time software-defined radio (SDR), which is a software alternative for realizing wireless communication. The Hamming code algorithm is challenging to parallelize effectively on a GPU because it works on sparsely located data items with several […]
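For readers unfamiliar with Hamming decoding, a minimal CPU sketch of syndrome decoding is shown below. It assumes the Hamming(7,4) code with parity bits at positions 1, 2, and 4. The excerpt does not state which Hamming code the paper uses, and this is not its GPU implementation. The function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data_bits, syndrome)."""
    bits = list(codeword)
    # XOR of the 1-based positions of all set bits: zero for a valid
    # codeword, otherwise the position of the single flipped bit.
    syndrome = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            syndrome ^= position
    if syndrome:
        bits[syndrome - 1] ^= 1
    return [bits[2], bits[4], bits[5], bits[6]], syndrome
```

For example, encoding `[1, 0, 1, 1]` and flipping the bit at position 5 yields a syndrome of 5, and the decoder recovers the original four data bits.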
Dec, 26
Fast Convolutional Nets With fbfft: A GPU Performance Evaluation
We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA’s cuFFT library, and another based on a Facebook-authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. […]
Dec, 22
Legion: Programming Distributed Heterogeneous Architectures with Logical Regions
This thesis covers the design and implementation of Legion, a new programming model and runtime system for targeting distributed heterogeneous machine architectures. Legion introduces logical regions as a new abstraction for describing the structure and usage of program data. We describe how logical regions provide a mechanism for applications to express important properties of program […]
Dec, 22
A Domain-specific Language to Facilitate Software Defined Radio Parallel Executable Patterns Deployment on Heterogeneous Architectures
In this paper, we present a domain-specific language, referred to as OptiSDR, that matches high-level digital signal processing (DSP) routines for software-defined radio (SDR) to their generic parallel executable patterns targeted to heterogeneous computing architectures (HCAs). These HCAs include a combination of hybrid GPU-CPU and DSP-FPGA architectures that are programmed using different programming […]
Dec, 22
Purine: A bi-graph based deep learning framework
In this paper, we introduce a novel deep learning framework, termed Purine. In Purine, a deep network is expressed as a bipartite graph (bi-graph), which is composed of interconnected operators and data tensors. With the bi-graph abstraction, networks are easily solvable with an event-driven task dispatcher. We then demonstrate that different parallelism schemes over GPUs and/or […]
Dec, 22
Manycore processing of repeated k-NN queries over massive moving objects observations
The ability to process significant amounts of continuously updated spatial data in a timely manner is mandatory for an increasing number of applications. In this paper we focus on a specific data-intensive problem concerning the repeated processing of huge amounts of k nearest neighbours (k-NN) queries over massive sets of moving objects, where the spatial extents of queries […]
Dec, 22
PyFAI: a Python library for high performance azimuthal integration on GPU
The pyFAI package has been designed to reduce X-ray diffraction images into powder diffraction curves to be further processed by scientists. This contribution describes how to convert an image into a radial profile using the NumPy package and how the process was accelerated using Cython. The algorithm was parallelised, needing a complete re-design to benefit from […]
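The idea of turning an image into a radial profile can be sketched as follows. This is a simplified illustration and not pyFAI's API or algorithm. It assumes a flat detector, a known beam centre in pixel coordinates, and plain averaging in equal-width radial bins, and it uses only the standard library rather than NumPy so that it runs anywhere. The function `radial_profile` and its parameters are hypothetical.

```python
import math


def radial_profile(image, center, n_bins, r_max):
    """Average pixel intensity in equal-width bins of distance from center.

    image  : list of rows, each a list of numbers
    center : (row, col) of the assumed beam centre, in pixels
    n_bins : number of radial bins
    r_max  : pixels at distance >= r_max are ignored
    """
    cy, cx = center
    bin_width = r_max / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(y - cy, x - cx)
            b = int(r / bin_width)
            if b < n_bins:
                sums[b] += value
                counts[b] += 1
    # Empty bins are reported as 0.0 rather than dividing by zero.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Every pixel contributes to exactly one bin, so the loop amounts to a weighted histogram over radius divided by an unweighted one.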
Dec, 22
GPU Pro 5: Advanced Rendering Techniques
In GPU Pro 5: Advanced Rendering Techniques, section editors Wolfgang Engel, Christopher Oat, Carsten Dachsbacher, Michal Valient, Wessam Bahnassi, and Marius Bjorge have once again assembled a high-quality collection of cutting-edge techniques for advanced graphics processing unit (GPU) programming. Divided into six sections, the book covers rendering, lighting, effects in image space, mobile devices, 3D engine […]
Dec, 22
Accelerating Ab Initio Nuclear Physics Calculations with GPUs
This paper describes some applications of GPU acceleration in ab initio nuclear structure calculations. Specifically, we discuss GPU acceleration of the software package MFDn, a parallel nuclear structure eigensolver. We modify the matrix construction stage to run partly on the GPU. On the Titan supercomputer at the Oak Ridge Leadership Computing Facility, this produces a […]
Dec, 22
GPGPU-Sim
This thesis studies the impact of graphics card hardware features on the performance of GPU computing, using the GPGPU-Sim simulation tool. GPU computing is a growing topic in the world of computing, and could be an important milestone for computers. Therefore, such a study that seeks to identify the performance bottlenecks of the program with […]