Posts
Aug, 11
SINGA: Putting Deep Learning in the Hands of Multimedia Users
Recently, deep learning techniques have enjoyed success in various multimedia applications, such as image classification and multimodal data analysis. Two key factors behind deep learning’s remarkable achievement are the immense computing power and the availability of massive training datasets, which enable us to train large models to capture complex regularities of the data. There are […]
Aug, 11
Optimizing Strassen matrix multiply on GPUs
Many-core systems are designed for applications with large data parallelism. Strassen Matrix Multiply (MM) can be formulated as a depth-first search (DFS) traversal of a recursion tree where all cores work in parallel on computing each of the NxN sub-matrices, which reduces storage at the cost of large data motion to gather and […]
Aug, 11
A Parallel Implementation of the Self Organising Map using OpenCL
The self-organising map is a machine learning algorithm used to produce low-dimensional representations of high-dimensional data. While the process is becoming more and more useful with the rise of big data, it is hindered by the sheer amount of time the algorithm takes to run serially. This project produces a parallel version […]
Aug, 11
GPU-Disasm: A GPU-based x86 Disassembler
Static binary code analysis and reverse engineering are crucial operations for malware analysis, binary-level software protections, debugging, and patching, among many other tasks. Faster binary code analysis tools are necessary for tasks such as analyzing the multitude of new malware samples gathered every day. Binary code disassembly is a core functionality of such tools which […]
Aug, 10
Places205-VGGNet Models for Scene Recognition
VGGNets have turned out to be effective for object recognition in still images. However, directly adapting VGGNet models trained on the ImageNet dataset to scene recognition does not yield good performance. This report describes our implementation of training the VGGNets on the large-scale Places205 dataset. Specifically, we train three VGGNet models, […]
Aug, 10
Practical Algorithms for Finding Extremal Sets
The minimal sets within a collection of sets are defined as the ones which do not have a proper subset within the collection, and the maximal sets are the ones which do not have a proper superset within the collection. Identifying extremal sets is a fundamental problem with a wide range of applications in SAT solvers, […]
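The definition in this excerpt can be illustrated with a minimal sketch. It is a naive quadratic check written only to make the definition concrete, not the practical algorithms the post refers to; the function name `extremal_sets` is hypothetical.

```python
from typing import FrozenSet, Hashable, Iterable, List, Tuple


def extremal_sets(
    collection: Iterable[Iterable[Hashable]],
) -> Tuple[List[FrozenSet], List[FrozenSet]]:
    """Return (minimal, maximal) sets of a collection by brute force.

    Hypothetical helper for illustration: every pair of sets is compared,
    so the cost is quadratic in the number of distinct sets.
    """
    # Deduplicate so that identical sets do not affect the result.
    sets = {frozenset(s) for s in collection}

    # A set is minimal if no other set in the collection is a proper subset of it.
    minimal = [s for s in sets if not any(t < s for t in sets)]

    # A set is maximal if no other set in the collection is a proper superset of it.
    maximal = [s for s in sets if not any(t > s for t in sets)]

    return minimal, maximal
```

For the collection `{1}`, `{1, 2}`, `{2, 3}`, `{1, 2, 3}`, `{4}`, the set `{4}` is both minimal and maximal because nothing else in the collection is comparable to it.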
Aug, 10
CRINK: Automatic CUDA code generation for affine C programs
Parallel programming has largely evolved as an efficient solution to a large number of compute-intensive applications. Graphics Processing Units (GPUs), traditionally designed to process computer graphics, are now widely applied to process large chunks of data in parallel in many computationally expensive applications. While developing parallel programs to run on parallel computing platforms, such as […]
Aug, 10
Visual, Spatial and Temporal Quality in Video-Based Reconstruction of People: Achieving, Prototyping and Evaluating
Capturing, recreating and representing a high-fidelity virtual representation of the dynamic human form has long been a target for a diverse range of applications including tele-presence, games, film and TV special effects. The complexity of the challenge, to achieve a lifelike, faithful and believable representation, is such that a wide range of techniques and […]
Aug, 10
Accelerating the pre-processing stages of JPEG encoder on a heterogenous system using OpenCL
Color space conversion and downsampling are among the major computationally intensive steps in typical image and video codec standards, and accelerating these steps will improve the performance of these applications significantly. In this paper, we describe the parallel implementation of the color space conversion and downsampling as pre-processing steps for the JPEG encoder in a […]
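As a rough illustration of the two pre-processing steps named in this excerpt, here is a serial sketch. It is not the paper's OpenCL implementation: it assumes the JFIF RGB-to-YCbCr coefficients and 2x2 averaging for chroma downsampling, neither of which the excerpt specifies, and the function names are hypothetical.

```python
from typing import List, Tuple


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one 8-bit RGB pixel to YCbCr using the JFIF coefficients (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(plane: List[List[float]]) -> List[List[float]]:
    """Halve a plane in both directions by averaging each 2x2 block.

    Assumes the plane has even height and width; a real encoder would
    pad or replicate edge pixels first.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]
```

Each pixel and each 2x2 block is processed independently of the others, which is why both steps map naturally onto data-parallel hardware.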
Aug, 7
DenseCut: Densely Connected CRFs for Realtime GrabCut
Figure-ground segmentation from bounding box input, provided either automatically or manually, has been extremely popular in the last decade and influenced various applications. A lot of research has focused on high-quality segmentation, using complex formulations which often lead to slow techniques, and often hamper practical usage. In this paper we demonstrate a very fast segmentation […]
Aug, 7
Towards Distortion-Predictable Embedding of Neural Networks
Current research in Computer Vision has shown that Convolutional Neural Networks (CNN) give state-of-the-art performance in many classification tasks and Computer Vision problems. The embedding of CNN, which is the internal representation produced by the last layer, can indirectly learn topological and relational properties. Moreover, by using a suitable loss function, CNN models can learn […]
Aug, 7
Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi
Parallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, a proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator: […]