14398

Posts

Aug, 7

DenseCut: Densely Connected CRFs for Realtime GrabCut

Figure-ground segmentation from bounding box input, provided either automatically or manually, has been extremely popular in the last decade and influenced various applications. A lot of research has focused on highquality segmentation, using complex formulations which often lead to slow techniques, and often hamper practical usage. In this paper we demonstrate a very fast segmentation […]
Aug, 7

Towards Distortion-Predictable Embedding of Neural Networks

Current research in Computer Vision has shown that Convolutional Neural Networks (CNN) give state-of-the-art performance in many classification tasks and Computer Vision problems. The embedding of CNN, which is the internal representation produced by the last layer, can indirectly learn topological and relational properties. Moreover, by using a suitable loss function, CNN models can learn […]
Aug, 7

Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi

Parallel algorithms are popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with multi-core computational accelerator: […]
Aug, 7

Optimising Reconfigurable Systems for Real-time Applications

This thesis addresses the problem of designing real-time reconfigurable systems. Our first contribution of this thesis is to propose novel data structures and memory architectures for accelerating real-time proximity queries, with potential application to robotic surgery. We optimise performance while maintaining accuracy by several techniques including mixed precision, function transformation and streaming data flow. Significant […]
Aug, 7

Behavioral Spherical Harmonics for Long-Range Agents’ Interaction

We introduce behavioral spherical harmonic (BSH), a novel approach to efficiently and compactly represent the directional-dependent behavior of agent. BSH is based on spherical harmonics to project the directional information of a group of multiple agents to a vector of few coefficients; thus, BSH drastically reduces the complexity of the directional evaluation, as it requires […]
Aug, 7

A Survey Of Techniques for Architecting DRAM Caches

Recent trends of increasing core-count and memory/bandwidth-wall have led to major overhauls in chip architecture. In face of increasing cache capacity demands, researchers have now explored DRAM, which was conventionally considered synonymous to main memory, for designing large last level caches. Efficient integration of DRAM caches in mainstream computing systems, however, also presents several challenges […]
Aug, 6

Removing the Barrier for FPGA-Based OpenCL Data Center Servers

Data centers today are the backbone of the modern economy, from the server rooms that power small to midsize organizations to the enterprise data centers that support U.S. corporations and provide access to cloud computing services. According to the Natural Resources Defense Council, data centers are one of the largest and fastest-growing consumers of electricity […]
Aug, 6

Real-Time Pedestrian Detection With Deep Networks Cascades

We present a new real-time approach to object detection that exploits the efficiency of cascade classifiers with the accuracy of deep neural networks. Deep networks have been shown to excel at classification tasks, and their ability to operate on raw pixel input without the need to design special features is very appealing. However, deep nets […]
Aug, 6

Optimizing an OpenCL Application for Video Watermarking in FPGAs

Video streaming and downloading account for the majority of consumer Internet traffic and are a driving force behind cloud computing. The continually growing demand for this type of content is pushing video-processing applications out of specialized systems and into the data center. This shift in the deployment paradigm allows for the rapid scaling of computation […]
Aug, 5

Exploiting two-level parallelism by aggregating computing resources in task-based applications over accelerator-based machines

Computing platforms are now extremely complex providing an increasing number of CPUs and accelerators. This trend makes balancing computations between these heterogeneous resources performance critical. In this paper we tackle the task granularity problem and we propose aggregating several CPUs in order to execute larger parallel tasks and thus find a better equilibrium between the […]
Aug, 5

Semantic Pose using Deep Networks Trained on Synthetic RGB-D

In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location […]
Aug, 5

Flexible Linear Algebra Development and Scheduling with Cholesky Factorization

Modern high performance computing environments are composed of networks of compute nodes that often contain a variety of heterogeneous compute resources, such as multicore-CPUs, GPUs, and coprocessors. One challenge faced by domain scientists is how to efficiently use all these distributed, heterogeneous resources. In order to use the GPUs effectively, the workload parallelism needs to […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: