Posts

Aug, 7

Optimising Reconfigurable Systems for Real-time Applications

This thesis addresses the problem of designing real-time reconfigurable systems. The first contribution of this thesis is to propose novel data structures and memory architectures for accelerating real-time proximity queries, with potential application to robotic surgery. We optimise performance while maintaining accuracy using several techniques, including mixed precision, function transformation and streaming data flow. Significant […]
Aug, 7

Behavioral Spherical Harmonics for Long-Range Agents’ Interaction

We introduce behavioral spherical harmonics (BSH), a novel approach to efficiently and compactly represent the direction-dependent behavior of agents. BSH uses spherical harmonics to project the directional information of a group of agents onto a vector of a few coefficients; thus, BSH drastically reduces the complexity of the directional evaluation, as it requires […]
Aug, 7

A Survey Of Techniques for Architecting DRAM Caches

Recent trends of increasing core count and the memory/bandwidth wall have led to major overhauls in chip architecture. In the face of increasing cache capacity demands, researchers have now explored DRAM, which was conventionally considered synonymous with main memory, for designing large last-level caches. Efficient integration of DRAM caches in mainstream computing systems, however, also presents several challenges […]
Aug, 6

Removing the Barrier for FPGA-Based OpenCL Data Center Servers

Data centers today are the backbone of the modern economy, from the server rooms that power small to midsize organizations to the enterprise data centers that support U.S. corporations and provide access to cloud computing services. According to the Natural Resources Defense Council, data centers are one of the largest and fastest-growing consumers of electricity […]
Aug, 6

Real-Time Pedestrian Detection With Deep Networks Cascades

We present a new real-time approach to object detection that combines the efficiency of cascade classifiers with the accuracy of deep neural networks. Deep networks have been shown to excel at classification tasks, and their ability to operate on raw pixel input without the need to design special features is very appealing. However, deep nets […]
Aug, 6

Optimizing an OpenCL Application for Video Watermarking in FPGAs

Video streaming and downloading account for the majority of consumer Internet traffic and are a driving force behind cloud computing. The continually growing demand for this type of content is pushing video-processing applications out of specialized systems and into the data center. This shift in the deployment paradigm allows for the rapid scaling of computation […]
Aug, 5

Exploiting two-level parallelism by aggregating computing resources in task-based applications over accelerator-based machines

Computing platforms are now extremely complex, providing an increasing number of CPUs and accelerators. This trend makes balancing computations between these heterogeneous resources performance-critical. In this paper we tackle the task granularity problem and we propose aggregating several CPUs in order to execute larger parallel tasks and thus find a better equilibrium between the […]
Aug, 5

Semantic Pose using Deep Networks Trained on Synthetic RGB-D

In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location […]
Aug, 5

Flexible Linear Algebra Development and Scheduling with Cholesky Factorization

Modern high performance computing environments are composed of networks of compute nodes that often contain a variety of heterogeneous compute resources, such as multicore-CPUs, GPUs, and coprocessors. One challenge faced by domain scientists is how to efficiently use all these distributed, heterogeneous resources. In order to use the GPUs effectively, the workload parallelism needs to […]
Aug, 5

Target Marker: A Visual Marker for Long Distances and Detection in Realtime on Mobile Devices

This paper deals with the design, detection and recognition of a visual marker which can be detected and recognized over a long distance in realtime on a mobile device. The solution is based on a rotationally symmetric marker, a detection process based on estimated circle and ellipse parameters and a Hidden Markov Model based approach […]
Aug, 5

Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures

We study the performance of dense symmetric indefinite factorizations (Bunch-Kaufman and Aasen’s algorithms) on multicore CPUs with a Graphics Processing Unit (GPU). Though such algorithms are needed in many scientific and engineering simulations, obtaining high performance of the factorization on the GPU is difficult because the pivoting, required to ensure the numerical stability of the […]
Aug, 4

4th International Conference on Intelligent Mechatronics and Automation (ICIMA), 2016

Publication: Selected submitted papers will be recommended for publication in the journals below: International Journal of Mechanical Engineering and Robotics Research ISSN: 2278-0149 Abstracting/Indexing: Index Copernicus, ProQuest, UDL, Google Scholar, Open J-Gate, etc. Call for Papers: 1. AI, intelligent control, neuro-control, fuzzy control and their applications 2. Biomedical and rehabilitation engineering, prosthetics and artificial organs; […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
