Nov, 13

OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs

OpenCL support across many heterogeneous nodes (FPGAs, GPUs, CPUs) has increased the programmability of these systems significantly. At the same time, it opens up new challenges and design choices for system designers and application programmers. While OpenCL offers a universal semantic to capture the parallel behavior of applications independent of the target architecture, some customization […]
Nov, 13

Executing Dynamic Data Rate Actor Networks on OpenCL Platforms

Heterogeneous computing platforms consisting of general purpose processors (GPPs) and graphics processing units (GPUs) have become commonplace in personal mobile devices and embedded systems. For years, programming of these platforms was very tedious and simultaneous use of all available GPP and GPU resources required low-level programming to ensure efficient synchronization and data transfer between processors. […]
Nov, 10

PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks

Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high power dissipations. Recently, studies were carried out exploiting FPGA as CNN accelerator because of its reconfigurability and energy efficiency advantage over GPU, especially when […]
Oct, 22

Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL

Modern SoCs are getting increasingly heterogeneous with a combination of multi-core architectures and hardware accelerators to speed up the execution of compute-intensive tasks at considerably lower power consumption. Modern FPGAs, due to their reasonable execution speed and comparatively lower power consumption, are strong competitors to the traditional GPU based accelerators. High-level Synthesis (HLS) simplifies FPGA […]
Oct, 8

A Runtime Controller for OpenCL Applications on Heterogeneous System Architectures

Heterogeneous architectures nowadays are becoming very attractive in the embedded and mobile markets thanks to the possibility to exploit the best computational resource to optimize the performance per Watt figure of merit. Unfortunately, deciding the right resource to use and its operating frequency is a difficult problem that depends on the actual conditions in which […]
Sep, 30

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs

Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of […]
Sep, 22

Tuning Stencil Codes in OpenCL for FPGAs

OpenCL is designed as a parallel programming framework to support heterogeneous computing platforms. The implicit or explicit parallelism in OpenCL kernel code enables efficient FPGA implementation from a high-level programming abstraction. However, FPGA architecture is completely different from GPU architecture, for which OpenCL is widely used. Tuning OpenCL codes to achieve high performance on FPGAs […]
Sep, 22

Efficient dictionary learning implementation on the GPU using OpenCL

The dictionary learning field offers a wide range of algorithms that are able to provide good sparse approximations and well trained dictionaries. These algorithms are very complex and this is reflected in the slow execution of their computationally intensive implementations. This article proposes efficient parallel implementations for the main algorithms in the field that significantly […]
Sep, 8

OpenCL/CUDA algorithms for parallel decoding of any irregular LDPC code using GPU

This article provides a scalable parallel approach of an iterative LDPC decoder, presented in a tutorial-based style. The proposed approach can be implemented in applications supporting massive parallel computing. The proposed mapping is suitable for decoding any irregular LDPC code without the limitation of the maximum node degree. The implementation of the LDPC decoder with […]
Sep, 6

cf4ocl: a C framework for OpenCL

OpenCL is an open standard for parallel programming of heterogeneous compute devices, such as GPUs, CPUs, DSPs or FPGAs. However, the verbosity of its C host API can hinder application development. In this paper we present cf4ocl, a software library for rapid development of OpenCL programs in pure C. It aims to reduce the verbosity […]
Aug, 23

Fast Multidimensional Image Processing with OpenCL

Multidimensional image data, i.e., images with three or more dimensions, are used in many areas of science. Multidimensional image processing is supported in Python and MATLAB. VisionGL is an open source library that provides a set of image processing functions and can help the programmer by automatically generating code. The objective of this work is […]
Aug, 16

Automatic Generation of OpenCL Code for ARM Architectures

The efficient exploitation of the increasing computational capabilities of mobile devices is still a challenge. The heterogeneity of Systems on Chip (SoC) makes necessary a very specific knowledge of their hardware in order to harness their full potential. OpenCL is a well known standard for cross-platform usage of accelerator devices. We follow an annotation-based approach […]
Page 5 of 102« First...34567...102030...Last »

* * *

* * *

HGPU group © 2010-2018 hgpu.org

All rights belong to the respective authors

Contact us: