Tags Results


Mar, 5

Multi-kernel Data Partitioning with Channel on OpenCL-based FPGAs

FPGAs have been widely used to accelerate relational database applications, due to their high throughput and high energy efficiency. However, hardware programmers need to use hardware description languages (HDLs) to program FPGAs. Since HDL is cycle-sensitive and error-prone, deep knowledge of hardware design and hands-on experience are required to guarantee a successful design on FPGA, […]
Feb, 26

Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network

OpenCL FPGA has recently gained great popularity with emerging needs for workload acceleration such as Convolutional Neural Network (CNN), which is the most popular deep learning architecture in the domain of computer vision. While OpenCL enhances the code portability and programmability of FPGA, it comes at the expense of performance. The key challenge is to […]
Jan, 31

CFP: Fifth International Workshop on OpenCL (IWOCL 2017) – EXTENDED

Now in its fifth year, the International Workshop on OpenCL (IWOCL) will be hosted by The University of Toronto, Canada, at the Bahen Centre on May 16th-18th 2017. May 16th sees two activities: an Advanced Hands On OpenCL tutorial and a SYCL workshop, while May 17th and 18th will include a mix of keynotes, […]
Jan, 26

Accelerating Workloads on FPGAs via OpenCL: A Case Study with OpenDwarfs

For decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance comes at the expense of programmability. FPGA developers use hardware design languages (HDLs) to implement […]
Jan, 16

An OpenCL(TM) Deep Learning Accelerator on Arria 10

Convolutional neural nets (CNNs) have become a practical means to perform vision tasks, particularly in the area of image classification. FPGAs are well known to be able to perform convolutions efficiently; however, most recent efforts to run CNNs on FPGAs have shown limited advantages over other devices such as GPUs. Previous approaches on FPGAs have […]
Jan, 10

An FPGA Accelerator for Molecular Dynamics Simulation Using OpenCL

Molecular dynamics (MD) simulations are very important for studying the physical properties of atoms and molecules. However, a huge amount of processing time is required to simulate a few nanoseconds of an actual experiment. Although hardware acceleration using FPGAs provides promising results, huge design time and hardware design skills are required to implement an […]
Dec, 31

Automatic OpenCL Task Adaptation for Heterogeneous Architectures

OpenCL defines a common parallel programming language for all devices, although writing tasks adapted to the devices, managing communication and load-balancing issues are left to the programmer. In this work, we propose a novel automatic compiler and runtime technique to execute single OpenCL kernels on heterogeneous multi-device architectures. The technique proposed is completely transparent to […]
Dec, 31

dOpenCL – Evaluation of an API-Forwarding Implementation

Running parallel workloads on compute resources such as GPUs and accelerators is a rapidly developing trend in the field of high performance computing. At the same time, virtualization is a generally accepted solution to share compute resources with remote users in a secure and isolated way. However, accessing compute resources from inside virtualized environments still poses […]
Dec, 18

The 5th International Workshop on OpenCL (IWOCL), 2017

The International Workshop on OpenCL (IWOCL) is an annual meeting of OpenCL users, researchers, developers and suppliers to share OpenCL best practice, and to promote the evolution and advancement of the OpenCL standard. The meeting is open to anyone who is interested in contributing to, and participating in, the OpenCL community. IWOCL is the premier […]
Dec, 14

Translating OpenMP Device Constructs to OpenCL using Unnecessary Data Transfer Elimination

In this paper, we propose a framework that translates OpenMP 4.0 accelerator directives to OpenCL. By translating an OpenMP program to an OpenCL program, the program can be executed on any hardware platform that supports OpenCL. We also propose a run-time optimization technique that automatically eliminates unnecessary data transfers between the host and the target […]
Nov, 30

Hardware thread reordering to boost OpenCL throughput on FPGAs

Availability of OpenCL for FPGAs has raised new questions about the efficiency of massive thread-level parallelism on FPGAs. The general trend is toward creating deep pipelining and in-order execution of many OpenCL threads across a shared data-path. While this can be a very effective approach for regular kernels, its efficiency significantly diminishes for irregular kernels […]
Nov, 23

Performance Analysis of CUDA and OpenCL By Implementation of Cryptographic Algorithms

This paper presents a performance analysis of CUDA and OpenCL. Three different cryptographic algorithms, namely DES, MD5, and SHA-1, have been selected as the benchmarks for extensive analysis of the performance gaps between the two. Our results show that, in the average scenario, CUDA performs 27% better than OpenCL while in the best case scenario […]

* * *

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors
