Posts

Aug, 5

Mapping a Guided Image Filter on the HARP Reconfigurable Architecture Using OpenCL

Intel recently introduced the Heterogeneous Architecture Research Platform, HARP. In this platform, the Central Processing Unit and a Field-Programmable Gate Array are connected through a high-bandwidth, low-latency interconnect and both share DRAM memory. For this platform, Open Computing Language (OpenCL), a High-Level Synthesis (HLS) language, is made available. By making use of HLS, a faster […]
Jul, 24

Assessing the feasibility of OpenCL CPU implementations for agent-based simulations

Agent-based modeling (ABM) is a bottom-up modeling approach, where each entity of the system being modeled is uniquely represented as a self-determining agent. Large scale emergent behavior in ABMs is population sensitive. As such, it is advisable that the number of agents in a simulation is able to reflect the reality of the system being […]
Jul, 14

A Translation Framework from RVC-CAL Dataflow Programs to OpenCL/SYCL based Implementations

Conventional programming languages nowadays still rely on sequential Models of Computation (MoC). However, the hardware makes more and more use of parallelism to increase the performance, e.g. an increasing number of cores. Nevertheless, programming languages that still rely on sequential MoCs are not well suited to completely utilise this hardware. Dataflow programming languages like RVC-CAL […]
Jul, 7

Exploring Portability and Performance of OpenCL FPGA Kernels on Intel HARPv2

FPGAs offer a heterogeneous compute solution to the continuous desire for increased performance by enabling the creation of application-specific hardware that accelerates computation. While the barrier to entry has historically been steep, advances in High Level Synthesis (HLS) are making FPGAs more accessible. Specifically, the Intel FPGA OpenCL SDK allows software designers to abstract away […]
Jun, 30

Automated Generation of OpenCL Programs Based on Algebra-Algorithmic Approach

The paper proposes the further development of algebra-algorithmic design and synthesis tools towards the development of OpenCL programs. A method for semi-automatic parallelization of cyclic operators is proposed. The particular feature of the approach consists in using high-level algebra-algorithmic program specifications (schemes) and rewriting rules technique. The developed tools provide the construction of parallel algorithm […]
Jun, 9

PPOpenCL: a performance-portable OpenCL compiler with host and kernel thread code fusion

OpenCL offers code portability but no performance portability. Given an OpenCL program X specifically written for one platform P, existing OpenCL compilers, which usually optimize its host and kernel codes individually, often yield poor performance for another platform Q. Instead of obtaining a performance-improved version of X for Q via manual tuning, we aim to […]
May, 23

A Case Study: Exploiting Neural Machine Translation to Translate CUDA to OpenCL

The sequence-to-sequence (seq2seq) model for neural machine translation has significantly improved the accuracy of language translation. There have been new efforts to use this seq2seq model for program language translation or program comparisons. In this work, we present the detailed steps of using a seq2seq model to translate CUDA programs to OpenCL programs, which both […]
May, 19

Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) is burgeoning. While GPU accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts […]
May, 5

Compressed Learning of Deep Neural Networks for OpenCL-Capable Embedded Systems

Deep neural networks (DNNs) have been quite successful in solving many complex learning problems. However, DNNs tend to have a large number of learning parameters, leading to a large memory and computation requirement. In this paper, we propose a model compression framework for efficient training and inference of deep neural networks on embedded systems. Our […]
May, 1

Transparent Acceleration for Heterogeneous Platforms With Compilation to OpenCL

Multi-accelerator platforms combine CPUs and different accelerator architectures within a single compute node. Such systems are capable of processing parallel workloads very efficiently while being more energy efficient than regular systems consisting of CPUs only. However, the architectures of such systems are diverse, forcing developers to port applications to each accelerator using different programming languages, […]
Apr, 14

Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks’ effectiveness in the fields of image recognition and natural language processing stems primarily from the vast […]
Apr, 14

OpenCL vs: Accelerated Finite-Difference Digital Synthesis

Digital audio synthesis has become an important component of modern music production with techniques that can produce realistic simulations of real instruments. Physical modelling sound synthesis is a category of audio synthesis that uses mathematical models to emulate the physical phenomena of acoustic musical instruments including drum membranes, air columns and strings. The synthesis of […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors