14436

Posts

Aug, 18

Optimizing OpenCL Local Work Group Size With Machine Learning

GPU architectures are becoming increasingly important due to their high number of processors. The single input multiple data architecture has proven to work not just for the graphics domain, but also for many other disciplines. This is due to the potential performance that can be achieved by a consumer-level GPU being significantly higher than the […]
Aug, 18

OpenCL-Based Design of an FPGA Accelerator for Phase-Based Correspondence Matching

This paper proposes a Field Programmable Gate Array (FPGA) implementation of the stereo correspondence matching using Phase-Only Correlation (POC). The use of high-accuracy stereo correspondence matching based on POC makes it possible to measure accurate 3D shape of an object using stereo vision. The drawback of the POC-based approach is its high computational cost. To […]
Aug, 11

A Parallel Implementation of the Self Organising Map using OpenCL

The self organising map is a machine learning algorithm used to produce low dimensional representations of high dimensional data. While the process is becoming more and more useful with the rise of big data, it is hindered by the sheer amount of time the algorithm takes to run serially. This project produces a parallel version […]
Aug, 10

Accelerating the pre-processing stages of JPEG encoder on a heterogenous system using OpenCL

Color space conversion and downsampling are among the major computationally intensive steps in typical image and video codec standards, and accelerating these steps will improve the performances of these applications significantly. In this paper, we describe the parallel implementation of the color space conversion and downsampling as pre-processing steps for the JPEG encoder in a […]
Aug, 6

Removing the Barrier for FPGA-Based OpenCL Data Center Servers

Data centers today are the backbone of the modern economy, from the server rooms that power small to midsize organizations to the enterprise data centers that support U.S. corporations and provide access to cloud computing services. According to the Natural Resources Defense Council, data centers are one of the largest and fastest-growing consumers of electricity […]
Aug, 6

Optimizing an OpenCL Application for Video Watermarking in FPGAs

Video streaming and downloading account for the majority of consumer Internet traffic and are a driving force behind cloud computing. The continually growing demand for this type of content is pushing video-processing applications out of specialized systems and into the data center. This shift in the deployment paradigm allows for the rapid scaling of computation […]
Jul, 27

An efficient KNN algorithm implemented on FPGA based heterogeneous computing system using OpenCL

Accurate and efficient data classification techniques are of vital importance to many problems, and are rapidly developing in recent decades. K-Nearest Neighbor algorithm (KNN), as one of the most important algorithms, is widely used in text categorization, predictive analysis, data mining and image recognition, etc. To accelerate the algorithm and to optimize the parallel implementation […]
Jul, 20

Parallel Programming in Actor-Based Applications via OpenCL

GPU and multicore hardware architectures are commonly used in many different application areas to accelerate problem solutions relative to single CPU architectures. The typical approach to accessing these hardware architectures requires embedding logic into the programming language used to construct the application; the two primary forms of embedding are: calls to API routines to access […]
Jul, 20

Improving performance portability for GPU-specific OpenCL kernels on multi-core/many-core CPUs by analysis-based transformations

OpenCL is an open heterogeneous programming framework. Although OpenCL programs are functionally portable, it does not provide performance portability, so code transformation often plays an irreplaceable role. When adapting GPU-specific OpenCL kernels to run on multi-core/many-core CPUs, coarsening the thread granularity is necessary and thus extensively used. However, locality concerns exposed in GPU-specific OpenCL code […]
Jul, 17

Overhauling SC atomics in C11 and OpenCL

Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A […]
Jul, 15

Concurrent Manipulation of Dynamic Data Structures in OpenCL

With the emergence of general purpose GPU (GPGPU) programming, concurrent data processing of large arrays of data has gained a significant boost in performance. However, due to the memory architecture between the host and GPU device and other limitations in the instructions available on GPUs, the implementation of dynamic data structures, like linked list and […]
Jul, 13

A Study of Data Partitioning on OpenCL-based FPGAs

A lot of research efforts have been devoted to accelerating relational database applications on FPGAs, due to their high energy efficiency and high throughput. Most of the existing studies are based on hardware description languages (HDLs). Recently, FPGA vendors have started to develop OpenCL SDKs for much better programmability. In this paper, we investigate the […]

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: