13331

Posts

Jan, 9

International Workshop on OpenCL

The International Workshop on OpenCL (IWOCL – “eye-wok-ul”) is an annual meeting and community of users, researchers, developers and suppliers that share best practice, and promote the evolution and advancement of the OpenCL standard for parallel programming of heterogeneous systems.
Jan, 8

CHO: A Benchmark Suite for OpenCL-based FPGA Accelerators

Programming FPGAs with OpenCL-based high-level synthesis frameworks is gaining attention with a number of commercial and research frameworks announced. However, there are no benchmarks for evaluating these frameworks. To this end, we present CHO benchmark suite an extension of CHStone, a commonly used C-based high-level synthesis benchmark suite, for OpenCl. We characterise CHO at various […]
Jan, 2

Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints

When targeting an OpenCL application to platforms with multiple heterogeneous accelerators, task tuning and mapping have to cope with device-specific constraints. To address this problem, we present an innovative design flow for the customization and performance optimization of OpenCL applications on heterogeneous parallel platforms. It consists of two phases: 1) a tuning phase that optimizes […]
Jan, 2

Performance comparison of Lattice Boltzmann fluid flow simulation using OpenCL and CUDA frameworks

This paper presents performance comparison, of the lid-driven cavity flow simulation, with Lattice Boltzmann method, example, between CUDA and OpenCL parallel programming frameworks. CUDA is parallel programming model developed by NVIDIA for leveraging computing capabilities of their products. OpenCL is an open, royalty free, standard developed by Khronos group for parallel programming of heterogeneous devices […]
Dec, 30

Characterization of OpenCL on a Scalable FPGA Architecture

The recent release of Altera’s SDK for OpenCL has greatly eased the development of FPGA-based systems. Research have shown performance improvements brought by OpenCL using a single FPGA device. However, to meet the objectives of high performance computing, OpenCL needs to be evaluated using multiple FPGAs. This work has proposed a scalable FPGA architecture for […]
Dec, 30

Extending OmpSs to support CUDA and OpenCL in C, C++ and Fortran Applications

CUDA and OpenCL are the most widely used programming models to exploit hardware accelerators. Both programming models provide a C-based programming language to write accelerator kernels and a host API used to glue the host and kernel parts. Although this model is a clear improvement over a low-level and ad-hoc programming model for each hardware […]
Dec, 20

A Parallel Recursive Approach for Solving All Pairs Shortest Path Problem on GPU using OpenCL

All-pairs shortest path problem(APSP) finds a large number of practical applications in real world. We owe to present a highly parallel and recursive solution for solving APSP problem based on Kleene’s algorithm. The proposed parallel approach for APSP is implemented using an open standard framework OpenCL which provides a development environment for utilizing massive parallel […]
Dec, 16

An Optimized GPU Memory Hierarchy Design for an OpenCL Kernel

With the advent of multi and many-core processors, communication has replaced computation as the performance bottleneck. Most current approaches to the problem try to tolerate memory access latency through a high amount of Thread-Level Parallelism. However, not all applications benefit from such techniques and there is a need to address the weakness of the underlying […]
Dec, 9

Portable OpenCL Out-of-Order Execution Framework for Heterogeneous Platforms

Heterogeneous computing has become a viable option in seeking computing performance, to the side of conventional homogeneous multi-/single-processor approaches. The advantage of heterogeneity is the possibility to choose the best device on the platform for different distinct workloads in the application to gain performance and/or to lower power consumption. The drawback of heterogeneity is the […]
Dec, 5

IPMACC: Open Source OpenACC to CUDA/OpenCL Translator

In this paper we introduce IPMACC, a framework for translating OpenACC applications to CUDA or OpenCL. IPMACC is composed of set of translators translating OpenACC for C applications to CUDA or OpenCL. The framework uses the system compiler (e.g. nvcc) for generating final accelerator’s binary. The framework can be used for extending the OpenACC API, […]
Dec, 3

OpenCL Based High-Quality HEVC Motion Estimation on GPU

This paper presents a high quality H.265/HEVC motion estimation implementation with the cooperation of CPU and GPU. The data dependency from MVP (Motion Vector Predictor) restricts the degree of parallelism on GPU. To overcome the constraint from MVP, we propose to use an estimated MVP on GPU and the accurate MVP to refine the motion […]
Nov, 29

A Framework for Composing High-Performance OpenCL from Python Descriptions

Parallel processors have become ubiquitous; most programmers today have access to parallel hardware such as multi-core processors and graphics processors. This has created an implementation gap, where efficiency programmers with knowledge of hardware details can attain high performance by exploiting parallel hardware, while productivity programmers with application-level knowledge may not understand low-level performance trade-offs. Ideally, […]

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: