May, 12

Geometric Algebra Enhanced Precompiler for C++, OpenCL and Mathematica’s OpenCLLink

The focus of this work is a simplified integration of algorithms expressed in Geometric Algebra (GA) into modern high-level programming languages, namely C++, OpenCL and CUDA. High runtime performance for GA computations is achieved through symbolic simplification and code generation by a precompiler that is integrated directly into CMake-based build toolchains. Finally, […]
Apr, 17

Energy-Efficient FPGA Implementation for Binomial Option Pricing Using OpenCL

Energy efficiency of financial computations is a performance criterion that can no longer be dismissed, and is as crucial as raw acceleration and accuracy of the solution. In order to reduce the energy consumption of financial accelerators, FPGAs offer a good compromise with low power consumption and high parallelism. However, designing and prototyping an application […]
Apr, 9

A two-level task scheduler on Multiple DSP system for OpenCL

This paper addresses the lack of OpenCL programming support on multi-DSP systems. With the proposed compiler, runtime, and kernel scheduler, an OpenCL application becomes portable not only across multiple CPUs and GPUs, but also across embedded multi-DSP systems. First, the LLVM compiler is used for source-to-source translation, in which the translated source […]
Mar, 23

Computer Vision Accelerators for Mobile Systems based on OpenCL GPGPU Co-Processing

In this paper, we present an OpenCL-based heterogeneous implementation of a computer vision algorithm (image-inpainting-based object removal) on mobile devices. To take advantage of the computation power of the mobile processor, the algorithm workflow is partitioned between the CPU and the GPU based on profiling results on mobile devices, so […]
Mar, 18

Optimising OpenCL kernels for the ARM Mali-T600 GPUs

OpenCL is a relatively young industry-backed standard API that aims to provide functional portability across systems equipped with computational accelerators such as GPUs: a standard-conforming OpenCL program can be executed on any standard-conforming OpenCL implementation. OpenCL, however, does not address the issue of performance portability: transforming an OpenCL program to achieve higher performance on one […]
Mar, 10

OpenCL-Accelerated Simplified General Perturbations 4 Algorithm

The number of space objects such as satellites, spacecraft, and debris is increasing significantly, and so is the need to track them for security and collision avoidance purposes. In this context, as parallelism is becoming a new paradigm, the need for high-performance propagators remains unmet. To address this, we implemented Simplified General Perturbations No. […]
Feb, 28

Extending a Run-time Resource Management framework to support OpenCL and Heterogeneous Systems

From mobile to High-Performance Computing (HPC) systems, performance and energy efficiency are becoming increasingly challenging requirements. In this regard, heterogeneous systems, composed of a general-purpose processor and one or more hardware accelerators, are emerging as affordable solutions. However, the effective exploitation of such platforms requires specific programming languages, such as OpenCL, and suitable […]
Feb, 27

Face Recognition Using OpenCL

Face recognition is the biometric identification of a human face by matching its image against a library of known faces. The algorithm used for this is the Eigenfaces algorithm, and the proposed implementation platform is OpenCL. OpenCL (Open Computing Language) is an open standard for general-purpose parallel programming […]
Feb, 23

Real Time Face Detection on GPU Using OpenCL

This paper presents a novel approach for real-time face detection using heterogeneous computing. The algorithm uses local binary patterns (LBP) as the feature vector for face detection. OpenCL is used to accelerate the code on the GPU [1]. Illuminance invariance is achieved using gamma correction and Difference of Gaussians (DoG) to make the algorithm robust against varying lighting […]
Feb, 15

Accelerator Aware MPI Micro-benchmarking using CUDA, OpenACC and OpenCL

Recently, MPI implementations have been extended to support accelerator devices, namely Intel Many Integrated Core (MIC) and NVIDIA GPUs. This has been accomplished by changes at different levels of the software stack and in the MPI implementations. In order to evaluate the performance and scalability of accelerator-aware MPI libraries, we developed portable micro-benchmarks to identify factors that influence […]
Jan, 23

On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms

The proliferation of heterogeneous computing systems presents the parallel computing community with the challenge of porting legacy and emerging applications to multiple processors with diverse programming abstractions. OpenCL is a vendor-agnostic and industry-supported programming model that offers code portability on heterogeneous platforms, allowing applications to be developed once and deployed "anywhere". In this paper, we […]
Jan, 23

clpeak – peak performance of your opencl device

clpeak is a benchmarking tool intended to help developers fine-tune OpenCL kernels for a particular device or class of devices. It calculates bandwidth and compute performance for different vector widths of a data type, say float and float4. Traditionally, it is recommended to write scalar code and expect the OpenCL compiler to auto-vectorize it. But most of the time the compiler […]

HGPU group © 2010-2018 hgpu.org

All rights belong to the respective authors
