
Posts

Oct, 7

Vectorized OpenCL implementation of numerical integration for higher order finite elements

In our work we analyze computational aspects of the problem of numerical integration in finite element calculations and consider an OpenCL implementation of related algorithms for processors with wide vector registers. As a platform for testing the implementation we choose the PowerXCell processor, being an example of the Cell Broadband Engine (CellBE) architecture. Although the […]
Sep, 28

From OpenCL to Gates: the FFT

The FFT plays a fundamental role in OFDM programmable digital baseband communication systems under the SDR context. The core nature of this algorithm marks it as a primary target for acceleration. Since long frame lengths of the FFT are desirable in order to achieve higher bitrates, the computational complexity becomes even more significant. In this […]
Sep, 27

OpenCL Parallel Programming Development Cookbook

Welcome to the OpenCL Parallel Programming Development Cookbook! Whew, that was more than a mouthful. This book was written by a developer, that’s me, and for a developer, hopefully that’s you. This book will look familiar to some and distinct to others. It is a result of my experience with OpenCL, but more importantly in […]
Sep, 25

OpenCL Task Partitioning in the Presence of GPU Contention

Heterogeneous multi- and many-core systems are increasingly prevalent in the desktop and mobile domains. On these systems it is common for programs to compete with co-running programs for resources. While multi-task scheduling for CPUs is a well-studied area, how to partition and map computing tasks onto the heterogeneous system in the presence of GPU contention […]
Sep, 23

Performance of OpenCL

OpenCL is a relatively new standard that supports computation on a variety of parallel architectures. The author was unable to find reliable information about the performance of OpenCL programs on CPUs in comparison to traditional parallel processing standards like OpenMP. This paper describes the results of an experiment that tries to answer the following question: "Which […]
Sep, 22

Accelerating Habanero-Java Programs with OpenCL Generation

The initial wave of programming models for general-purpose computing on GPUs, led by CUDA and OpenCL, has provided experts with low-level constructs to obtain significant performance and energy improvements on GPUs. However, these programming models are characterized by a challenging learning curve for non-experts due to their complex and low-level APIs. Looking to the future, […]
Sep, 20

Workload Analysis and Efficient OpenCL-based Implementation of SIFT Algorithm on a Smartphone

Feature detection and extraction are essential in computer vision applications such as image matching and object recognition. The Scale-Invariant Feature Transform (SIFT) algorithm is one of the most robust approaches to detect and extract distinctive invariant features from images. However, high computational complexity makes it difficult to apply the SIFT algorithm to mobile applications. Recent […]
Sep, 9

Implementation of PDE models of cardiac dynamics on GPUs using OpenCL

Graphical processing units (GPUs) promise to revolutionize scientific computing in the near future. Already, they allow almost real-time integration of simplified numerical models of cardiac tissue dynamics. However, the integration methods that have been developed so far are typically of low order and use single-precision arithmetic. In this work, we describe numerical implementation of […]
Sep, 7

Comparison and Analysis of GPU Energy Efficiency For CUDA and OpenCL

The use of GPUs for processing large sets of parallelizable data has increased sharply in recent years. As the concept of GPU computing is still relatively young, parameters other than computation time, such as energy efficiency, are being overlooked. Two parallel computing platforms, CUDA and OpenCL, provide developers with an interface that they can use […]
Aug, 26

OpenCL programming using Python syntax

We describe ocl, a Python library built on top of PyOpenCL and NumPy. It allows programming GPU devices using Python. Python functions marked up with the provided decorator are converted into C99/OpenCL and compiled using the JIT at runtime. This approach lowers the barrier to entry to programming GPU devices since it requires […]
Aug, 26

SystemC simulation on GP-GPUs: CUDA vs. OpenCL

SystemC is a widespread language for developing SoC designs. Unfortunately, most SystemC simulators are based on a strictly sequential scheduler that heavily limits their performance, impacting verification schedules and time-to-market of new designs. Parallelizing SystemC simulation entails a complete re-design of the simulator kernel for the specific target parallel architectures. This paper proposes an automatic […]
Aug, 24

SOCL: An OpenCL Implementation with Automatic Multi-Device Adaptation Support

To fully tap into the potential of today’s heterogeneous machines, offloading parts of an application on accelerators is not sufficient. The real challenge is to build systems where the application would permanently spread across the entire machine, that is, where parallel tasks would be dynamically scheduled over the full set of available processing units. In […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
