Posts

Sep, 24

Analyzing Use of OpenCL on the Cell Broadband Engine and a Proposal for OpenCL Extensions

Current processor architectures are diverse and heterogeneous. Examples include multicore chips, GPUs and the Cell Broadband Engine (CBE). The recent Open Compute Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on […]
Sep, 24

Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs

As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portability of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-source tool that […]
Sep, 24

OpenCL: a viable solution for high-performance medical image reconstruction?

Reconstruction of 3-D volumetric data from C-arm CT projections is a computationally demanding task. For interventional image reconstruction, hardware optimization is mandatory. Manufacturers of medical equipment use a variety of high-performance computing (HPC) platforms, like FPGAs, graphics cards, or multi-core CPUs. A problem of this diversity is that many different frameworks and (vendor-specific) programming languages […]
Sep, 24

Single Scattering of Aspherical Particles in DDA Calculations on GPUs Using OpenCL

The global distribution and climatology of ice clouds are among the main uncertainties in climate modelling and prediction. In order to retrieve ice cloud properties from remote sensing measurements, the scattering properties of all cloud ice particle types must be known. The Discrete Dipole Approximation (DDA) simulates scattering of radiation by arbitrarily shaped particles and […]
Sep, 24

Functional Signal Processing with Pure and Faust Using the LLVM Toolkit

Pure and Faust are two functional programming languages useful for programming computer music and other multimedia applications. Faust is a domain-specific language specifically designed for synchronous signal processing, while Pure is a general-purpose language which aims to facilitate symbolic processing of complicated data structures in a variety of application areas. Pure is based on the […]
Sep, 24

Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators

OpenCL is an emerging open framework for parallel programming in heterogeneous systems. OpenCL accelerators need to schedule the execution of submitted jobs with no (or only very imprecise) estimates of execution times, while respecting the dependencies among them, which are given in the form of a directed acyclic graph. This problem is known as stochastic taskgraph scheduling, […]
Sep, 24

CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-core Architectures

The use of graphics processing units (GPUs) in high-performance parallel computing continues to become more prevalent, often as part of a heterogeneous system. For years, CUDA has been the de facto programming environment for nearly all general-purpose GPU (GPGPU) applications. In spite of this, the framework is available only on NVIDIA GPUs, traditionally requiring reimplementation […]
Sep, 24

An MDE Approach for Automatic Code Generation from MARTE to OpenCL

Advanced engineering and scientific communities have used parallel programming to solve their large scale complex problems. Achieving high performance is the main advantage for this choice. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we […]
Sep, 23

Integrated Framework for Heterogeneous Embedded Platforms Using OpenCL

The technology community is rapidly moving away from the age of computers and laptops, and is entering the emerging era of hand-held devices. With the rapid development of smart phones, tablets, and pads, there has been widespread adoption of Graphic Processing Units (GPUs) in the embedded space. The hand-held market is now seeing an ever […]
Sep, 23

Embedding OpenCL in C++ for Expressive GPU Programming

We present a high performance GPU programming language, based on OpenCL, that is embedded in C++. Our embedding provides shared data structures, typesafe kernel invocation, and the ability to more naturally interleave CPU and GPU functions, similar to CUDA but with the portability of OpenCL. For expressivity, our language provides an abstraction that releases control […]
Sep, 23

Improving SIMT Efficiency of Global Rendering Algorithms with Architectural Support for Dynamic Micro-Kernels

Wide Single Instruction, Multiple Thread (SIMT) architectures often require a static allocation of thread groups that are executed in lockstep throughout the entire application kernel. Individual thread branching is supported by executing all control flow paths for threads in a thread group and only committing the results of threads on the current control path. While convergence […]
Sep, 23

Accelerating reaction-diffusion simulations with general-purpose graphics processing units

SUMMARY: We present a massively parallel stochastic simulation algorithm (SSA) for reaction-diffusion systems implemented on Graphics Processing Units (GPUs). These are designated chips optimized to process a high number of floating point operations in parallel, rendering them well-suited for a range of scientific high-performance computations. Newer GPU generations provide a high-level programming interface which turns […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
