8176

Posts

Sep, 8

Lost in Translation: Challenges in Automating CUDA-to-OpenCL Translation

The use of accelerators in high-performance computing is increasing. The most commonly used accelerator is the graphics processing unit (GPU) because of its low cost and massively parallel performance. The two most common programming environments for GPU accelerators are CUDA and OpenCL. While CUDA runs natively only on NVIDIA GPUs, OpenCL is an open standard […]
Aug, 26

Is OpenCL a suitable platform for algorithm development in health care systems?

This thesis reviews if OpenCL is a suitable and cost effective platform for algorithm development in health care systems. Aspects such as maintainability, performance, portability and integration with high-level languages (in this case Python) are analyzed. The review is done by implementing one part of a dose calculation algorithm that is complex enough to provide […]
Aug, 18

An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture

In this paper, we present a comparative performance analysis of different parallel sorting algorithms: Bitonic sort and Parallel Radix Sort. In order to study the interaction between the algorithms and architecture, we implemented both the algorithms in OpenCL and compared its performance with Quick Sort algorithm, the fastest algorithm. In our simulation, we have used […]
Aug, 9

OpenCL-based Algorithm for Heat Load Modelling of District Heating System

This paper presents a parallel approach to estimate the parameters in the heat loading of a district heating system by use of the traditional particle swarm optimisation (TPSO) on the Graphic Processing Unit (GPU) using OpenCL. The running time of the algorithm is greatly reduced compared to running on CPU. The heat load is approximated […]
Aug, 6

Comparison of OpenMP & OpenCL Parallel Processing Technologies

This paper presents a comparison of OpenMP and OpenCL based on the parallel implementation of algorithms from various fields of computer applications. The focus of our study is on the performance of benchmark comparing OpenMP and OpenCL. We observed that OpenCL programming model is a good option for mapping threads on different processing cores. Balancing […]
Aug, 1

High-Level Manipulation of OpenCL-Based Subvectors and Submatrices

High-level C++ proxies for the convenient manipulation of subvectors and submatrices on OpenCL-enabled devices are introduced. It is demonstrated that the programming convenience of standard host-based code can be retained using native C++ language features only, even if massively parallel computing architectures such as graphics processing units are employed. The required modifications of the underlying […]
Jul, 26

SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters

In this paper, we propose SnuCL, an OpenCL framework for heterogeneous CPU/GPU clusters. We show that the original OpenCL semantics naturally fits to the heterogeneous cluster programming environment, and the framework achieves high performance and ease of programming. The target cluster architecture consists of a designated, single host node and many compute nodes. They are […]
Jul, 16

Distributed OpenCL: a platform for distributed, heterogeneous computing for domain scientists

It is possible to purchase, for as little as $10,000, a cluster of computers with the capability to rival the supercomputers of only a few years ago. Now, users that have little to no experience developing distributed applications or managing a cluster are in a position to do so. To allow domain scientists to effectively […]
Jul, 15

Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU

This paper presents results of an implementation of code generator for fast general matrix multiply (GEMM) kernels. When a set of parameters is given, the code generator produces the corresponding GEMM kernel written in OpenCL. The produced kernels are optimized for high-performance implementation on GPUs from AMD. Access latencies to GPU global memory is the […]
Jul, 15

Distributed OpenCL Distributing OpenCL Platform on Network Scale

This paper presents a framework that extends OpenCL by distributing computing process to many computing resources connected via network and enables the computing resources to run in parallel. Using JSON RPC (Remote Procedure Call technique relying on JavaScript Object Notation) in communication layer, Distributed OpenCL framework provides platform and operating system independency. Using this framework, […]
Jul, 11

Invitation to a Standard Programming Interface for Massively Parallel Computing Environment: OpenCL

Multicore/manycore architecture accelerates demand for a new programming environment to utilize the massive processors integrated in an LSI. GPU (Graphics Processing Unit) is one of the typical hardware environments. The programming environments on GPU are traditionally vendor-/hardware-specific, where complicate the management of uniform programs that access computing resources of the massively parallel platform. The recently […]
Jul, 11

Geometric Algebra enhanced Precompiler for C++ and OpenCL

The focus of the this work is a simplified integration of algorithms expressed in Geometric Algebra (GA) in modern high level computer languages, namely C++, OpenCL and CUDA. A high runtime performance in terms of GA is achieved using symbolic simplification and code generation by a Precompiler that is directly integrated into CMake-based build toolchains.

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: