high performance computing on graphics processing units: hgpu.org

Posts

Mar, 31

A preliminary study of OpenCL for accelerating CT reconstruction and image recognition

This study implements and tests NVIDIA’s and AMD’s early OpenCL implementation releases on the back-projection step, the most time-consuming part of an FBP reconstruction, and on Haralick’s texture feature extraction algorithm for CT images.

OpenCL

Mar, 31

Application of the OpenCL API for Implementation of the NIPALS Algorithm for Principal Component Analysis of Large Data Sets

An implementation of the nonlinear iterative partial least squares algorithm (NIPALS) was used as a test case for use of OpenCL for computation on a general purpose graphics processing unit (GPGPU) cluster using MPI. Timing results are shown along with results of a model of time required per iteration for defined problem sizes. Various steps […]

OpenCL

Mar, 31

The AES Implantation Based on OpenCL for Multi/many Core Architecture

In this article we present a study on an implementation, named clAES, of the symmetric key cryptography algorithm Advanced Encryption Standard (AES) using the Open Computing Language (OpenCL) emerging standard. We will show a comparison of the results obtained benchmarking clAES on various multi/many core architectures. We will also introduce the basic concepts of AES […]

OpenCL

Mar, 31

Implementation of Smith-Waterman Algorithm in OpenCL for GPUs

In this paper we present an implementation of the Smith-Waterman algorithm. The implementation is done in OpenCL and targets high-end GPUs. This implementation is capable of computing similarity indexes between reference and query sequences. The implementation is designed for the sequence alignment paths calculation. In addition, it is capable of handling very long reference sequences […]

CUDA • OpenCL

Mar, 31

OpenCL embedded profile prototype in mobile device

The programmable graphics processing unit (GPU) has over the years become an integral part of today’s computing systems. GPU use cases have gradually been extended from graphics towards a wide range of applications. Since the programmable GPU is now making its way to mobile devices, it is interesting to study these new use cases there as well. To […]

OpenCL

Mar, 31

Evaluation Framework for GPU Performance Based on OpenCL Standard

There are many projects focused on performance measurements of GPUs but there is no unifying test framework that could be used for evaluating generic floating point intensive applications. This work describes the testing suite for evaluating GPUs that measures raw performance and numerical precision of a subset of OpenCL operations, and analyzes results obtained from […]

OpenCL

Mar, 31

CUDACL: A tool for CUDA and OpenCL programmers

Graphical Processing Unit (GPU) programming languages are used extensively for general-purpose computations. However, GPU programming languages are at a level of abstraction suitable only for use by expert parallel programmers. This paper presents a new approach through which ‘C’ or Java programmers can access these languages without having to focus on the technical or language-specific […]

CUDA • OpenCL

Mar, 31

Exploiting SIMD extensions for linear image processing with OpenCL

The OpenCL framework supports SIMD capabilities available in general purpose processors, which have been used to prospect performance improvements in several applications. In this paper we propose efficient algorithms for linear image processing by exploring the provided SIMD extensions on AMD and Intel processors. The efficiency of the SIMD based computation inferred by the OpenCL […]

OpenCL

Mar, 31

OpenCL – An effective programming model for data parallel computations at the Cell Broadband Engine

Current processor architectures are diverse and heterogeneous. Examples include multicore chips, CPUs and the Cell Broadband Engine (CBE). The recent Open Computing Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on […]

CUDA • OpenCL

Mar, 30

OpenCL and parallel primitives for digital TV applications

Open Computing Language (OpenCL), which was created to support parallel programming of heterogeneous multicore-processor systems, has very large potential for high-performance computing and consumer electronics, since it provides application programming interfaces (APIs) that help make code portable across multiple devices. OpenCL is still under development, and it is not clear whether […]

OpenCL

Mar, 30

Accelerated cone beam CT reconstruction based on OpenCL

Open Computing Language (OpenCL) is a fundamental technology for cross-platform parallel programming. The emergence of OpenCL provides portable and efficient access to the power of modern processors. In this paper, this new technology is applied to accelerate the reconstruction of cone beam computed tomography (CBCT) on a Graphics Processing Unit (GPU). An OpenCL-based implementation of […]

OpenCL

Mar, 30

OpenCL: Make Ubiquitous Supercomputing Possible

Due to the demanding requirements of 3D games and applications, the graphics processing unit (GPU) or general-purpose graphics processing unit (GPGPU) has become a required component in modern computer systems. While these devices enable high parallelism with a huge number of processing elements, the utilization of their capabilities in general scientific applications is still low due to […]

OpenCL

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
