high performance computing on graphics processing units: hgpu.org

Posts

Oct, 25

Distributed GPU Password Cracking Research Project

This research project explores the possibilities of integrating GPU processing power with a network cluster in order to achieve better password-cracking performance. First, a literature study is performed on passwords in general, GPGPU computing, and distributed computing by means of middleware. With these building blocks, combined with current […]

OpenCL

Oct, 25

Extending Scala with General Purpose GPU Programming

In this report we document an attempt to make it easier to use powerful GPUs by extending the Scala compiler to automatically offload work to the GPU. We benchmark similar code and find that it provides a speedup of between 2x and 3x compared to the CPU alone. Finally, we discuss ways to improve the extension to offload more […]

OpenCL

Oct, 25

Development of a Flow Solver with Complex Kinetics on the Graphic Processing Units

This paper reports on the implementation of a numerical solver on Graphics Processing Units (GPUs) to model reactive gas mixtures with detailed chemical kinetics. The solver incorporates high-order finite volume methods for solving the fluid dynamical equations coupled with stiff source terms. The chemical kinetics are solved implicitly via an operator-splitting method. We […]

CUDA
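As a rough illustration of the operator-splitting idea mentioned in the abstract above, the following minimal sketch applies it to a scalar toy problem rather than to a flow solver. Everything in it is an assumption made for illustration: the model equation du/dt = source - k*u, the function names `split_step` and `integrate`, and the parameter values are hypothetical and are not taken from the paper.

```python
def split_step(u, dt, source, k):
    """Advance du/dt = source - k*u by one sequential (Lie) splitting step."""
    # Sub-step 1: non-stiff part, explicit Euler (a stand-in for the flow update).
    u_star = u + dt * source
    # Sub-step 2: stiff decay du/dt = -k*u, backward (implicit) Euler:
    # u_new = u_star - dt*k*u_new  =>  u_new = u_star / (1 + k*dt)
    return u_star / (1.0 + k * dt)


def integrate(u0, dt, steps, source=2.0, k=1.0e4):
    """Apply split_step repeatedly, starting from u0 (toy parameters, assumed)."""
    u = u0
    for _ in range(steps):
        u = split_step(u, dt, source, k)
    return u


if __name__ == "__main__":
    # k*dt = 100 here, far beyond the explicit Euler stability limit k*dt < 2
    # for the decay term, yet the implicit sub-step stays stable.
    print(integrate(u0=1.0, dt=0.01, steps=200))
```

In this toy setting the split scheme settles at source/k, the steady state of the model equation, even though the step size is much larger than an explicit treatment of the stiff term would tolerate.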

Oct, 24

Parallel Implementation of Devanagari Text Line and Word Segmentation Approach on GPU

Fast and accurate algorithms are necessary for Optical Character Recognition (OCR) systems to perform operations on document images such as pre-processing, segmentation, feature extraction, training and testing of classifiers and post processing. Text line and word segmentation are two important steps in any OCR system. Wrong segmentation may affect the accuracy rate of OCR systems. […]

CUDA

Oct, 24

Design and implementation of a high-performance stream-based computing platform on multigenerational GPUs

During this decade, the demand for high-performance computation has been increasing steadily, for example in the field of humanities [1]. Scientists and investigators need high-speed, high-performance environments for their research, which must perform millions of floating-point operations per second [2]. One way to achieve this goal is […]

OpenCL

Oct, 24

Breaking DVB-CSA

CSA (Common Scrambling Algorithm) is used to encrypt digital audio and video streams in DVB (Digital Video Broadcasting). It is commonly used to limit access to this content to paying customers. In this paper, we present a practical attack (a time-memory trade-off) against CSA that can be used to recover the cipher's key and decrypt […]

CUDA • OpenCL

Oct, 24

ASAMgpu V1.0-a moist fully compressible atmospheric model using graphics processing units (GPUs)

In this work, the three-dimensional compressible moist atmospheric model ASAMgpu is presented. The calculations are done using graphics processing units (GPUs). To ensure platform independence, OpenGL and GLSL are used, so the model runs on any hardware supporting fragment shaders. The MPICH2 library enables interprocess communication, allowing the use of more than one […]

OpenGL

Oct, 24

A Hybrid Software Framework for the GPU Acceleration of Multi-Threaded Monte Carlo Applications

Monte Carlo simulations are used extensively in a wide range of application areas. Although their basic framework is simple, they can be extremely computationally intensive. In this paper we present a software framework that partitions a generic Monte Carlo simulation into two asynchronous parts: (a) a threaded, GPU-accelerated pseudo-random number generator (or producer), and (b) a […]

OpenCL
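The producer side of the partitioning described in the abstract above can be sketched with two CPU threads and a bounded queue. This is only a minimal sketch under stated assumptions: the abstract is truncated before it describes part (b), so the consumer shown here (a pi estimator) is a hypothetical choice, the producer uses Python's `random` module instead of a GPU generator, and the names `producer`, `consumer`, and `estimate_pi` are illustrative and do not come from the paper.

```python
import queue
import random
import threading


def producer(out_q, n_batches, batch_size, seed):
    """Push batches of uniform random numbers (CPU stand-in for a GPU generator)."""
    rng = random.Random(seed)
    for _ in range(n_batches):
        out_q.put([rng.random() for _ in range(batch_size)])
    out_q.put(None)  # sentinel: no more batches will follow


def consumer(in_q, result):
    """Assumed consumer: estimate pi from points falling in the unit quarter circle."""
    inside = 0
    total = 0
    while True:
        batch = in_q.get()
        if batch is None:
            break
        # Pair consecutive numbers of the batch as (x, y) coordinates.
        for x, y in zip(batch[0::2], batch[1::2]):
            if x * x + y * y <= 1.0:
                inside += 1
            total += 1
    result["pi"] = 4.0 * inside / total


def estimate_pi(n_batches=50, batch_size=2000, seed=0):
    """Run producer and consumer asynchronously, coupled by a bounded queue."""
    q = queue.Queue(maxsize=8)  # bounded buffer between the two parts
    result = {}
    threads = [
        threading.Thread(target=producer, args=(q, n_batches, batch_size, seed)),
        threading.Thread(target=consumer, args=(q, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["pi"]


if __name__ == "__main__":
    print(estimate_pi())
```

The bounded queue lets the generator run ahead of the simulation by a few batches without unbounded memory growth, which is one plausible way to keep the two parts asynchronous.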

Oct, 24

APTCC: Auto Parallelizing Translator From C To CUDA

This paper proposes APTCC, Auto Parallelizing Translator from C to CUDA, a translator from C code to CUDA C that needs no directives. CUDA C is a programming language for general-purpose GPU (GPGPU) computing. It requires a special programming style that differs from C. Although there are several pieces of research to reduce this difficulty, […]

CUDA

Oct, 24

Using OpenCL for Implementing Simple Parallel Graph Algorithms

For the typical graph algorithms encountered most frequently in practice (such as those introduced in typical entry-level algorithms courses: graph searching/traversals, shortest paths problems, strongly connected components and minimum spanning trees) we want to consider practical non-sequential platforms such as the emergence of cost effective General-Purpose computation on Graphics Processing Units (GPGPU). In this paper […]

CUDA • OpenCL

Oct, 24

Democratizing General Purpose GPU Programming through OpenCL and Scala

General Purpose GPU programming has the potential to increase the speed with which many computations can be done. We show a number of examples of such improvements, investigate how one can benchmark different implementations of GPGPU programming paradigms and how one can measure the productivity of programmers. Finally we implement and describe a simple toolkit […]

OpenCL

Oct, 24

The MOSIX Virtual OpenCL (VCL) Cluster Platform

Heterogeneous computing systems can dramatically increase the performance of parallel applications on clusters. Currently, applications that utilize GPU and APU devices run their device-specific code only on devices of the same computer where the application runs. This paper presents the MOSIX Virtual OpenCL (VCL) cluster platform that can run unmodified OpenCL applications transparently on clusters […]

OpenCL

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container

Most viewed papers (last 30 days)