high performance computing on graphics processing units: hgpu.org

Posts

Jul, 17

Focus measurement on programmable graphics hardware for all in-focus rendering from light fields

This paper deals with a method for interactive rendering of photorealistic images, which is a fundamental technology in the field of virtual reality. Since the latest graphics processing units (GPUs) are programmable, they are expected to be useful for various applications including numerical computation and image processing. This paper proposes a method for focus measurement […]

Jul, 17

Speedup of Fuzzy Clustering Through Stream Processing on Graphics Processing Units

As the number of data points, feature dimensionality, and number of centers for clustering algorithms increase, computational tractability becomes a problem. The fuzzy c-means has a large degree of inherent algorithmic parallelism that modern CPU architectures do not exploit. Many pattern recognition algorithms can be sped up on a graphics processing unit (GPU) as long […]

Jul, 17

Feature-based speed limit sign detection using a graphics processing unit

In this study we test the idea of using a graphics processing unit (GPU) as an embedded co-processor for real-time detection of European Union (EU) speed-limit signs. The input to the system is a set of grayscale videos recorded from a forward-facing camera mounted in a vehicle. We introduce a new technique for implementing the […]

CUDA

OpenGL

Jul, 17

Mars: Accelerating MapReduce with Graphics Processors

We design and implement Mars, a MapReduce runtime system accelerated with graphics processing units (GPUs). MapReduce is a simple and flexible parallel programming paradigm originally proposed by Google, for the ease of large-scale data processing on thousands of CPUs. Compared with CPUs, GPUs have an order of magnitude higher computation power and memory bandwidth. However, […]

CUDA

Jul, 17

Towards a unified framework for rapid 3D computed tomography on commodity GPUs

The task of reconstructing an object from its projections via tomographic methods is a time-consuming process due to the vast complexity of the data. For this reason, manufacturers of equipment for computed tomography (CT), both medical and industrial, rely mostly on special ASICs to obtain the fast reconstruction times required in clinical, industrial, and security […]

Jul, 17

Parallel implementation of endmember extraction algorithms using NVidia graphical processing units

Spectral mixture analysis is an important task for remotely sensed hyperspectral data interpretation. In spectral unmixing, both the determination of spectrally pure signatures (endmembers) and the unmixing process that interprets mixed pixels as combinations of endmembers are computationally expensive procedures. An exciting recent development in the field of commodity computing is the emergence of programmable […]

CUDA

Jul, 17

Quality comparison and acceleration for digital hologram generation method based on segmentation

A holographic fringe pattern generation method is based on Fraunhofer diffraction and subsequent segmentation and approximation of the fringe pattern. Several modifications of the original algorithm have already been proposed to improve the quality of reconstructions. We compare the quality of the reconstructed images from different versions of this algorithm by taking the reconstructions from […]

CUDA

Jul, 16

Full-Parallax Hologram Synthesis of Triangular Meshes using a Graphical Processing Unit

The application of the GPU to computer-generated holography has been a topic of research for some time. While the majority of authors aim at performance, we aim at visual aspects. We present a new approach that is capable of synthesising a hologram of a scene described by triangles using the GPU, and it is capable […]

Jul, 16

High-performance bankruptcy prediction model using Graphics Processing Units

In recent years the potential and programmability of Graphics Processing Units (GPUs) have raised noteworthy interest in the research community for applications that demand high computational power. In particular, in financial applications containing thousands of high-dimensional samples, machine learning techniques such as neural networks are often used. One of their main limitations is that […]

Jul, 16

Uniform partitioning of Monte Carlo radiosity on GPUs

The radiosity method produces high-quality images by evaluating the global illumination of the scene. The computational complexity and memory requirements of the algorithm are the main problems when a large scene has to be processed. To reduce the memory requirements, the Monte Carlo radiosity method is often used. In […]

CUDA

Jul, 14

ForOpenCL: Transformations Exploiting Array Syntax in Fortran for Accelerator Programming

Emerging GPU architectures for high performance computing are well suited to a data-parallel programming model. This paper presents preliminary work examining a programming methodology that provides Fortran programmers with access to these emerging systems. We use array constructs in Fortran to show how this infrequently exploited, standardized language feature is easily transformed to lower-level accelerator […]

OpenCL

Jul, 14

Real-time, fast radio transient searches with GPU de-dispersion

The identification, and subsequent discovery, of fast radio transients through blind-search surveys requires a large amount of processing power, in worst cases scaling as $\mathcal{O}(N^3)$. For this reason, survey data are generally processed offline, using high-performance computing architectures or hardware-based designs. In recent years, graphics processing units have been extensively used for numerical analysis and […]

CUDA

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

Most viewed papers (last 30 days)