high performance computing on graphics processing units: hgpu.org

Posts

Oct, 20

Hierarchical Exploration of Volumes Using Multilevel Segmentation of the Intensity-Gradient Histograms

Visual exploration of volumetric datasets to discover the embedded features and spatial structures is a challenging and tedious task. In this paper, we present a semi-automatic approach to this problem that works by visually segmenting the intensity-gradient 2D histogram of a volumetric dataset into an exploration hierarchy. Our approach mimics user exploration behavior by analyzing […]

CUDA

Oct, 20

Empirical analysis of a parallel data mining algorithm on a graphic processor

In this thesis, we empirically analyze a different approach to the SPAM algorithm (Sequential PAttern Mining using A Bitmap Representation), developed by J. Ayres, J. Gehrke, T. Yiu and J. Flannick of Cornell University, that exploits GPUs. SPAM is a novel approach for FSM (Frequent Sequence Mining) where the algorithm is not looking […]

CUDA

Oct, 20

Research on DSP-GPU Heterogeneous Computing System

In this paper, a DSP-GPU heterogeneous computing system is studied and its system architecture is designed. The task scheduling model is analyzed, and the discrete particle swarm optimization algorithm is used for DSP-GPU heterogeneous computing. A communication framework between the DSP and the GPU is designed for the heterogeneous computing.

OpenCL

Oct, 19

A short guide to CUDA C: For physicists with multi-core graphics cards

The purpose of this guide is to give a quick introduction to CUDA C, NVIDIA’s extension of the C programming language that allows running code in parallel on graphics cards. Moreover, the technique is applied to problems in computational physics, namely the generation of random numbers and simulations of the q-state Potts model. If you seek […]

CUDA

Oct, 19

A Modular Framework for Deformation and Fracture using GPU Shaders

Advances in the graphical realism of modern video games have been achieved mainly through the development of the GPU (Graphics Processing Unit), providing a dedicated graphics co-processor and framebuffer. The most recent GPUs are extremely capable and so flexible that it is now possible to implement a wide range of algorithms on graphics hardware that […]

Oct, 19

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation

The main goal of this paper is to discuss new and faster methods for testing the strength of security used in today’s wireless networks. This paper discusses probabilistic password generation methods for testing the real security of networks protected by the WPA/WPA2 PSK (pre-shared key) standard (also known as Personal mode). The main advantage of using those […]

CUDA

Oct, 19

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

The main advantage of a distributed computing system over a standalone computer is the ability to share the workload between cores, processors and computers. In our paper, we present a hybrid cluster system, a novel computing architecture with multi-core CPUs working together with many-core GPUs. It integrates two types of CPU, i.e., Intel and AMD […]

OpenCL

Oct, 19

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Computational inference of causal relationships underlying complex networks, such as gene-regulatory pathways, is NP-complete due to its combinatorial nature when permuting all possible interactions. Markov chain Monte Carlo (MCMC) has been introduced to sample only part of the combinations while still guaranteeing convergence and traversability, and has therefore become widely used. However, MCMC is not able […]

CUDA

Oct, 18

Comparative Evaluation of Binary Features

Performance evaluation of salient features has a long-standing tradition in computer vision. In this paper, we fill the gap of evaluation for the recent wave of binary feature descriptors, which aim to provide robustness while achieving high computational efficiency. We use established metrics to embed our assessment into the body of existing evaluations, allowing us […]

Oct, 18

Computing Reachable Sets via Barrier Methods on SIMD Architectures

We consider the problem of computing reachable sets of ODE-based control systems in parallel on CUDA hardware. To this end, we modify an existing algorithm based on solving optimal control problems. The idea is to simplify the optimal control problems to pure feasibility problems instead of minimizing an objective function. We show that an interior point […]

CUDA

Oct, 18

Strain Visualization of Ultra Sound Signals Processed by General Purpose Graphic Process Unit

Medical imaging is a technique aimed at developing tools to improve medical diagnosis and procedures. Medical imaging applications use different types of signals to create images of the inside of the human body. It is a very important technique that decreases the complication rate during procedures and provides more information for more accurate diagnosis. […]

CUDA

Oct, 18

Efficient 2D Software Rendering

The computer graphics market is dominated by GPU-based technologies. However, today’s fast central processing units (CPUs), based on modern architectural designs, offer new opportunities in the field of classical software rendering. Because the technological development of the GPU architecture has almost reached its limits in terms of the programming model, the CPU-based solutions […]

OpenGL

* * *

Recent source codes

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Ouroboros: Virtualized Queues for dynamic memory management

MSCCL++: A GPU-driven communication stack for scalable AI applications

Benchmark compute shader of Unity against InteropUnityCUDA
