high performance computing on graphics processing units: hgpu.org

Posts

Oct, 20

Ray Tracing of Volumetric Data in Real Time

Graphics processors of today are highly efficient, parallel processors, capable of rendering complex scenes consisting of millions of polygons on the screen each and every second. They are highly specialized towards game graphics and similar polygon-based graphics. In the past, however, they have not been very efficient at rendering volumetric data, and especially not […]

OpenCL

Oct, 20

The impact of GPU/Multicore in Signal Processing: a quantitative approach

This paper presents a meaningful practical performance comparison between the latest generation of Graphics Processing Units (GPUs) and the latest generation of multi-core CPUs when they are used to run given signal processing algorithms. Two kinds of tests were considered: when GPU pre-designed computational libraries were available, and when the GPU code was developed by the […]

CUDA

Oct, 20

Hierarchical Exploration of Volumes Using Multilevel Segmentation of the Intensity-Gradient Histograms

Visual exploration of volumetric datasets to discover the embedded features and spatial structures is a challenging and tedious task. In this paper we present a semi-automatic approach to this problem that works by visually segmenting the intensity-gradient 2D histogram of a volumetric dataset into an exploration hierarchy. Our approach mimics user exploration behavior by analyzing […]

CUDA

Oct, 20

Empirical analysis of a parallel data mining algorithm on a graphic processor

In this thesis, we empirically analyze a different, GPU-based approach to the SPAM algorithm (Sequential PAttern Mining using A Bitmap Representation) by J. Ayres, J. Gehrke, T. Yiu and J. Flannick from Cornell University. SPAM is a novel approach for FSM (Frequent Sequence Mining) where the algorithm is not looking […]

CUDA

Oct, 20

Research on DSP-GPU Heterogeneous Computing System

In this paper, a DSP-GPU heterogeneous computing system is studied and its system architecture is designed. The task scheduling model is analyzed, and the discrete particle swarm optimization algorithm is used for DSP-GPU heterogeneous computing. A communication framework between the DSP and the GPU is designed for the heterogeneous computing.

OpenCL

Oct, 19

A short guide to CUDA C: For physicists with multi-core graphics cards

The purpose of this guide is to give a quick introduction to CUDA C, NVIDIA’s extension of the C programming language that allows code to run in parallel on graphics cards. Moreover, the technique is applied to problems in computational physics, namely the generation of random numbers and simulations of the q-state Potts model. If you seek […]

CUDA

Oct, 19

A Modular Framework for Deformation and Fracture using GPU Shaders

Advances in the graphical realism of modern video games have been achieved mainly through the development of the GPU (Graphics Processing Unit), providing a dedicated graphics co-processor and framebuffer. The most recent GPUs are extremely capable and so flexible that it is now possible to implement a wide range of algorithms on graphics hardware that […]

Oct, 19

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation

The main goal of this paper is to discuss new and faster methods for testing the strength of security used in today’s wireless networks. This paper discusses probabilistic password generation methods for testing the real security of networks protected by the WPA/WPA2 PSK (Pre-shared key) standard (also known as Personal mode). The main advantage of using those […]

CUDA

Oct, 19

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

The main advantage of a distributed computing system over a standalone computer is the ability to share the workload between cores, processors and computers. In our paper we present a hybrid cluster system – a novel computing architecture with multi-core CPUs working together with many-core GPUs. It integrates two types of CPU, i.e., Intel and AMD […]

OpenCL

Oct, 19

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Computational inference of causal relationships underlying complex networks, such as gene-regulatory pathways, is NP-complete due to its combinatorial nature when permuting all possible interactions. Markov chain Monte Carlo (MCMC) has been introduced to sample only part of the combinations while still guaranteeing convergence and traversability, and has therefore become widely used. However, MCMC is not able […]

CUDA

Oct, 18

Comparative Evaluation of Binary Features

Performance evaluation of salient features has a long-standing tradition in computer vision. In this paper, we fill the gap of evaluation for the recent wave of binary feature descriptors, which aim to provide robustness while achieving high computational efficiency. We use established metrics to embed our assessment into the body of existing evaluations, allowing us […]

Oct, 18

Computing Reachable Sets via Barrier Methods on SIMD Architectures

We consider the problem of computing reachable sets of ODE-based control systems in parallel on CUDA hardware. To this end, we modify an existing algorithm based on solving optimal control problems. The idea is to simplify the optimal control problems to pure feasibility problems instead of minimizing an objective function. We show that an interior point […]

CUDA