Posts
Nov, 6
Graphics Hardware-Based Level-Set Method for Interactive Segmentation and Visualization
This paper presents an efficient graphics hardware-based method to segment and visualize level-set surfaces at interactive rates. Our method is composed of a memory manager, a level-set solver, and a volume renderer. The memory manager, which runs on the CPU, generates the page table, inverse page table, and available page stack, and processes the activation and inactivation of […]
Nov, 6
PacketShader: a GPU-accelerated software router
We present PacketShader, a high-performance software router framework for general packet processing with Graphics Processing Unit (GPU) acceleration. PacketShader exploits the massively parallel processing power of the GPU to address the CPU bottleneck in current software routers. Combined with our high-performance packet I/O engine, PacketShader outperforms existing software routers by more than a factor of four, forwarding […]
Nov, 5
An Introduction to GPU Accelerated Surgical Simulation
Modern graphics processing units (GPUs) have recently become fully programmable. Thus, a powerful and cost-efficient new computational platform for surgical simulations has emerged. A broad selection of publications has shown that scientific computations obtain a significant speedup if ported from the CPU to the GPU. To take advantage of the GPU, however, one must understand […]
Nov, 5
Multi-Level Graph Layout on the GPU
This paper presents a new algorithm for force-directed graph layout on the GPU. The algorithm, whose goal is to compute layouts accurately and quickly, makes two contributions. The first is a general multi-level scheme based on spectral partitioning. The second is computing the layout on the GPU. Since the […]
Nov, 5
GPU’s for event reconstruction in the FairRoot framework
FairRoot is the simulation and analysis framework used by the CBM and PANDA experiments at FAIR/GSI. The use of graphics processing units (GPUs) for event reconstruction in FairRoot will be presented. The fact that CUDA (Nvidia’s Compute Unified Device Architecture) development tools work alongside the conventional C/C++ compiler makes it possible to mix GPU code with […]
Nov, 5
The GPU as numerical simulation engine
Many computer graphics applications require high-intensity numerical simulation. The question arises whether such computations can be performed efficiently on the GPU, which has emerged as a full-function streaming processor with high floating-point performance. We show in this paper that this is indeed the case, using two basic, broadly useful computational kernels as examples. […]
Nov, 5
Using GPUs for Machine Learning Algorithms
Using dedicated hardware to do machine learning typically ends up in disaster because of cost, obsolescence, and poor software. The popularization of Graphics Processing Units (GPUs), which are now available on every PC, provides an attractive alternative. We propose a generic 2-layer fully connected neural network GPU implementation which yields over 3X speedup for both […]
Nov, 5
Clustering billions of data points using GPUs
In this paper, we report our research on using GPUs to accelerate clustering of very large data sets, which are common in today’s real-world applications. While many published works have shown that GPUs can be used to accelerate various general-purpose applications with respectable performance gains, few attempts have been made to tackle very […]
Nov, 5
Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing
This paper presents an effective scheme for clustering a huge data set using a PC cluster system, in which each PC is equipped with a commodity programmable graphics processing unit (GPU). The proposed scheme is devised to achieve three-level hierarchical parallel processing of massive data clustering. The divide-and-conquer approach to parallel data clustering is employed […]
Nov, 5
On Dynamic Load Balancing on Graphics Processors
To get maximum performance on many-core graphics processors, it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically […]
Nov, 5
Coarse grain parallelization of evolutionary algorithms on GPGPU cards with EASEA
This paper presents a straightforward implementation of a standard evolutionary algorithm that evaluates its population in parallel on a GPGPU card. Tests done on a benchmark and a real-world problem using an old NVidia 8800GTX card and a newer but not top-of-the-range GTX260 card show a roughly 30x (resp. 100x) speedup […]
Nov, 5
MPI within a GPU
GPUs offer high-performance floating-point computation at commodity prices, but their usage is hindered by programming models which expose the user to irregularities in the current shared-memory environments and require learning new interfaces and semantics. This thesis will demonstrate that the message-passing paradigm can be conceptually cleaner than the current data-parallel models for programming GPUs because […]