Posts
Apr, 22
Accelerating the numerical simulation of magnetic field lines in tokamaks using the GPU
trip3d is a field line simulation code that numerically integrates a set of nonlinear magnetic field line differential equations. The code is used to study properties of magnetic islands and stochastic or chaotic field line topologies that are important for designing non-axisymmetric magnetic perturbation coils for controlling plasma instabilities in future machines. The code is […]
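Field line tracing boils down to integrating an ODE system along the field. As a minimal illustration (not trip3d's actual equations, which involve the full perturbed tokamak field), here is a classical fourth-order Runge-Kutta tracer following a toy divergence-free field whose field lines are circles:

```python
import math

def rk4_step(f, y, s, h):
    """One classical fourth-order Runge-Kutta step for y' = f(s, y)."""
    k1 = f(s, y)
    k2 = f(s + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(s + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(s + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def field(s, y):
    # Toy field B = (-y, x)/r: its field lines are circles about the origin,
    # so the radius is a conserved quantity we can check the integrator against.
    x, yy = y
    r = math.hypot(x, yy)
    return [-yy / r, x / r]

def trace(y0, h=0.01, n=1000):
    """Follow one field line for n steps of arc length h."""
    y, s = list(y0), 0.0
    for _ in range(n):
        y = rk4_step(field, y, s, h)
        s += h
    return y
```

Since every field line is integrated independently of the others, the workload is embarrassingly parallel, which is what makes it attractive to map onto a GPU.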
Apr, 22
Scalable Clustering Using Graphics Processors
We present new algorithms for scalable clustering using graphics processors. Our basic approach is based on k-means. By changing the order of determining object labels, and exploiting the high computational power and pipeline of graphics processing units (GPUs) for distance computing and comparison, we speed up the k-means algorithm substantially. We introduce two strategies for […]
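The step the abstract highlights, computing every point-to-centroid distance and picking the minimum, is the data-parallel core that maps well onto a GPU. A minimal pure-Python sketch of the standard k-means loop (an illustration of the baseline algorithm, not the paper's GPU strategies):

```python
def assign_labels(points, centroids):
    # Data-parallel step: each point independently finds its nearest centroid.
    # On a GPU, all of these distance computations run concurrently.
    labels = []
    for p in points:
        dists = [sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels

def update_centroids(points, labels, k, dim):
    # Reduction step: average the points assigned to each cluster.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, labels):
        counts[label] += 1
        for i, pi in enumerate(p):
            sums[label][i] += pi
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

def kmeans(points, centroids, iters=10):
    """Lloyd's algorithm with fixed initial centroids."""
    k, dim = len(centroids), len(points[0])
    for _ in range(iters):
        labels = assign_labels(points, centroids)
        centroids = update_centroids(points, labels, k, dim)
    return labels, centroids
```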
Apr, 22
GPU accelerated simulations of 3D deterministic particle transport using discrete ordinates method
The Graphics Processing Unit (GPU), originally developed for real-time, high-definition 3D graphics in computer games, now offers great capability for scientific applications. The basis of particle transport simulation is the time-dependent, multi-group, inhomogeneous Boltzmann transport equation. The numerical solution to the Boltzmann equation involves the discrete ordinates (Sn) method and the procedure of source iteration. […]
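Source iteration lags the scattering source: each sweep solves the transport operator with the scattering source evaluated from the previous flux iterate, and the fixed point is the converged flux. A drastically simplified sketch for a one-group, infinite homogeneous medium (the real Sn method also discretizes angle and space, which this omits entirely):

```python
def source_iteration(S, sigma_t, sigma_s, tol=1e-10, max_iter=1000):
    """Fixed-point source iteration phi <- (S + sigma_s * phi) / sigma_t.

    Converges geometrically with ratio c = sigma_s / sigma_t < 1 to the
    analytic solution phi = S / (sigma_t - sigma_s)."""
    phi = 0.0
    for _ in range(max_iter):
        phi_new = (S + sigma_s * phi) / sigma_t
        if abs(phi_new - phi) < tol:
            return phi_new
        phi = phi_new
    return phi
```

In the full Sn method, each iteration instead performs a transport sweep over all spatial cells and angular ordinates; those sweeps supply the fine-grained parallelism a GPU implementation exploits.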
Apr, 22
AMD Fusion Developer Summit 2011, AFDS 2011
Heterogeneous computing is moving into the mainstream, and a broader range of applications are already on the way. As the provider of world-class CPUs, GPUs, and APUs, AMD offers unique insight into these technologies and how they interoperate. Attend the AMD Fusion Developer Summit to learn about the opportunities that lie ahead.
Apr, 21
Pretty Good Accuracy in Matrix Multiplication with GPUs
With systems such as Roadrunner, there is a trend in supercomputing to offload parallel tasks to special-purpose co-processors composed of many relatively simple scalar processors. The cheaper commodity-class equivalent of such a processor is the graphics card, potentially offering supercomputer power within the confines of a desktop PC. Graphics […]
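The accuracy concern here comes from accumulating long dot products in low precision. The excerpt does not say which remedy the paper uses, but compensated (Kahan) summation is a standard technique for recovering accuracy lost in a naive running sum, sketched below:

```python
def naive_sum(xs):
    # Plain left-to-right accumulation: small addends can be rounded away
    # entirely once the running sum is large.
    s = 0.0
    for x in xs:
        s += x
    return s

def kahan_sum(xs):
    # Compensated summation: c carries the rounding error of each addition
    # and feeds it back into the next one.
    s = c = 0.0
    for x in xs:
        y = x - c
        t = s + y
        c = (t - s) - y
        s = t
    return s
```

In a matrix product, each output element is exactly such a dot-product accumulation, so the same compensation trick applies per element.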
Apr, 21
Using graphics processors to accelerate the computation of the matrix inverse
We study the use of massively parallel architectures for computing a matrix inverse. Two different algorithms are reviewed, the traditional approach based on Gaussian elimination and the Gauss-Jordan elimination alternative, and several high performance implementations are presented and evaluated. The target architecture is a current general-purpose multicore processor (CPU) connected to a graphics processor (GPU). […]
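The Gauss-Jordan alternative mentioned above eliminates both below and above the pivot in one pass, so the augmented system reduces directly to the identity with the inverse alongside. A minimal Python sketch of that algorithm with partial pivoting (an illustration of the method, not the paper's tuned GPU implementation):

```python
def gauss_jordan_inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment [A | I]; reducing A to I turns the right half into A^-1.
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate the pivot column in every other row (above and below).
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]
```

Unlike Gaussian elimination followed by back-substitution, every elimination step updates whole rows uniformly, a regular access pattern that is convenient for GPU execution.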
Apr, 21
Fast Variable Center-Biased Windowing for High-Speed Stereo on Programmable Graphics Hardware
We present a high-speed dense stereo algorithm that achieves both good quality results and very high disparity estimation throughput on the graphics processing unit (GPU). The key idea is a variable center-biased windowing approach, enabling an adaptive selection of the most suitable support patterns with varying sizes and shapes. As the fundamental construct for variable […]
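For context, the fixed-window baseline that variable center-biased windowing improves on scores each candidate disparity by a sum of absolute differences (SAD) over a square support window. A minimal sketch of that baseline (the paper's contribution is the adaptive window selection, which this omits):

```python
def disparity_sad(left, right, x, y, max_d, win=1):
    """Best disparity for left-image pixel (x, y) by minimizing the sum of
    absolute differences over a (2*win+1) x (2*win+1) fixed window."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_d + 1):
        cost = 0
        for dy in range(-win, win + 1):
            for dx in range(-win, win + 1):
                # Candidate match: left (x+dx, y+dy) vs. right shifted by d.
                cost += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

Every pixel's disparity search is independent, which is why dense stereo maps so naturally onto programmable graphics hardware.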
Apr, 21
Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation
We propose a system-independent representation of sparse matrix formats that allows a compiler to generate efficient, system-specific code for sparse matrix operations. To show the viability of such a representation we have developed a compiler that generates and tunes code for sparse matrix-vector multiplication (SpMV) on GPUs. We evaluate our framework on six state-of-the-art matrix […]
Apr, 21
Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures
Graphics processors are increasingly used in scientific applications due to their high computational power, which comes from hardware with multiple-level parallelism and memory hierarchy. Sparse matrix computations frequently arise in scientific applications, for example, when solving PDEs on unstructured grids. However, traditional sparse matrix algorithms are difficult to efficiently parallelize for GPUs due to irregular […]
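The irregularity comes from the sparse storage format itself: in compressed sparse row (CSR), the most common starting point for GPU SpMV work, each row owns a variable-length slice of the nonzero arrays, so rows do uneven amounts of work. A minimal reference sketch of CSR SpMV:

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a matrix stored in compressed sparse row (CSR) format.

    values:  nonzero entries, row by row
    col_idx: column index of each nonzero
    row_ptr: row r's nonzeros live in values[row_ptr[r]:row_ptr[r+1]]
    """
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y
```

Mapping one thread per row makes the variable trip count of the inner loop visible as load imbalance and uncoalesced access, which is exactly what format- and architecture-aware tuning tries to mitigate.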
Apr, 21
Assessment of GPU computational enhancement to a 2D flood model
This paper presents a study of the computational enhancement of a Graphics Processing Unit (GPU) enabled 2D flood model. The objectives are to demonstrate the significant speedup of a new GPU-enabled full dynamic wave flood model and to present the effect of model spatial resolution on its speedup. A 2D dynamic flood model based on […]
Apr, 21
Solving knapsack problems on GPU
A CUDA-based parallel implementation of the dynamic programming method for the knapsack problem on an NVIDIA GPU is presented. A GTX 260 card with 192 cores (1.4 GHz) is used for computational tests, and the processing times obtained with the parallel code are compared to those of the sequential code on an Intel Xeon 3.0 GHz CPU. The […]
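The dynamic programming recurrence in question processes items one at a time, and within one item all capacity cells of the table can be updated independently; that per-row independence is the axis a CUDA kernel parallelizes. A sequential sketch of the 0/1 knapsack recurrence (the baseline being accelerated, not the paper's CUDA kernel):

```python
def knapsack(weights, profits, capacity):
    """0/1 knapsack via dynamic programming over capacities.

    dp[c] = best profit achievable with capacity c using the items seen so far.
    """
    dp = [0] * (capacity + 1)
    for w, p in zip(weights, profits):
        # All cells of `new` depend only on the previous row `dp`, so this
        # inner loop is what a GPU can compute with one thread per capacity.
        new = dp[:]
        for c in range(w, capacity + 1):
            new[c] = max(dp[c], dp[c - w] + p)
        dp = new
    return dp[capacity]
```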
Apr, 21
Auto-Tuning CUDA Parameters for Sparse Matrix-Vector Multiplication on GPUs
Graphics Processing Unit (GPU) has become an attractive coprocessor for scientific computing due to its massive processing capability. The sparse matrix-vector multiplication (SpMV) is a critical operation in a wide variety of scientific and engineering applications, such as sparse linear algebra and image processing. This paper presents an auto-tuning framework that can automatically compute and […]

