Posts
Oct, 27
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
GPUs have recently attracted the attention of many application developers as commodity data-parallel coprocessors. The newest generations of GPU architecture provide easier programmability and increased generality while maintaining the tremendous memory bandwidth and computational power of traditional GPUs. This opportunity should redirect efforts in GPGPU research from ad hoc porting of applications to establishing principles […]
Oct, 27
A GPU based real-time software correlation system for the Murchison Widefield Array prototype
Modern graphics processing units (GPUs) are inexpensive commodity hardware that offer Tflop/s theoretical computing capacity. GPUs are well suited to many compute-intensive tasks including digital signal processing. We describe the implementation and performance of a GPU-based digital correlator for radio astronomy. The correlator is implemented using the NVIDIA CUDA development environment. We evaluate three design […]
Oct, 27
Using graphics devices in reverse: GPU-based Image Processing and Computer Vision
Graphics and vision are approximate inverses of each other: ordinarily graphics processing units (GPUs) are used to convert “numbers into pictures” (i.e. computer graphics). In this paper, we discuss the use of GPUs in approximately the reverse way: to assist in “converting pictures into numbers” (i.e. computer vision). For graphical operations, GPUs currently provide many […]
Oct, 27
Stackless KD-Tree Traversal for High Performance GPU Ray Tracing
Significant advances have been achieved for realtime ray tracing recently, but realtime performance for complex scenes still requires large computational resources not yet available from the CPUs in standard PCs. Incidentally, most of these PCs also contain modern GPUs that do offer much larger raw compute power. However, limitations in the programming and memory […]
Oct, 27
PyCUDA: GPU Run-Time Code Generation for High-Performance Computing
High-performance scientific computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving […]
Oct, 27
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications. Our analysis of a set of important throughput computing kernels shows that there is an ample amount of parallelism in these […]
Oct, 27
Efficient GPU-Based Texture Interpolation using Uniform B-Splines
This article presents uniform B-spline interpolation, completely contained on the graphics processing unit (GPU). This implies that the CPU does not need to compute any lookup tables or B-spline basis functions. The cubic interpolation can be decomposed into several linear interpolations [Sigg and Hadwiger 05], which are hard-wired on the GPU and therefore very fast. […]
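The decomposition mentioned in this excerpt can be illustrated in one dimension. The following is a minimal CPU-side sketch, not the article's implementation: it assumes the standard uniform cubic B-spline basis weights, and `lerp_fetch` is a hypothetical stand-in for a hardware linear texture fetch. All function names are invented for illustration. It shows that a four-sample cubic filter can be evaluated as a weighted sum of two linear fetches at shifted positions.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    w0 = (1 - t) ** 3 / 6
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    w3 = t**3 / 6
    return w0, w1, w2, w3

def lerp_fetch(samples, x):
    """Hypothetical stand-in for a linearly filtered texture fetch at position x."""
    i = math.floor(x)
    a = x - i
    return (1 - a) * samples[i] + a * samples[i + 1]

def cubic_direct(samples, x):
    """Reference: weighted sum of the four neighbouring samples."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(wk * samples[i - 1 + k] for k, wk in enumerate(w))

def cubic_two_lerps(samples, x):
    """Same filter expressed as two linear fetches at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3      # combined weight of each sample pair
    h0 = i - 1 + w1 / g0           # fetch position between samples i-1 and i
    h1 = i + 1 + w3 / g1           # fetch position between samples i+1 and i+2
    return g0 * lerp_fetch(samples, h0) + g1 * lerp_fetch(samples, h1)

if __name__ == "__main__":
    data = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
    for x in (1.0, 1.25, 2.5, 3.9):
        print(x, cubic_direct(data, x), cubic_two_lerps(data, x))
```

Both functions should agree up to floating-point rounding for any `x` whose four neighbouring samples lie inside the array. On a GPU the two `lerp_fetch` calls would map to filtered texture reads, which is where the speed-up described in the excerpt would come from.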
Oct, 27
GPU-based ultrafast IMRT plan optimization
The widespread adoption of on-board volumetric imaging in cancer radiotherapy has stimulated research efforts to develop online adaptive radiotherapy techniques to handle the inter-fraction variation of the patient’s geometry. Such efforts face major technical challenges to perform treatment planning in real time. To overcome this challenge, we are developing a supercomputing online re-planning environment (SCORE) […]
Oct, 27
Understanding the efficiency of GPU algorithms for matrix-matrix multiplication
Utilizing graphics hardware for general purpose numerical computations has become a topic of considerable interest. The implementation of streaming algorithms, typified by highly parallel computations with little reuse of input data, has been widely explored on GPUs. We relax the streaming model’s constraint on input reuse and perform an in-depth analysis of dense matrix-matrix multiplication, […]
Oct, 27
GPU Based Acceleration of Telegraph Equation
In a matter of just a few years, the programmable graphics processor unit has evolved into an absolute computing workhorse. With multiple cores driven by very high memory bandwidth, today’s GPUs offer incredible resources for both graphics and non-graphics processing. An original mathematical method “Modern Taylor Series Method” (MTSM) which uses the Taylor series method […]
Oct, 27
Importance of Explicit Vectorization for CPU and GPU Software Performance
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on both central processing unit (CPU) and GPU implementations of […]
Oct, 27
GPU computing for systems biology
The development of detailed, coherent, models of complex biological systems is recognized as a key requirement for integrating the increasing amount of experimental data. In addition, in-silico simulation of bio-chemical models provides an easy way to test different experimental conditions, helping in the discovery of the dynamics that regulate biological systems. However, the computational power […]