1057

Posts

Oct, 27

Using graphics devices in reverse: GPU-based Image Processing and Computer Vision

Graphics and vision are approximate inverses of each other: ordinarily graphics processing units (GPUs) are used to convert “numbers into pictures” (i.e. computer graphics). In this paper, we discuss the use of GPUs in approximately the reverse way: to assist in “converting pictures into numbers” (i.e. computer vision). For graphical operations, GPUs currently provide many […]
Oct, 27

Stackless KD-Tree Traversal for High Performance GPU Ray Tracing

Significant advances have been achieved for realtime ray tracing recently, but realtime performance for complex scenes still requires large computational resources not yet available from the CPUs in standard PCs. Incidentally, most of these PCs also contain modern GPUs that do offer much larger raw compute power. However, limitations in the programming and memory […]
Oct, 27

PyCUDA: GPU Run-Time Code Generation for High-Performance Computing

High-performance scientific computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving […]
Oct, 27

Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU

Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications. Our analysis of a set of important throughput computing kernels shows that there is an ample amount of parallelism in these […]
Oct, 27

Efficient GPU-Based Texture Interpolation using Uniform B-Splines

This article presents uniform B-spline interpolation, completely contained on the graphics processing unit (GPU). This implies that the CPU does not need to compute any lookup tables or B-spline basis functions. The cubic interpolation can be decomposed into several linear interpolations [Sigg and Hadwiger 05], which are hard-wired on the GPU and therefore very fast. […]
Oct, 27

GPU-based ultrafast IMRT plan optimization

The widespread adoption of on-board volumetric imaging in cancer radiotherapy has stimulated research efforts to develop online adaptive radiotherapy techniques to handle the inter-fraction variation of the patient’s geometry. Such efforts face major technical challenges to perform treatment planning in real time. To overcome this challenge, we are developing a supercomputing online re-planning environment (SCORE) […]
Oct, 27

Understanding the efficiency of GPU algorithms for matrix-matrix multiplication

Utilizing graphics hardware for general purpose numerical computations has become a topic of considerable interest. The implementation of streaming algorithms, typified by highly parallel computations with little reuse of input data, has been widely explored on GPUs. We relax the streaming model’s constraint on input reuse and perform an in-depth analysis of dense matrix-matrix multiplication, […]
Oct, 27

GPU Based Acceleration of Telegraph Equation

In a matter of just a few years, the programmable graphics processor unit has evolved into an absolute computing workhorse. With multiple cores driven by very high memory bandwidth, today’s GPUs offer incredible resources for both graphics and non-graphics processing. An original mathematical method “Modern Taylor Series Method” (MTSM) which uses the Taylor series method […]
Oct, 27

Importance of Explicit Vectorization for CPU and GPU Software Performance

Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on both central processing unit (CPU) and GPU implementations of […]
Oct, 27

GPU computing for systems biology

The development of detailed, coherent, models of complex biological systems is recognized as a key requirement for integrating the increasing amount of experimental data. In addition, in-silico simulation of bio-chemical models provides an easy way to test different experimental conditions, helping in the discovery of the dynamics that regulate biological systems. However, the computational power […]
Oct, 27

On the limits of GPU acceleration

This paper throws a small “wet blanket” on the hot topic of GPGPU acceleration, based on experience analyzing and tuning both multithreaded CPU and GPU implementations of three computations in scientific computing. These computations–(a) iterative sparse linear solvers; (b) sparse Cholesky factorization; and (c) the fast multipole method–exhibit complex behavior and vary in computational intensity […]
Oct, 27

GPU sample sort

In this paper, we present the design of a sample sort algorithm for manycore GPUs. Despite being one of the most efficient comparison-based sorting algorithms for distributed memory architectures, its performance on GPUs was previously unknown. For uniformly distributed keys, our sample sort is at least 25% and on average 68% faster than the best […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
