high performance computing on graphics processing units: hgpu.org

Posts

Mar, 23

Modeling GPU-CPU Workloads and Systems

Heterogeneous systems, systems with multiple processors tailored for specialized tasks, are challenging programming environments. While it may be possible for domain experts to optimize a high performance application for a very specific and well documented system, the application may not perform as well or even function on a different system. Developers who have less experience with […]

Mar, 23

GViM: GPU-accelerated virtual machines

The use of virtualization to abstract underlying hardware can aid in sharing such resources and in efficiently managing their use by high performance applications. Unfortunately, virtualization also prevents efficient access to accelerators, such as Graphics Processing Units (GPUs), that have become critical components in the design and architecture of HPC systems. Supporting General Purpose computing […]

CUDA

Mar, 23

GPU-Assisted Computation of Centroidal Voronoi Tessellation

Centroidal Voronoi tessellations (CVT) are widely used in computational science and engineering. The most commonly used method is Lloyd's method, and the L-BFGS method was recently shown to be faster than Lloyd's method for computing the CVT. However, these methods run on the CPU and are still too slow for many practical applications. We present […]

Mar, 23

GPU Random Numbers via the Tiny Encryption Algorithm

Random numbers are extensively used on the GPU. As more computation is ported to the GPU, it can no longer be treated as rendering hardware alone. Random number generators (RNG) are expected to cater to general purpose and graphics applications alike. Such diversity adds to the expected requirements of an RNG. A good GPU RNG should be […]

CUDA

Mar, 23

Aspects of GPU for general purpose high performance computing

We discuss hardware and software aspects of GPGPU, specifically focusing on NVIDIA cards and CUDA, from the viewpoint of parallel computing. The major weaknesses of GPUs relative to the newest supercomputers are summarized in four points: large SIMD vector length, small memory, absence of fast L2 cache, and high register spill […]

CUDA

Mar, 23

Dense linear algebra solvers for multicore with GPU accelerators

Solving dense linear systems of equations is a fundamental problem in scientific computing. Numerical simulations involving complex systems represented in terms of unknown variables and relations between them often lead to linear systems of equations that must be solved as fast as possible. We describe current efforts toward the development of these critical solvers […]

Mar, 23

GPU Accelerators for Evolvable Cellular Automata

In order to design cellular automata rules by means of evolutionary algorithms, high computational demands need to be met. This problem may be partially solved by parallelization. Since parallel supercomputers and server clusters are expensive and often overburdened, this paper proposes the evolution of cellular automata rules on small and inexpensive graphics processing units. The […]

CUDA

Mar, 23

Data Visualization and Mining using the GPU

An exciting development in the computing industry has been the emergence of graphics processing units (GPUs) as fast general purpose co-processors. Initially designed for gaming applications, today's GPUs demonstrate impressive computing power and high levels of parallelism and are now being used for a variety of applications far removed from traditional graphics rendering […]

Mar, 23

On Using GPU to Compute Options and Derivatives

Algorithmic trading has created an increasing demand for high performance computing solutions within financial organizations. Those involved in portfolio management and risk assessment are obliged to increase their computing resources in order to provide competitive models for financial management and for pricing financial instruments. GPU stands for “Graphics Processing Unit”. GPU processing (or Stream Processing) […]

CUDA

Mar, 23

Efficient stream reduction on the GPU

Stream reduction is the process of removing unwanted elements from a stream of outputs. It is a key component of many GPGPU algorithms, especially in multi-pass algorithms: the stream reduction is used to remove unwanted elements from the output of a previous pass before sending it as input for the next pass. In this paper, […]

OpenGL
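
As a rough illustration of the stream reduction described in the entry above, here is a minimal sequential Python sketch of the generic flag, prefix-sum, and scatter approach to compaction. It is not the method from the paper, whose details are cut off in the excerpt. The names `stream_reduce`, `keep`, and `previous_pass` are hypothetical and chosen only for this example.

```python
from itertools import accumulate


def stream_reduce(stream, keep):
    """Compact `stream`, keeping only elements for which keep(x) is true.

    Sequential sketch of a scan-and-scatter compaction; on a GPU each
    step would run in parallel over the elements.
    """
    # Step 1: flag each element (1 = keep, 0 = discard).
    flags = [1 if keep(x) else 0 for x in stream]

    # Step 2: an exclusive prefix sum of the flags gives every kept
    # element its index in the compacted output.
    inclusive = list(accumulate(flags))
    offsets = [s - f for s, f in zip(inclusive, flags)]

    # Step 3: scatter the kept elements to their computed positions.
    total = inclusive[-1] if inclusive else 0
    out = [None] * total
    for x, f, o in zip(stream, flags, offsets):
        if f:
            out[o] = x
    return out


# Hypothetical output of a previous pass; negative values are "unwanted".
previous_pass = [3, -1, 4, -1, -5, 9, 2, -6]
next_pass_input = stream_reduce(previous_pass, lambda v: v >= 0)
print(next_pass_input)  # [3, 4, 9, 2]
```

The prefix sum is what makes the scatter step parallelizable: every kept element knows its output slot without waiting on its neighbours.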

Mar, 22

A GPGPU solution of the FMM near interactions for acoustic scattering problems

The Fast Multipole Method (FMM) is especially suitable for applications in which it is necessary to predict acoustic scattering, e.g., aircraft noise control. This accelerated iterative method has two main parts, far interactions and near interactions. Near interactions are computationally intensive and fit well in the Single Instruction Multiple Threads paradigm. In this […]

Mar, 22

Fast and accurate PIV computation using highly parallel iterative correlation maximization

Our contribution deals with fast computation of dense two-component (2C) PIV vector fields using Graphics Processing Units (GPUs). We show that iterative gradient-based cross-correlation optimization is an accurate and efficient alternative to multi-pass processing with FFT-based cross-correlation. Density is meant here from the sampling point of view (we obtain one vector per pixel), since the […]

CUDA