Posts

Feb, 1

SPH on GPU with CUDA

A Smoothed Particle Hydrodynamics (SPH) method for free surface flows has been implemented on a graphical processing unit (GPU) using the Compute Unified Device Architecture (CUDA) developed by Nvidia, resulting in tremendous speed-ups. The entire SPH code, with its three main components: neighbor list construction, force computation, and integration of the equation of motion, is […]
Feb, 1

Constraint Fluids on GPU

The processing power of graphics hardware has increased tremendously in the last several years, and it is therefore used more and more outside of its intended domain of graphics rendering. This thesis describes the implementation and results of a fluid simulator, using the constraint fluid method, which harnesses the processing power of modern GPUs, in […]
Feb, 1

Computational Fluid Dynamic on GPU

Computational Fluid Dynamics, an important branch of the HPC field, has a history of seeking and requiring higher computational performance. The traditional way to satisfy this quest is to use faster machines or supercomputers. Yet these approaches seem inconvenient and costly to many individual researchers. We investigated the use of GPUs to accelerate CFD codes and […]
Feb, 1

GPU as a Parallel Machine: Sorting on the GPU

Sorting is a fundamental algorithmic building block. One of the most studied problems in computer science is ordering a list of items efficiently. Buck and Purcell showed how the parallel bitonic merge sort algorithm could exploit many of the parallel features of the SIMD architecture of the GPU. Efficient sorting has practical importance to optimizing […]
Feb, 1

Introduction to GPU programming for EDA

Advances in GPU technology have propelled the GPU into arenas far afield from the traditional, isolated roles they have previously played. With hundreds of processing units in a single GPU, substantial speedups can be achieved by harnessing their power to augment the performance of the traditional single- or multi-core CPU on certain compute-intensive applications. However, […]
Feb, 1

Uncontracted Rys Quadrature Implementation of up to G Functions on Graphical Processing Units

An implementation is presented of an uncontracted Rys quadrature algorithm for electron repulsion integrals, including up to g functions on graphical processing units (GPUs). The general GPU programming model, the challenges associated with implementing the Rys quadrature on these highly parallel emerging architectures, and a new approach to implementing the quadrature are outlined. The performance […]
Jan, 31

Towards Automated Learning of Object Detectors

Recognizing arbitrary objects in images or video sequences is a difficult task for a computer vision system. We work towards automated learning of object detectors from video sequences (without user interaction). Our system uses object motion as an important cue to detect independently moving objects in the input sequence. The largest object is always taken […]
Jan, 31

Batched Multi Triangulation

The multi triangulation framework (MT) is a very general approach for managing adaptive resolution in triangle meshes. The key idea is arranging mesh fragments at different resolutions in a directed acyclic graph (DAG) which encodes the dependencies between fragments, thereby encompassing a wide class of multiresolution approaches that use hierarchies or DAGs with predefined topology. […]
Jan, 31

Advanced Multi-Frame Rate Rendering Techniques

Multi-frame rate rendering is a parallel rendering technique that renders interactive parts of the scene on one graphics card while the rest of the scene is rendered asynchronously on a second graphics card. The resulting color and depth images of both render processes are composited and displayed. This paper presents advanced multi-frame rate rendering techniques, […]
Jan, 31

Isosurface Extraction and View-Dependent Filtering from Time-Varying Fields Using Persistent Time-Octree (PTOT)

We develop a new algorithm for isosurface extraction and view-dependent filtering from large time-varying fields, by using a novel persistent time-octree (PTOT) indexing structure. Previously, the persistent octree (POT) was proposed to perform isosurface extraction and view-dependent filtering, which combines the advantages of the interval tree (for optimal searches of active cells) and of the […]
Jan, 31

Scientific Computing on Heterogeneous Architectures

The CPU has traditionally been the computational workhorse in scientific computing, but we have seen a tremendous increase in the use of accelerators, such as Graphics Processing Units (GPUs), in the last decade. These architectures are used because they consume less power and offer higher performance than equivalent CPU solutions. They are typically also […]
Jan, 31

OpenRCL: Low-Power High-Performance Computing with Reconfigurable Devices

This work presents the Open Reconfigurable Computing Language (OpenRCL) system designed to enable low-power high-performance reconfigurable computing with imperative programming languages such as C/C++. The key idea is to expose the FPGA platform as a compiler target for applications expressed in the OpenCL paradigm. To this end, we present a combination of low-level virtual machine […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
