
Posts

Jan, 4

Evaluating polynomials in several variables and their derivatives on a GPU computing processor

To obtain more accurate solutions of polynomial systems with numerical continuation methods, we use multiprecision arithmetic. Our goal is to offset the overhead of double-double arithmetic by accelerating the path trackers, and in particular Newton’s method, with a general-purpose graphics processing unit. In this paper we describe algorithms for the massively parallel […]
Jan, 4

Thermal and Athermal Swarms of Self-Propelled Particles

Swarms of self-propelled particles exhibit complex behavior that can arise from simple models, with large changes in swarm behavior resulting from small changes in model parameters. We investigate the steady-state swarms formed by self-propelled Morse particles in three dimensions using molecular dynamics simulations optimized for GPUs. We find a variety of swarms of different overall […]
Jan, 4

Decoupled Deferred Shading for Hardware Rasterization

In this paper we present decoupled deferred shading: a rendering technique based on a new data structure called compact geometry buffer, which stores shading samples independently from the visibility. This enables caching and efficient reuse of shading computation, e.g. for stochastic rasterization techniques. In contrast to previous methods, our decoupled shading can be efficiently implemented […]
Jan, 4

Building Source-to-Source Compilers for Heterogeneous Targets

Heterogeneous computers – platforms that make use of multiple specialized devices to achieve high throughput or low energy consumption – are difficult to program. Hardware vendors usually provide compilers from a C dialect to their machines, but complete application rewriting is frequently required to take advantage of them. In this thesis, we propose a new […]
Jan, 4

GPU TV-L1 Optical Flow

Determining optical flow, the pattern of apparent motion of objects caused by the relative motion between observer and objects in the scene, is a fundamental problem in computer vision. Given two images, the goal is to compute the 2D motion field – a projection of 3D velocities of surface points onto the imaging surface. Optical flow […]
Jan, 4

Parallel Implementation Algorithm of Motion Estimation for GPU Applications

The video coding standard H.264/AVC can achieve higher coding efficiency than previous standards. However, this comes at the expense of increased encoding complexity, especially for the motion estimation process, which is very time-consuming even for current central processing units (CPUs). On the other hand, due to the rapid growth of the processing capability […]
Jan, 4

Efficient and Good Delaunay Meshes From Random Points

We present a Conforming Delaunay Triangulation (CDT) algorithm based on maximal Poisson disk sampling. Points are unbiased, meaning the probability of introducing a vertex in a disk-free subregion is proportional to its area, except in a neighborhood of the domain boundary. In contrast, Delaunay refinement CDT algorithms place points dependent on the geometry of empty […]
Jan, 3

GPGPU Accelerated Texture-Based Radiosity

Radiosity is a popular global illumination algorithm capable of achieving photorealistic rendering results. However, its use in interactive environments is limited by its computational complexity. This paper presents a GPGPU-based implementation of the gathering radiosity approach using texture-based discretisation and the OpenCL framework. Hemicubes are rendered to a texture array and processed by OpenCL kernels […]
Jan, 3

OpenCL Sparse Linear Solver for Circuit Simulation

Sparse linear systems are found in many common scientific and engineering problems. In VLSI CAD tools, performing DC circuit analysis can create large, sparse systems represented by huge matrices. Solving such systems can take orders of magnitude of time to compute. Many attempts have been made to parallelize algorithms to solve these matrices. Graphics cards, […]
Jan, 3

Architecture-Aware Mapping and Optimization on a 1600-Core GPU

The graphics processing unit (GPU) continues to make inroads as a computational accelerator for high-performance computing (HPC). However, despite its increasing popularity, mapping and optimizing GPU code remains a difficult task; it is a multi-dimensional problem that requires deep technical knowledge of GPU architecture. Although substantial literature exists on how to map and optimize GPU […]
Jan, 3

A Scalable Framework for Monte Carlo Simulation Using FPGA-based Hardware Accelerators with Application to SPECT Imaging

As the number of transistors integrated onto a silicon die continues to increase, compute power is becoming a commodity. This has enabled a whole host of new applications that rely on high-throughput computations. Recently, the need for faster and cost-effective applications in form-factor constrained environments has driven an interest in on-chip acceleration […]
Jan, 3

Research and Application of Parallel Computing Technologies based on CUDA and OpenCL

The increased computational performance in science and engineering has led to a strong need for arithmetic-intensive parallel computing. CUDA (Compute Unified Device Architecture) and OpenCL (Open Computing Language) are parallel computing technologies proposed in recent years, both of which have broad application prospects in the field of high-performance computing. In this […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
