## Posts

Nov, 3

### Maximum mipmaps for fast, accurate, and scalable dynamic height field rendering

This paper presents a GPU-based, fast, and accurate dynamic height field rendering technique that scales well to large scale height fields. Current real-time rendering algorithms for dynamic height fields employ approximate ray-height field intersection methods, whereas accurate algorithms require pre-computation in the order of seconds to minutes and are thus not suitable for dynamic height […]

Nov, 3

### Neural Network Implementation Using CUDA and OpenMP

Many algorithms for image processing and pattern recognition have recently been implemented on GPU (graphic processing unit) for faster computational times. However, the implementation using GPU encounters two problems. First, the programmer should master the fundamentals of the graphics shading languages that require the prior knowledge on computer graphics. Second, in a job which needs […]

Nov, 3

### High-Precision Numerical Simulations of Rotating Black Holes Accelerated by CUDA

Hardware accelerators (such as Nvidia’s CUDA GPUs) have tremendous promise for computational science, because they can deliver large gains in performance at relatively low cost. In this work, we focus on the use of Nvidia’s Tesla GPU for high-precision (double, quadruple and octal precision) numerical simulations in the area of black hole physics — more […]

Nov, 3

### PAPER – Accelerating parallel evaluations of ROCS

Modern graphics processing units (GPUs) are flexibly programmable and have peak computational throughput significantly faster than conventional CPUs. Herein, we describe the design and implementation of PAPER, an open-source implementation of Gaussian molecular shape overlay for NVIDIA GPUs. We demonstrate one to two order-of-magnitude speedups on high-end commodity GPU hardware relative to a reference CPU […]

Nov, 3

### Real-time KD-tree construction on graphics hardware

We present an algorithm for constructing kd-trees on GPUs. This algorithm achieves real-time performance by exploiting the GPU’s streaming architecture at all stages of kd-tree construction. Unlike previous parallel kd-tree algorithms, our method builds tree nodes completely in BFS (breadth-first search) order. We also develop a special strategy for large nodes at upper tree levels […]

Nov, 3

### Optimal loop unrolling for GPGPU programs (thesis)

Graphics Processing Units (GPUs) are massively parallel, many-core processors with tremendous computational power and very high memory bandwidth. GPUs are primarily designed for accelerating 3D graphics applications on modern computer systems and are therefore specialized for highly data parallel, compute intensive problems, unlike general-purpose CPUs. In recent times, there has been significant interest in finding ways to accelerate general purpose […]

Nov, 3

### Optimal loop unrolling for GPGPU programs

Graphics Processing Units (GPUs) are massively parallel, many-core processors with tremendous computational power and very high memory bandwidth. With the advent of general purpose programming models such as NVIDIA’s CUDA and the new standard OpenCL, general purpose programming using GPUs (GPGPU) has become very popular. However, the GPU architecture and programming model have brought along […]

Nov, 3

### Computer vision signal processing on graphics processing units

In some sense, computer graphics and computer vision are inverses of one another. Special purpose computer vision hardware is rarely found in typical mass-produced personal computers, but graphics processing units (GPUs), found on most personal computers, often exceed (in number of transistors as well as in compute power) the capabilities of the Central Processing Unit […]

Nov, 3

### FDTD calculations using graphical processing units

This paper deals with using tools commonly available to programmers to implement the finite difference time domain (FDTD) calculations using video cards. In the past few years developments in the field of graphic processing units (GPUs) for video cards have vastly outpaced their general central processing unit (CPU) counterparts. As specifically applied to vector mathematic […]

Nov, 3

### Quantum Monte Carlo on graphical processing units

Quantum Monte Carlo (QMC) is among the most accurate methods for solving the time-independent Schrödinger equation. Unfortunately, the method is very expensive and requires a vast array of computing resources in order to obtain results of a reasonable convergence level. On the other hand, the method is not only easily parallelizable across CPU clusters, […]

Nov, 3

### Molecular Dynamics Simulation of Macromolecules Using Graphics Processing Unit

Molecular dynamics (MD) simulation is a powerful computational tool to study the behavior of macromolecular systems. But many simulations of this field are limited in spatial or temporal scale by the available computational resource. In recent years, graphics processing unit (GPU) provides unprecedented computational power for scientific applications. Many MD algorithms suit with the multithread nature of GPU. In this […]

Nov, 2

### On modelling of anisotropic viscoelasticity for soft tissue simulation: numerical solution and GPU execution

Efficient and accurate techniques for simulation of soft tissue deformation are an increasingly valuable tool in many areas of medical image computing, such as biomechanically-driven image registration and interactive surgical simulation. For reasons of efficiency most analyses are based on simplified linear formulations, and previously almost all have ignored well established features of tissue mechanical […]