Posts
Nov, 22
GPU implementation of a road sign detector based on particle swarm optimization
Road sign detection is a major goal of Advanced Driving Assistance Systems. Most published work on this problem shares the same approach, in which signs are first detected and then classified in video sequences, even if different techniques are used. While detection is usually performed using classical computer vision techniques based on color and/or […]
Nov, 22
CUDA by Example: An Introduction to General-Purpose GPU Programming
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples. After a concise introduction to the CUDA platform and architecture, as well as a quick-start guide to CUDA C, the book details […]
Nov, 22
Micropolygon ray tracing with defocus and motion blur
We present a micropolygon ray tracing algorithm that is capable of efficiently rendering high-quality defocus and motion blur effects. A key component of our algorithm is a BVH (bounding volume hierarchy) based on 4D hyper-trapezoids that project into 3D OBBs (oriented bounding boxes) in spatial dimensions. This acceleration structure is able to provide tight […]
Nov, 22
Octree-based, GPU implementation of a continuous cellular automaton for the simulation of complex, evolving surfaces
Presently, dynamic surface-based models are required to contain increasingly large numbers of points and to propagate them over longer time periods. For large numbers of surface points, the octree data structure can be used as a balance between low memory occupation and relatively rapid access to the stored data. For evolution rules that depend on […]
Nov, 22
Algorithm level power efficiency optimization for CPU-GPU processing element in data intensive SIMD/SPMD computing
Power efficiency investigation has been required at each level of a High Performance Computing (HPC) system because of the increasing computation demands of scientific and engineering applications. Focusing on handling the critical design constraints at the software level that run beyond a parallel system composed of huge numbers of power-hungry components, we optimize HPC program design […]
Nov, 22
Higher-order CFD and Interface Tracking Methods on Highly-Parallel MPI and GPU systems
A computational investigation of the effects of higher-order accurate schemes on parallel performance was carried out on two different computational systems: a traditional CPU-based MPI cluster and a system of four Graphics Processing Units (GPUs) controlled by a single quad-core CPU. The investigation was based on the solution of the level set equations for […]
Nov, 22
Efficient simulation of agent-based models on multi-GPU and multi-core clusters
An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory […]
Nov, 22
GPU-accelerated elastic 3D image registration for intra-surgical applications
Local motion within intra-patient biomedical images can be compensated by using elastic image registration. The application of B-spline based elastic registration during interventional treatment is seriously hampered by its considerable computation time. The graphics processing unit (GPU) can be used to accelerate the calculation of such elastic registrations by using its parallel processing power, and […]
Nov, 22
GPU accelerated tensor contractions in the plaquette renormalization scheme
We use the graphics processing unit (GPU) to accelerate the tensor contractions, which are the most time-consuming operations in the variational method based on the plaquette renormalized states. Using a frustrated Heisenberg J1-J2 model on a square lattice as an example, we implement the algorithm based on the Compute Unified Device Architecture (CUDA). For […]
Nov, 22
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy
The phase-field simulation for dendritic solidification of a binary alloy has been accelerated by using a Graphics Processing Unit (GPU). To perform the phase-field simulation of the alloy solidification on a GPU, a program was developed with the Compute Unified Device Architecture (CUDA). In this paper, the implementation technique of the phase-field model on GPU is […]
Nov, 22
GPU-accelerated molecular modeling coming of age
Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over […]
Nov, 22
Accelerating electrostatic surface potential calculation with multi-scale approximation on graphics processing units
Tools that compute and visualize biomolecular electrostatic surface potential have been used extensively for studying biomolecular function. However, determining the surface potential for large biomolecules on a typical desktop computer can take days or longer using currently available tools and methods. Two commonly used techniques to speed up these types of electrostatic computations are approximations based […]