Posts
Nov, 19
Accelerating numerical solution of stochastic differential equations with CUDA
Numerical integration of stochastic differential equations is commonly used in many branches of science. In this paper we present how to accelerate this kind of numerical calculation with popular NVIDIA Graphics Processing Units using the CUDA programming environment. We address general aspects of numerical programming on stream processors and illustrate them by two examples: the […]
Nov, 19
Using Graphics Processors to Facilitate Explicit Digital Electrochemical Simulation: Theory of Elliptical Disc Electrodes
The use of graphics processors under the heading GPGPU (General-Purpose computation on GPUs (Graphics Processing Units)) promises a computational advance which may greatly facilitate the use of explicit digital simulation for non-trivial problems. This paper illustrates the use of GPGPU for the simulation of mass transport processes at elliptically shaped electrodes and for deformed microelectrodes. […]
Nov, 19
An optimised radial basis function algorithm for fast non-rigid registration of medical images
The registration of multi-modal medical image data is important in the fields of image-guided surgery and computer-aided medical diagnosis. Registration accuracy is of utmost importance in both fields; however, in the former, the speed of registration is equally important. In this paper, we present a point-based “fast” non-rigid registration algorithm which exhibits significant […]
Nov, 19
Evaluation of parallel particle swarm optimization algorithms within the CUDA architecture
Particle swarm optimization (PSO), like other population-based meta-heuristics, is intrinsically parallel and can be effectively implemented on Graphics Processing Units (GPUs), which are, in fact, massively parallel processing architectures. In this paper we discuss possible approaches to parallelizing PSO on graphics hardware within the Compute Unified Device Architecture (CUDA), a GPU programming environment by nVIDIA […]
Nov, 19
Parallel Implementation on GPUs of ADI Finite Difference Methods for Parabolic PDEs with Applications in Finance
We study the parallel implementation on a Graphics Processing Unit (GPU) of Alternating Direction Implicit (ADI) time-discretization methods for solving time-dependent parabolic Partial Differential Equations (PDEs) in three spatial dimensions with mixed spatial derivatives in a variety of applications in computational finance. Finite differences on uniform grids are used for the spatial discretization of the […]
Nov, 19
High-Performance Iterative Electron Tomography Reconstruction with Long-Object Compensation using Graphics Processing Units (GPUs)
Iterative reconstruction algorithms pose tremendous computational challenges for 3D Electron Tomography (ET). Similar to X-ray Computed Tomography (CT), graphics processing units (GPUs) offer an affordable platform to meet these demands. In this paper, we outline a CT reconstruction approach for ET that is optimized for the special demands and application setting of ET. It exploits […]
Nov, 19
Rodinia: A benchmark suite for heterogeneous computing
This paper presents and characterizes Rodinia, a benchmark suite for heterogeneous computing. To help architects study emerging platforms such as GPUs (Graphics Processing Units), Rodinia includes applications and kernels which target multi-core CPU and GPU platforms. The choice of applications is inspired by Berkeley’s dwarf taxonomy. Our characterization shows that the Rodinia benchmarks cover a […]
Nov, 19
Ultra-fast FFT protein docking on graphics processors
MOTIVATION: Modelling protein-protein interactions (PPIs) is an increasingly important aspect of structural bioinformatics. However, predicting PPIs using in silico docking techniques is computationally very expensive. Developing very fast protein docking tools will be useful for studying large-scale PPI networks, and could contribute to the rational design of new drugs. RESULTS: The Hex spherical polar Fourier […]
Nov, 19
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures
This work presents the first extensive study of single-node performance optimization, tuning, and analysis of the fast multipole method (FMM) on modern multi-core systems. We consider single- and double-precision with numerous performance enhancements, including low-level tuning, numerical approximation, data structure transformations, OpenMP parallelization, and algorithmic tuning. Among our numerous findings, we show that optimization and […]
Nov, 19
Design and Performance Evaluation of Image Processing Algorithms on GPUs
In this paper, we construe key factors in the design and evaluation of image processing algorithms on massively parallel GPUs (graphics processing units) using the CUDA (compute unified device architecture) programming model. A set of metrics, customized for image processing, is proposed to quantitatively evaluate algorithm characteristics. In addition, we show that a range of […]
Nov, 19
Multi-dimensional characterization of temporal data mining on graphics processors
Through the algorithmic design patterns of data parallelism and task parallelism, the graphics processing unit (GPU) offers the potential to vastly accelerate discovery and innovation across a multitude of disciplines. For example, the exponential growth in data volume now presents an obstacle for high-throughput data mining in fields such as neuroscience and bioinformatics. As such, […]
Nov, 19
A two-level real-time vision machine combining coarse- and fine-grained parallelism
In this paper, we describe a real-time vision machine having a stereo camera as input generating visual information on two different levels of abstraction. The system provides visual low-level and mid-level information in terms of dense stereo and optical flow, egomotion, indicating areas with independently moving objects as well as a condensed geometric description of […]