Posts
Mar, 29
Exploiting GPUs to investigate an inversion method that retrieves cardiac conductivities from potential measurements
Accurate cardiac bidomain conductivity values are essential for realistic simulation of various cardiac electrophysiological phenomena. A method was previously developed that can determine the conductivities from measurements of potential on a multi-electrode array placed on the surface of the heart. These conductivities, as well as a value for fibre rotation, are determined using a mathematical […]
Mar, 29
Literature review: Build and Travel KD-Tree with CUDA
Ray tracing is an important and widely used tool in computer graphics. The entertainment and game industries have already benefited a lot from ray tracing. However, designers and end users have long been forced to use off-line ray tracing tools because of the high computational load. In ray tracing, most of the computation is concentrated […]
Mar, 29
Hardware/Software Vectorization for Closeness Centrality on Multi-/Many-Core Architectures
Centrality metrics have been shown to be highly correlated with the importance and loads of the nodes in a network. Given the scale of today’s social networks, it is essential to use efficient algorithms and high performance computing techniques for their fast computation. In this work, we exploit hardware and software vectorization in combination with fine-grain […]
Mar, 28
Improving Cache Locality for GPU-based Volume Rendering
We present a cache-aware method for accelerating texture-based volume rendering on a graphics processing unit (GPU). Because a GPU has a hierarchical architecture in terms of processing and memory units, cache optimization is important to maximize performance for memory-intensive applications. Our method localizes texture memory references according to the location of the viewpoint and dynamically selects […]
Mar, 28
GPU-accelerated automatic identification of robust beam setups for proton and carbon-ion radiotherapy
We demonstrate acceleration on graphics processing units (GPUs) of automatic identification of robust particle therapy beam setups, minimizing negative dosimetric effects of Bragg peak displacement caused by treatment-time patient positioning errors. Our particle therapy research toolkit, RobuR, was extended with OpenCL support and used to implement calculation on GPU of the Port Homogeneity Index, a […]
Mar, 28
Implementation of Just In Time Value Specialization for the Optimization of Data Parallel Kernels
This dissertation explores just-in-time (JIT) specialization as an optimization for OpenCL data-parallel compute kernels. It describes the implementation and performance of two extensions to OpenCL, Bacon and Specialization Annotated OpenCL (SOCL). Bacon is a replacement interface for OpenCL that provides improved usability and has JIT specialization built in. SOCL is a simple extension to OpenCL […]
Mar, 28
Pulse-coupled neural network performance for real-time identification of vegetation during forced landing
Safety concerns in the operation of autonomous aerial systems require safe-landing protocols be followed during situations where the mission should be aborted due to mechanical or other failure. This article presents a pulse-coupled neural network (PCNN) to assist in the vegetation classification in a vision-based landing site detection system for an unmanned aircraft. We propose […]
Mar, 27
Jacobian-free Newton-Krylov methods with GPU acceleration for computing nonlinear ship wave patterns
The nonlinear problem of steady free-surface flow past a submerged source is considered as a case study for three-dimensional ship wave problems. Of particular interest is the distinctive wedge-shaped wave pattern that forms on the surface of the fluid. By reformulating the governing equations with a standard boundary-integral method, we derive a system of nonlinear […]
Mar, 27
2014 Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications, HUCAA 2014, in conjunction with ICPP2014
The workshop on Heterogeneous and Unconventional Cluster Architectures and Applications aims to gather recent work on heterogeneous and unconventional cluster architectures and applications, which might have a big impact on future cluster architectures. This includes any cluster architecture that is not based on the usual commodity components and therefore makes use of some special hard- […]
Mar, 26
Accelerating GPU Implementation of Contourlet Transform
The widespread usage of the contourlet transform (CT) and today’s real-time needs demand faster execution of CT. Solutions are available, but due to lack of portability or computational intensity, they are disadvantageous in real-time applications. In this paper we take advantage of modern GPUs for acceleration. The GPU is well suited to address data-parallel computation applications […]
Mar, 26
A New Parallel Implementation of DSI Based Disparity Computation Using CUDA
Stereo matching techniques are used to extract 3D information from a 2D stereo pair of images. They can be classified into feature-based, window (area)-based, and optimization-based approaches. The feature-based approach generally generates a sparse disparity map with high accuracy and low execution time. The window-based approach produces a dense disparity map with low […]
Mar, 25
BigKernel — High Performance CPU-GPU Communication Pipelining for Big Data-style Applications
GPUs offer an order of magnitude higher compute power and memory bandwidth than CPUs. GPUs therefore might appear to be well suited to accelerate computations that operate on voluminous data sets in independent ways; e.g., for transformations, filtering, aggregation, partitioning or other “Big Data” style processing. Yet experience indicates that it is difficult, and often […]