Posts
Oct, 29
Morphological Proximity Priors: Spatial Relationships for Semantic Segmentation
The introduction of prior knowledge into image analysis algorithms is a central challenge in computer vision. In this paper, we introduce the concept of proximity priors into semantic segmentation methods in order to penalize the proximity of certain object classes. Proximity priors are a generalization of purely global and purely local co-occurrence priors which have […]
Oct, 29
A Comparison of Two Methods for Geometric Milling Simulation Accelerated by GPU
For detecting potential problems of a cutter path, cutting force simulation in the NC milling process is necessary prior to actual machining. A milling operation is geometrically equivalent to a Boolean subtraction of the swept volume of a cutter moving along a path from a solid model representing the stock shape. In order to precisely […]
Oct, 29
GPU-Mapping: Robotic Map Building with Graphical Multiprocessors
This paper provides a broad perspective on the potential applicability of the computing power of Graphical Processing Units (GPUs) in robotics, specifically in the well-known problem of 2D robotic mapping. There are three possible ways of exploiting these massively parallel devices: I) parallelizing existing algorithms, II) integrating existing parallelized general-purpose software, and III) making […]
Oct, 29
First Evaluation of the CPU, GPGPU and MIC Architectures for Real Time Particle Tracking based on Hough Transform at the LHC
Recent innovations focused around parallel processing, either through systems containing multiple processors or processors containing multiple cores, hold great promise for enhancing the performance of the trigger at the LHC and extending its physics program. The flexibility of the CMS/ATLAS trigger system allows for easy integration of computational accelerators, such as NVIDIA’s Tesla Graphics Processing […]
Oct, 29
Extension of the SkePU Skeleton Programming Framework for Multi-core CPU and Multi-GPU Systems for MPI-based Clusters
SkePU (Skeleton Programming Framework for Multi-core CPU and Multi-GPU Systems) is a parallel computing framework developed by Johan Enmyren and Christoph Kessler at Linköpings Universitet. This C++ template library provides a simple and unified interface for specifying data-parallel computations with the help of skeletons and targets multiple backends, e.g., for a sequential CPU, […]
Oct, 29
High-performance Dynamic Programming on FPGAs with OpenCL
Field-programmable gate arrays (FPGAs) provide reconfigurable computing fabrics that can be tailored to a wide range of time- and power-sensitive applications. Traditionally, programming FPGAs required expertise in complex hardware description languages (HDLs) or proprietary high-level synthesis (HLS) tools. Recently, Altera released the world's first OpenCL-conformant SDK for FPGAs. OpenCL is an […]
Oct, 29
Molecular Simulations using CUDA
Computer simulations play a vital role in understanding the phase behavior of colloidal dispersions; however, most simulation results suffer from finite-size effects. These finite-size effects can be eliminated by finite-size scaling or by simulating large system sizes. In this thesis we show how to simulate large system sizes efficiently on Graphical Processing Units (GPUs). Whereas […]
Oct, 29
A Locality-Aware Memory Hierarchy for Energy-Efficient GPU Architectures
As GPUs' compute capabilities grow, their memory hierarchy increasingly becomes a bottleneck. Current GPU memory hierarchies use coarse-grained memory accesses to exploit spatial locality, maximize peak bandwidth, simplify control, and reduce cache meta-data storage. These coarse-grained memory accesses, however, are a poor match for emerging GPU applications with irregular control flow and memory access patterns. […]
Oct, 29
Fast distributed phononic band-structure calculations through a GPU accelerated mixed-variational formulation
In this paper we present a Graphical Processing Unit accelerated mixed-variational formulation for fast phononic band-structure calculation of arbitrarily complex unit cells and report speed gains of a hundredfold over unoptimized serial CPU computations. To the author’s knowledge, this is the first application of GPU computing to a non-FE/FDTD band-structure algorithm. The formulation […]
Oct, 29
Computing finite models using free Boolean generators
A parallel method for computing Boolean expressions based on the properties of finite free Boolean algebras is presented. We also show how various finite combinatorial objects can be coded in the formalism of Boolean algebras and counted by this procedure. In particular, using a translation of first-order predicate formulas to propositional formulas, we give a […]
Oct, 27
Hybrid Parallel Light-Weight Programming of Hybrid Systems
The advent of homogeneous many-core processors has been widely noticed as a major shift in the architecture of commodity computer systems. It has influenced the design of operating systems and programming models and gives a boost to high-level parallelization libraries. Future commodity systems will combine homogeneous many-core processors with graphical processing units and other special […]
Oct, 27
Dynamical simulations of extrasolar planetary systems with debris disks using a GPU accelerated N-body code
This thesis begins with a description of a hybrid symplectic integrator named QYMSYM which is capable of planetary system simulations. This integrator has been programmed with the Compute Unified Device Architecture (CUDA) language which allows for implementation on Graphics Processing Units (GPUs). With the enhanced compute performance made available by this choice, QYMSYM was used […]