Posts
Jan, 31
Graphical processing unit implementation of an integrated shape-based active contour: Application to digital pathology
Commodity graphics hardware has become a cost-effective parallel platform to solve many general computational problems. In medical imaging, and more so in digital pathology, segmentation of multiple structures on high-resolution images is often a complex and computationally expensive task. Shape-based level set segmentation has recently emerged as a natural solution to segmenting overlapping and occluded […]
Jan, 31
An OpenCL implementation for the solution of TDSE on GPU and CPU architectures
Open Computing Language (OpenCL) is a parallel processing language that is ideally suited for running parallel algorithms on Graphical Processing Units (GPUs). In the present work we report the development of a generic parallel single-GPU code for the numerical solution of a system of first-order ordinary differential equations (ODEs) based on the OpenCL model. We […]
Jan, 30
Algorithmic Contributions to the Theory of Regular Chains
Regular chains, introduced about twenty years ago, have emerged as one of the major tools for solving polynomial systems symbolically. In this thesis, we focus on different algorithmic aspects of the theory of regular chains, from theoretical questions to high-performance implementation issues. The inclusion test for saturated ideals is a fundamental problem in this theory. […]
Jan, 30
Fast CT Image Processing using Parallelized Non-local Means
Reducing the radiation dose delivered to patients has been an important concern since the introduction of X-ray computed tomography (CT). However, low-dose CT images tend to be severely degraded by noise. This paper proposes using parallelized non-local means (PNM) under a computation framework for improving low-dose X-ray CT images. For the proposed PNM method, the […]
Jan, 30
Numerical Ocean Modeling and Simulation with CUDA
ROMS is software that models and simulates an ocean region using a finite difference grid and time stepping. ROMS simulations can take from hours to days to complete due to the compute-intensive nature of the software. As a result, the size and resolution of simulations are constrained by the performance limitations of modern computing hardware. […]
Jan, 30
On CUDA implementation of a multichannel room impulse response reshaping algorithm based on p-norm optimization
By using room impulse response shortening and shaping, it is possible to reduce reverberation effects and therefore improve speech intelligibility. This may be achieved by a prefilter that modifies the overall impulse response to have a stronger attenuation. To achieve spatial robustness, multichannel approaches have been proposed. Unfortunately, these approaches suffer from a […]
Jan, 30
How well do STARLAB and NBODY compare? II: Hardware and accuracy
Most recent progress in understanding the dynamical evolution of star clusters relies on direct N-body simulations. Owing to the computational demands, and the desire to model more complex and more massive star clusters, hardware calculational accelerators, such as GRAPE special-purpose hardware or, more recently, GPUs (i.e. graphics cards), are generally utilised. In addition, simulations can […]
Jan, 30
Efficient Password and Key recovery using Graphic Cards
Passwords are without doubt the most common means for authentication throughout all kinds of applications on computer systems, ranging from local or online-service user logins to the protection of sensitive data by password based encryption. However, wherever passwords are employed, these are prone to loss or disremembering, an effect which, especially driven by the advent […]
Jan, 30
CUDA Expression Templates
Many algorithms require vector algebra operations such as the dot product, vector norms or component-wise manipulations. Especially for large-scale vectors, the efficiency of algorithms depends on an efficient implementation of those calculations. The calculation of vector operations benefits from the continually increasing chip level parallelism on graphics hardware. Very efficient basic linear algebra libraries like […]
Jan, 30
Performance Evaluation of Query Processing Algorithms on GPGPUs
Modern Graphical Processing Units (GPUs) can perform general-purpose computing in addition to standard graphical processing. Open frameworks, such as the OpenCL standard by the Khronos Group, enable developers to easily harness the computational power of GPUs. While these are, in certain respects, more powerful than standard CPUs, the latter are still a more suitable solution […]
Jan, 30
Towards a Tunable Multi-Backend Skeleton Programming Framework for Multi-GPU Systems
SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. […]
Jan, 30
Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach
Among the various factors that drive the momentous changes occurring in the design of microprocessors and high-end systems [1], three stand out as especially notable: 1. the number of transistors per chip will continue the current trend, i.e. double roughly every 18 months, while the speed of processor clocks will cease to increase; 2. […]

