Posts
Jul, 21
Detecting parametric objects in large scenes by Monte Carlo sampling
Point processes constitute a natural extension of Markov Random Fields (MRF), designed to handle parametric objects. They have shown efficiency and competitiveness for tackling object extraction problems in vision. Simulating these stochastic models is, however, a difficult task. The performance of existing samplers is limited in terms of computation time and convergence stability, especially […]
Jul, 21
The Astrophysical Multipurpose Software Environment
We present the open source Astrophysical Multi-purpose Software Environment (AMUSE, www.amusecode.org), a component library for performing astrophysical simulations involving different physical domains and scales. It couples existing codes within a Python framework based on a communication layer using MPI. The interfaces are standardized for each domain and their implementation based on MPI guarantees that the […]
Jul, 21
Parallel and Concurrent Programming in Haskell: Techniques for Multicore and Multithreaded Programming
This book covers the breadth of Haskell’s diverse selection of programming APIs for concurrent and parallel programming. It is split into two parts. The first part, on parallel programming, covers the techniques for using multiple processors to speed up CPU-intensive computations, including methods for using parallelism in both idiomatic Haskell and numerical array-based algorithms, and […]
Jul, 21
Image reconstruction in digital holographic microscopy on GPU
The aim of the thesis is to implement and optimize selected image processing algorithms used in digital holographic microscopy on the GPU. The algorithms are 2-D phase unwrapping and polynomial surface fitting. Both are described, and the optimizations applied are pointed out. The results chapter shows the performance and precision of the GPU implementation compared […]
Jul, 20
Bone Structure Analysis with GPGPUs
Osteoporosis is a disease that affects a growing number of people by increasing the fragility of their bones. To improve the understanding of the bone, large-scale computer simulations are used. A fast, scalable and memory-efficient solver for such problems is ParOSol. It uses the preconditioned conjugate gradient algorithm with a multigrid preconditioner. A […]
Jul, 20
Lattice QCD on Intel Xeon Phi
The Intel Xeon Phi architecture from Intel Corporation features parallelism at the level of many x86-based cores, multiple threads per core, and vector processing units. Lattice Quantum Chromodynamics (LQCD) is currently the only known model-independent, non-perturbative computational method for calculations in the theory of the strong interactions, and is of importance in studies of […]
Jul, 20
GPU Computing in Economics
This paper discusses issues related to GPU computing for economic problems. It highlights new methodologies and resources that are available for solving and estimating economic models, and emphasizes the situations in which they are useful and those in which they are impractical. Two examples illustrate the different ways these GPU parallel methods can be employed to speed up computation.
Jul, 20
Real Time Pixel Art Remasterization on GPUs
Several methods have been proposed over the years to overcome the pixel art scaling problem. In this article we describe a novel approach, designed for a massively parallel architecture, that can address this issue in real time. To achieve this we design a local and context-independent algorithm that enables an efficient parallel […]
Jul, 20
An Efficient Deterministic Parallel Algorithm for Adaptive Multidimensional Numerical Integration on GPUs
Recent developments in Graphics Processing Units (GPUs) have opened new possibilities for highly efficient parallel computing in science and engineering. Their massively parallel architecture makes GPUs very effective for algorithms where processing of large blocks of data can be executed in parallel. Multidimensional integration has important applications in areas like computational physics, plasma physics, […]
Jul, 19
OpenCL API Extensions to achieve Multi-level Parallelism for Efficient Implementation of Strassen’s Matrix Multiplication on GPUs
Strassen’s matrix multiplication algorithm is an efficient and widely used practical algorithm for matrix multiplication. In its basic form, the algorithm consists of a series of recursive steps to decompose the matrices, multiplication of the intermediate matrices, and another set of recursive steps to recompose the product matrix. Implementing the algorithm on a GPU requires it to be […]
Jul, 19
HAccRG: Hardware-Accelerated Data Race Detection in GPUs
Modern Graphics Processing Units (GPUs) are capable of supporting thousands of concurrent threads. However, they provide relatively few guarantees with respect to the coherence and consistency of the memory system. Thus, GPUs are prone to a multitude of concurrency bugs related to inconsistent memory states. Many such bugs manifest as some form of data races at […]
Jul, 19
Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures
Heterogeneous multicore architectures with CPUs and add-on GPUs or streaming processors are now widely used in computer systems. These GPUs provide substantially more computation capability and memory bandwidth than traditional multi-cores. Also, because they are highly programmable, they provide the computational performance needed for realistic graphics rendering. Applications with general computations can also be […]

