Posts

Mar, 30

Accelerating complex brain-model simulations on GPU platforms

The Inferior Olive (IO) in the brain, in conjunction with the cerebellum, is responsible for crucial sensorimotor-integration functions in humans. In this paper, we simulate a computationally challenging IO neuron model consisting of three compartments per neuron in a network arrangement on GPU platforms. Several GPU platforms of the two latest NVIDIA GPU architectures (Fermi, […]
Mar, 30

High Performance Computing for solving large sparse systems. Optical Diffraction Tomography as a case of study

This thesis, entitled "High Performance Computing for solving large sparse systems. Optical Diffraction Tomography as a case of study" investigates the computational issues related to the resolution of linear systems of equations which come from the discretization of physical models described by means of Partial Differential Equations (PDEs). These physical models are conceived for the […]
Mar, 30

Real-time multi-view deconvolution

In light-sheet microscopy, overall image content and resolution are improved by acquiring and fusing multiple views of the sample from different directions. State-of-the-art multi-view (MV) deconvolution employs the point spread functions (PSF) of the different views to simultaneously fuse and deconvolve the images in 3D, but processing takes a multiple of the acquisition time and […]
Mar, 28

Loo.py: From Fortran to performance via transformation and substitution rules

A large amount of numerically-oriented code is written and is being written in legacy languages. Much of this code could, in principle, make good use of data-parallel throughput-oriented computer architectures. Loo.py, a transformation-based programming system targeted at GPUs and general data-parallel architectures, provides a mechanism for user-controlled transformation of array programs. This transformation capability is […]
Mar, 28

Parallel Unsteady Flow Line Integral Convolution for High-Performance Dense Visualization

This paper presents an accurate parallel implementation of unsteady flow line integral convolution (UFLIC) for high-performance visualization of large time-varying flows. Our approach differs from previous implementations by using a novel value scattering+gathering mechanism to parallelize UFLIC and designing a pathline reuse strategy to reduce the computational cost of pathline integration. By exploiting the massive […]
Mar, 28

Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space

When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU) hardware. We followed this approach […]
Mar, 28

Shortest-Path Queries in Planar Graphs on GPU-Accelerated Architectures

We develop an efficient parallel algorithm for answering shortest-path queries in planar graphs and implement it on multi-node CPU/GPU clusters. The algorithm uses a divide-and-conquer approach for decomposing the input graph into small and roughly equal subgraphs and constructs a distributed data structure containing shortest distances within each of those subgraphs and between their […]
Mar, 28

PErasure: a Parallel Cauchy Reed-Solomon Coding Library for GPUs

In recent years, erasure coding has been adopted by large-scale cloud storage systems to replace data replication. With the increase of disk I/O throughput and network bandwidth, the speed of erasure coding becomes one of the key system bottlenecks. In this paper, we propose to offload the task of erasure coding to Graphics Processing Units […]
Mar, 25

Pseudorandom Numbers Generation for Monte Carlo Simulations on GPUs: OpenCL Approach

General principles of pseudorandom number generation for Monte Carlo simulations on GPUs are discussed by creating PRNGCL, an open-source OpenCL library of pseudorandom number generators. The library contains implementations of a number of the most popular uniform generators. The most popular pseudorandom number generators for Monte Carlo simulations and libraries for GPUs are reviewed. Some […]
Mar, 25

Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication

GPUs are widely used in high performance computing, due to their high computational power and high performance per Watt. Still, one of the main bottlenecks of GPU-accelerated cluster computing is the data transfer between distributed GPUs. This not only affects performance, but also power consumption. The most common way to utilize a GPU cluster is […]
Mar, 25

Analysis of illumination conditions at the lunar south pole using parallel computing techniques

In this Master Thesis an analysis of illumination conditions at the lunar south pole using parallel computing techniques is presented. Due to the small inclination (1.54°) of the lunar rotational axis with respect to the ecliptic plane and the topography of the lunar south pole, which allows long illumination periods, the study of illumination conditions […]
Mar, 25

The Feasibility of Using OpenCL Instead of OpenMP for Parallel CPU Programming

OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when developing and running compute-heavy code on a CPU. Both ease of programming and […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org