Posts

Apr, 27

Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

Based on the premise that preconditioners needed for scientific computing are required not only to be robust in the numerical sense, but also to scale to thousands of light-weight cores, we argue that this two-fold goal is achieved by the recently developed self-adaptive multi-elimination preconditioner. For this purpose, we revise the underlying idea and […]
Apr, 27

Scattering Parameters and Surface Normals from Homogeneous Translucent Materials using Photometric Stereo

This paper proposes a novel photometric stereo solution to jointly estimate surface normals and scattering parameters from a globally planar, homogeneous, translucent object. Similar to classic photometric stereo, our method only requires as few as three observations of the translucent object under directional lighting. Naively applying classic photometric stereo results in blurred photometric normals. We […]
Apr, 27

Cellular GPU Models to Euclidean Optimization Problems

This PhD thesis studies and proposes parallel cellular computation models that address different types of NP-hard optimization problems defined in Euclidean space, along with their implementation on the Graphics Processing Unit (GPU) platform. The goal is both to handle large problems and to provide substantial acceleration factors by massive […]
Apr, 25

On the Parallelization of Integer Polynomial Multiplication

With the advent of hardware accelerator technologies, multi-core processors and GPUs, much effort has gone into designing parallel algorithms that take advantage of those architectures. To achieve this goal, one needs to consider both algebraic complexity and parallelism, while also making efficient use of memory traffic and cache and reducing overheads in the implementations. Polynomial multiplication […]
Apr, 25

A new way in few-body scattering calculations: discretized Faddeev equations solved on GPU

A new approach towards very fast and economical few-body scattering calculations is described. The general method is realized in three steps: (i) reformulation of the scattering equations using the convenient analytical form for the channel resolvent operator; (ii) the complete few-body continuum discretization and projection of all operators and wave functions onto the $L_2$ type […]
Apr, 25

Integrating multi-threading and accelerators into DUNE-ISTL

A major challenge in PDE software is the balance between user-level flexibility and performance on heterogeneous hardware. We discuss our ideas on how this challenge can be tackled, using the DUNE framework, and in particular its linear algebra and solver components, as an example. We demonstrate how the former MPI-only implementation is modified to support MPI+[CPU/GPU] threading […]
Apr, 25

One weird trick for parallelizing convolutional neural networks

I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.
Apr, 25

Neural Decoding using a Parallel Sequential Monte Carlo method on Point Processes with Ensemble Effect

Sequential Monte Carlo estimation on point processes has been successfully applied to predict movement from neural activity. However, this method has some issues, such as an oversimplified tuning model and high computational complexity. In this paper, we attempt to address these issues and improve its decoding performance. Firstly, a […]
Apr, 24

Automating a Labour Performance Measurement and Risk Assessment: An Evaluation of Methods for a Computer Vision based System

This thesis brings together productivity and risk assessments through innovative design, development and evaluation of a unique system for retrieving and analysing data. In the past, although the link between them is well documented, these assessments have largely been treated as separate, antagonistic entities. A broad evaluation of the existing traditional and technological support systems […]
Apr, 24

Parallel Computational Intelligence-Based Multi-Camera Surveillance System

In this work, we present a multi-camera surveillance system based on the use of self-organizing neural networks to represent events on video. The system processes several tasks in parallel using GPUs (graphics processing units). It addresses multiple vision tasks at various levels, such as segmentation, representation or characterization, and analysis and monitoring of movement. These […]
Apr, 24

Discrete Planning Unit Look-ahead Velocity Control Strategy and Parallelization Research based on GPU

High velocity and high accuracy are the development directions of numerical control technology. During the machining of complicated curves and surfaces, CAD/CAM software generates massive numbers of micro-segments. These micro-segments are then fed into the numerical control system (CNC) for velocity planning and high-velocity interpolation. This whole procedure is the core algorithm of CNC. […]
Apr, 24

A GPU Framework for Sparse Matrix Vector Multiplication

The evolution of hardware and software for general-purpose computation on Graphics Processing Units (GPUs) has changed the way parallel programming issues are addressed. Many applications are being ported to GPUs to achieve performance gains. The GPU execution time is continuously optimized by the GPU programmers while optimizing pre-GPU computation overheads attracted the research […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org