
Posts

Jan, 26

Using a GPU, Online Diarization – Offline Diarization

This article presents a low-latency, online speaker diarization system ("who is speaking now?") based on the repeated execution of a GPU-optimized, highly efficient offline diarization system ("who spoke when"). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In […]
Jan, 26

Parallel Symbolic Analysis of Large Analog Circuits on GPU Platforms

The graph-based symbolic technique is a viable tool for calculating the behavior or the characteristics of an analog circuit. Traditional symbolic analysis tools are typically used to calculate the behavior or the characteristics of a circuit in terms of symbolic parameters (Gielen et al., 1994). The introduction of the determinant decision diagram based symbolic analysis technique allows […]
Jan, 26

Visual Data Mining Using the Point Distribution Tensor

We explore a novel algorithm to analyze arbitrary distributions of 3D points. A direct tensor field visualization technique makes it easy to identify regions of linear, planar or isotropic structure. This approach is well suited to visual data mining and is exemplified on geoscience applications. It can distinguish, for example, power lines and flat terrains in […]
Jan, 25

Scalable Parallel Minimum Spanning Forest Computation

The proliferation of data in graph form calls for the development of scalable graph algorithms that exploit parallel processing environments. One such problem is the computation of a graph’s minimum spanning forest (MSF). Past research has proposed several parallel algorithms for this problem, yet none of them scales to large, high-density graphs. In this paper […]
Jan, 25

Parallel LDPC Decoder Implementation on GPU Based on Unbalanced Memory Coalescing

We consider a flexible decoder implementation of low-density parity-check (LDPC) codes via Compute Unified Device Architecture (CUDA) programming on a graphics processing unit (GPU), a research subject of considerable recent interest. An important issue in LDPC decoder design based on CUDA-GPU is realizing coalesced memory access, a technique that reduces memory transaction time considerably. In previous works along […]
Jan, 25

Multifrontal Sparse Matrix Factorization on Graphics Processing Units

For many finite element problems, when represented as sparse matrices, iterative solvers are found to be unreliable because they can impose computational bottlenecks. Early pioneering work by Duff et al. explored an alternative strategy called multifrontal sparse matrix factorization. This approach, by representing the sparse problem as a tree of dense systems, maps well to […]
Jan, 25

TAP: A TLP-Aware Cache Management Policy for a CPU-GPU Heterogeneous Architecture

Combining CPUs and GPUs on the same chip has become a popular architectural trend. However, these heterogeneous architectures put more pressure on shared resource management. In particular, managing the last-level cache (LLC) is critical to performance. Lately, many researchers have proposed several shared cache management mechanisms, including dynamic cache partitioning and promotion-based cache management, […]
Jan, 25

Parallel Algorithm Design and Implementation of Regular/Irregular Problems: An In-depth Performance Study on Graphics Processing Units

Recently, interest in the Graphics Processing Unit (GPU) for general-purpose parallel application development and research has grown. Much of the current research on the GPU focuses on the acceleration of regular problems, as irregular problems typically do not provide the same level of performance on the hardware. We explore the potential of the GPU […]
Jan, 25

PyCOOL – a Cosmological Object-Oriented Lattice code written in Python

There are a number of different phenomena in the early universe that have to be studied numerically with lattice simulations. This paper presents a graphics processing unit (GPU) accelerated Python program called PyCOOL that solves the evolution of scalar fields in a lattice with very precise symplectic integrators. The program has been written with the […]
Jan, 25

Realtime scheduling using GPUs – proof of feasibility

This paper reports our evaluation of OpenCL as a platform for hard realtime scheduling. Specifically, we have evaluated which types of tasks are faster on GPGPU than on CPU. We have investigated computational tasks, memory-intensive tasks (especially tasks using low-latency GDDR memory) and disk-intensive tasks. This study is the first […]
Jan, 25

GPU algorithms for comparison-based sorting and for merging based on multiway selection

Sorting and merging are two important kernels which are used as subroutines in numerous algorithms, whose performance depends on the efficiency of these primitives. Databases use sort and merge primitives extensively. Computational biology, search engines, realtime rendering and geographical information systems are other fields where sorting and merging large amounts of data is indispensable. Even […]
Jan, 25

Computational Fluid Dynamics using OpenCL – a Practical Introduction

The main aim of Computational Fluid Dynamics (CFD) simulations is to reconstruct the reality of fluid motion and behaviour as accurately as possible in order to better understand natural phenomena under specified conditions. Ideally, general solutions can also cover different scales and geometric configurations. Unfortunately, due to expensive algorithms, classic CFD codes most […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
