Posts

Sep, 23

Adaptive Treelet Meshes for Efficient Streak-Surface Visualization on the GPU

We describe a novel adaptive mesh representation for streak-surfaces. The surface is represented as a mesh of small trees of initial depth zero (treelets). This mesh representation allows for efficient integration, refinement, coarsening and appending of surface patches utilizing the computational capacities of modern GPUs. Integration, refinement, and rendering are strictly separated into effectively parallelizable […]
Sep, 23

Task Performance with List-Mode Data

This dissertation investigates the application of list-mode data to detection, estimation, and image reconstruction problems, with an emphasis on emission tomography in medical imaging. We begin by introducing a theoretical framework for list-mode data and we use it to define two observers that operate on list-mode data. These observers are applied to the problem of […]
Sep, 23

Computer Vision Application in Graphic Processors

Largely driven by the gaming industry, research and development of hardware for image generation, such as graphics cards (or GPUs, Graphics Processing Units), has experienced tremendous growth in recent years. The increased power and flexibility and the low price of these GPUs have led to their unexpected use in areas other than graphics. […]
Sep, 23

A Quantitative Study of Irregular Programs on GPUs

GPUs have been used to accelerate many regular applications and, more recently, irregular applications in which the control flow and memory access patterns are data-dependent and statically unpredictable. This paper defines two measures of irregularity called control-flow irregularity and memory-access irregularity, and investigates, using performance-counter measurements, how irregular GPU kernels differ from regular kernels with […]
Sep, 22

Computing of high breakdown regression estimators without sorting on graphics processing units

We present an approach to computing high-breakdown regression estimators in parallel on graphics processing units (GPU). We show that sorting the residuals is not necessary, and it can be substituted by calculating the median. We present and compare various methods to calculate the median and order statistics on GPUs. We introduce an alternative method based […]
Sep, 22

SpMV: A Memory-Bound Application on the GPU Stuck Between a Rock and a Hard Place

In this paper, we investigate the relative merits of GPGPUs and multicores in the context of sparse matrix-vector multiplication (SpMV). While GPGPUs possess impressive capabilities in terms of raw compute throughput and memory bandwidth, their performance varies significantly with application tuning as well as sparse input and format characteristics. Furthermore, several emerging technological and workload […]
Sep, 22

Overlapping computation and communication of three-dimensional FDTD on a GPU cluster

Large-scale electromagnetic field simulations using the FDTD (finite-difference time-domain) method require the use of GPU (graphics processing unit) clusters. However, the communication overhead caused by slow interconnections becomes a major performance bottleneck. In this paper, as a way to remove the bottleneck, we propose the "kernel-split method" and the "host-buffer method" which overlap computation and […]
Sep, 22

Exploration of Parallelization Frameworks for Computational Finance

This paper presents a comparison of parallelization frameworks for efficient execution of computational finance workloads. We use a Value-at-Risk (VaR) workload to evaluate OpenCL and OpenMP parallelization frameworks on multi-core CPUs as opposed to GPUs. In addition, we study the impact of SMT on performance using GCC (4.4) and IBM XLC (11.01) compilers for both […]
Sep, 22

Modification of self-organizing migration algorithm for OpenCL framework

This paper deals with a modification of the self-organizing migration algorithm using the OpenCL framework. This modification allows the algorithm to exploit modern parallel devices, such as central processing units and graphics processing units. The main aim was to create an algorithm that shows significant speedup compared to the sequential variant. The second aim was to create the algorithm robust […]
Sep, 21

Large-Scale Motion Modelling using a Graphical Processing Unit

The increased availability of Graphical Processing Units (GPUs) in personal computers has made parallel programming worthwhile and more accessible, but not necessarily easier. This thesis will take advantage of the power of a GPU, in conjunction with the Central Processing Unit (CPU), in order to simulate target trajectories for large-scale scenarios, such as wide-area maritime […]
Sep, 21

Some examples of instant computations of fluid dynamics on GPU

This paper summarizes our experience with GPU and GPGPU computing for two-dimensional computational fluid dynamics using fine grids and for three-dimensional kinetic transport problems. The choice of the computational approach is clearly critical for both performance speedup and efficiency. In our numerical experiments, we used a Lattice Boltzmann approach (LBM) for the […]
Sep, 21

Parallelization of Hierarchical Text Clustering on Multi-core CUDA Architecture

Text clustering is the problem of dividing text documents into groups such that documents in the same group are similar to one another and different from documents in other groups. Because texts tend to form hierarchies, text clustering is best performed using a hierarchical clustering method. An important aspect while clustering large […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
