Posts

Jan, 21

Global finite element matrix construction based on a CPU-GPU implementation

The finite element method (FEM) solves a problem numerically in several computational steps, and most acceleration efforts have targeted the solution stage of the linear system of equations. However, finite element matrix construction, which is also time-consuming for unstructured meshes, has been less investigated. The generation of the global […]
Jan, 19

A Survey of Architectural Techniques For Improving Cache Power Efficiency

Modern processors use increasingly large on-chip caches. Also, with each CMOS technology generation, their leakage energy consumption has increased significantly. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers […]
Jan, 19

Harnessing Aspect Oriented Programming on GPU: Application to Warp-Level Parallelism (WLP)

Stochastic simulations involve multiple replications to build confidence intervals for their results, and Designs of Experiments (DOEs) to explore their parameter sets. In this paper, we propose Warp-Level Parallelism (WLP), a GPU-enabled solution to compute Multiple Replications In Parallel (MRIP) on GPUs (Graphics Processing Units). GPUs are intrinsically tuned to process efficiently the […]
Jan, 19

Accelerating Mahout on heterogeneous clusters using HadoopCL

MapReduce is a programming model capable of processing massive data in parallel across hundreds of computing nodes in a cluster. It hides many of the complicated details of parallel computing and provides a straightforward interface for programmers to adapt their algorithms to improve productivity. Many MapReduce-based applications have utilized the power of this model, including […]
Jan, 19

A fast marching method based back projection algorithm for photoacoustic tomography in heterogeneous media

This paper presents a numerical study on a fast marching method based back projection reconstruction algorithm for photoacoustic tomography in heterogeneous media. Transcranial imaging is used here as a case study. To correct for the phase aberration from the heterogeneity (i.e., skull), the fast marching method is adopted to compute the phase delay based on […]
Jan, 19

Hybrid Multicore Algorithms for Some Semi-Numerical Applications and Graphs

The computing industry has undergone several paradigm shifts in the last few decades. Fueled by the need for faster computing, larger data, and real-time processing, parallel computing has emerged as one of the dominant paradigms. Motivated by the success achieved in distributed computing models and the limitations faced by single-core processors, parallel […]
Jan, 19

Indexing of Spatiotemporal Trajectories for Efficient Distance Threshold Similarity Searches on the GPU

Applications in many domains search moving object trajectory databases. The distance threshold search finds all trajectories within a given distance of a query trajectory. We develop three GPU distance threshold search implementations that use indexing techniques significantly different from those used in CPU implementations. We determine experimentally under which conditions each approach performs well using […]
Jan, 16

CURFIL: Random Forests for Image Labeling on GPU

Random forests are popular classifiers for computer vision tasks such as image labeling or object detection. Learning random forests on large datasets, however, is computationally demanding. Slow learning impedes model selection and scientific research on image features. We present an open-source implementation that significantly accelerates both random forest learning and prediction for image labeling of […]
Jan, 16

SW#db: GPU-accelerated exact sequence similarity database search

The deluge of next-generation sequencing (NGS) data and expanding databases pose higher requirements for protein similarity search. State-of-the-art tools such as BLAST are not fast enough to cope with these requirements. It is therefore necessary to create new algorithms that are faster while keeping similar sensitivity levels. The majority of protein similarity […]
Jan, 16

Parallel Algorithms for Counting Problems on Graphs Using Graphics Processing Units

The availability of Graphics Processing Units (GPUs) with multicore architecture has enabled parallel computations using extensive multi-threading. Recent advancements in computer hardware have led to the usage of graphics processors for solving general-purpose problems. Using GPUs for computation is a highly efficient and low-cost alternative compared to currently available multicore Central Processing Units […]
Jan, 16

GPU Processing for UAS-Based LFM-CW Stripmap SAR

Unmanned air systems (UAS) provide an excellent platform for synthetic aperture radar (SAR), enabling surveillance and research over areas too difficult, dangerous, or costly to reach using manned aircraft. However, the nimble nature of small UAS makes them more susceptible to external forces, thus requiring significant motion compensation in order for SAR images to […]
Jan, 16

Parallel Implementation of the Finite Element Method on Graphics Processors for the Solution of Incompressible Flows

In recent years, the clock speeds and memory bandwidths of Graphics Processing Units (GPUs) have increased dramatically compared to those of CPUs. GPU vendors have also developed and freely released new programming tools that make scientific computing on GPUs easier. With these developments, the use of GPUs for general-purpose computing has become a popular research field. Researchers previously demonstrated […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
