Posts

Sep, 12

Dynamical heterogeneities as fingerprints of a backbone structure in Potts models

We investigate slow non-equilibrium dynamical processes in the two-dimensional q-state Potts model with both ferromagnetic and $\pm J$ couplings. Dynamical properties are characterized by means of the mean-flipping time distribution. This quantity is known for clearly unveiling dynamical heterogeneities. Using a two-times protocol we characterize the different time scales observed and relate them to growth processes […]
Sep, 11

A First Step Towards GPU-assisted Query Optimization

Modern graphics cards bundle high-bandwidth memory with a massively parallel processor, making them an interesting platform for running data-intensive operations. Consequently, several authors have discussed accelerating database operators using graphics cards, often demonstrating promising speed-ups. However, due to limitations stemming from limited device memory and expensive data transfer, GPU-accelerated databases remain a niche technology. We […]
Sep, 11

Assembly-Free Large-Scale Modal Analysis on the GPU

Popular eigen-solvers such as block-Lanczos require repeated inversion of an eigen-matrix. This is a bottleneck in large-scale modal problems with millions of degrees of freedom. On the other hand, the classic Rayleigh-Ritz conjugate gradient method only requires a matrix-vector multiplication, and is therefore potentially scalable to such problems. However, as is well-known, the Rayleigh-Ritz has […]
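
As a rough illustration of why a method that "only requires a matrix-vector multiplication" can avoid assembling the matrix, the sketch below evaluates a Rayleigh quotient through a matrix-free operator. It is not the paper's method: the 1D stencil operator and the names `apply_stencil` and `rayleigh_quotient` are assumptions chosen for the example.

```python
def apply_stencil(x):
    """Hypothetical matrix-free operator: y = A x for the tridiagonal
    matrix with 2 on the diagonal and -1 on the off-diagonals.
    The matrix A itself is never stored."""
    n = len(x)
    y = []
    for i in range(n):
        value = 2.0 * x[i]
        if i > 0:
            value -= x[i - 1]
        if i < n - 1:
            value -= x[i + 1]
        y.append(value)
    return y


def rayleigh_quotient(matvec, x):
    """Return (x . A x) / (x . x) using only a matrix-vector product."""
    ax = matvec(x)
    numerator = sum(a * b for a, b in zip(x, ax))
    denominator = sum(a * a for a in x)
    return numerator / denominator


x = [1.0, 1.0, 1.0]
rq = rayleigh_quotient(apply_stencil, x)
```

An iterative eigen-solver built this way only ever calls `matvec`, so memory use grows with the vector length rather than with the number of matrix entries.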
Sep, 11

Design and Modeling of a Non-blocking Checkpointing System

As the capability and component count of systems increase, the MTBF decreases. Typically, applications tolerate failures with checkpoint/restart to a parallel file system (PFS). While simple, this approach can suffer from contention for PFS resources. Multi-level checkpointing is a promising solution. However, while multi-level checkpointing is successful on today's machines, it is not expected to […]
Sep, 11

GPU accelerated QTL detection

Mapping quantitative trait loci (QTL) using genetic marker information is a time consuming analysis that has interested the mapping community for the past decades. The increasing amount of genetic marker data allows one to consider ever more precise QTL analyses, while increasing the demand for computation. Part of the difficulty of detecting QTLs resides in […]
Sep, 11

High-Performance Location-Aware Publish-Subscribe on GPUs

Adding location-awareness to publish-subscribe middleware infrastructures would open up new opportunities to use this technology in the hot area of mobile applications. On the other hand, this requires radically changing the way published events are matched against received subscriptions. In this paper we examine this issue in detail and we present CLCB, a new algorithm […]
Sep, 10

CuNeuQuant: A CUDA Implementation of the NeuQuant Image Quantization Algorithm

Color quantization is an often performed prestep in many image processing and computer vision applications. Quantization is defined as the process of selecting a palette of representative colors P which can replace the original colors C in an image such that |P| << |C| and the perceptual distortion of the reduced color image is minimized. […]
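
To make the definition above concrete, here is a minimal sketch that maps each original color to its nearest palette entry and measures the resulting distortion. It is not NeuQuant, which trains the palette with a neural network: the palette here is fixed by hand, squared RGB distance stands in for perceptual distortion, and all names and sample values are assumptions.

```python
def squared_distance(c1, c2):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def quantize(pixels, palette):
    """Replace every pixel with the closest color from the palette P."""
    return [min(palette, key=lambda p: squared_distance(c, p)) for c in pixels]


def mean_squared_distortion(pixels, quantized):
    """Average squared RGB error introduced by the quantization."""
    total = sum(squared_distance(c, q) for c, q in zip(pixels, quantized))
    return total / len(pixels)


# Hypothetical image with four distinct colors C and a two-color palette P.
pixels = [(250, 10, 10), (240, 20, 5), (10, 10, 245), (0, 30, 255)]
palette = [(255, 0, 0), (0, 0, 255)]

quantized = quantize(pixels, palette)
distortion = mean_squared_distortion(pixels, quantized)
```

A real quantizer would also choose the palette itself so that this distortion is as small as possible for a given palette size.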
Sep, 10

GPU-Accelerated Monte Carlo Simulations of Dense Stellar Systems

Computing the interactions between the stars within dense stellar clusters is a problem of fundamental importance in theoretical astrophysics. However, simulating realistically sized clusters of about $10^6$ stars is computationally intensive and often takes a long time to complete. This paper presents the acceleration of a Monte Carlo algorithm for simulating stellar cluster evolution using […]
Sep, 10

A Parallel Twig Join Algorithm for XML Processing using a GPGPU

With an increasing amount of data and demand for fast query processing, the efficiency of database operations continues to be a challenging task. A common approach is to leverage parallel hardware platforms. With the introduction of general-purpose GPU (Graphics Processing Unit) computing, massively parallel hardware has become available within commodity hardware. XML is based on […]
Sep, 10

Accelerating Boosting-based Face Detection on GPUs

The goal of face detection is to determine the presence of faces in arbitrary images, along with their locations and dimensions. As with any graphics workload, these algorithms benefit from data-level parallelism. Existing parallelization efforts strictly focus on mapping different divide-and-conquer strategies onto multicore CPUs and GPUs. However, even the most […]
Sep, 10

Performance Improvement of TOUGH2 Simulation with Graphics Processing Unit

We tried to accelerate the computational speed of TOUGH2 simulation by introducing a linear computation routine using a Graphics Processing Unit (GPU). Libraries for GPU computation were introduced, and new solvers for linear equations were developed. Out of those, CLLUSTB, an ILU preconditioned BiCGSTAB solver made with the CULA Sparse, demonstrated good performance both in […]
Sep, 8

Performance Evaluation of Concurrent Lock-free Data Structures on GPUs

Graphics processing units (GPUs) have emerged as a strong candidate for high-performance computing. While regular data-parallel computations with little or no synchronization are easy to map on the GPU architectures, it is a challenge to scale up computations on dynamically changing pointer-linked data structures. The traditional lock-based implementations are known to offer poor scalability due […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org