Posts

Jun, 17

GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units

Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. […]
Jun, 17

Branch and Data Herding: Reducing Control and Memory Divergence for Error-tolerant GPU Applications

Control and memory divergence between threads within the same execution bundle, or warp, have been shown to cause significant performance bottlenecks for GPU applications. In this paper, we exploit the observation that many GPU applications exhibit error tolerance to propose branch and data herding. Branch herding eliminates control divergence by forcing all threads in a […]
Jun, 16

ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU

In this paper, we analyze the special requirements of a dynamic memory allocator that is designed for massively parallel architectures such as Graphics Processing Units (GPUs). We show that traditional strategies, which work well on CPUs, are not well suited for use on GPUs and present the thorough design of ScatterAlloc, which can efficiently […]
Jun, 16

E-MOGA: A General Purpose Platform for Multi Objective Genetic Algorithm running on CUDA

This paper introduces an Enhanced Multi Objective Genetic Algorithm (E-MOGA) running on Compute Unified Device Architecture (CUDA) hardware, as a general purpose tool that can solve conflict optimization problems. The tool demonstrates significant speed gains using affordable, scalable and commercially available hardware. The objectives of this research are: to enhance the general purpose Multi Objective […]
Jun, 16

Accelerating Lambert’s Problem on the GPU in MATLAB

The challenges and benefits of using the GPU to compute solutions to Lambert’s Problem are discussed. Three algorithms (Universal Variables, Gooding’s algorithm, and Izzo’s algorithm) were adapted for GPU computation directly within MATLAB. The robustness of each algorithm was considered, along with the speed at which it could be computed on each of three computers. […]
Jun, 16

Parallel Primitives based Spatial Join of Geospatial Data on GPGPUs

Modern GPU architectures closely resemble supercomputers. Commodity GPUs that are already installed in personal and cluster computers can be used to boost the performance of spatial databases and GIS. In this study, we report our preliminary work on designing and implementing a spatial join algorithm on GPUs by using generic parallel primitives that have […]
Jun, 16

GiST Scan Acceleration using Coprocessors

Efficient lookups in huge, possibly multi-dimensional datasets are crucial for the performance of numerous use cases that generate multiple search operations at the same time, like point queries in ray tracing or spatial joins in collision detection of interactive 3D applications. These applications greatly benefit from index structures that quickly filter relevant candidates for further […]
Jun, 15

Energy Efficiency Analysis of GPUs

In the last few years, Graphics Processing Units (GPUs) have become a great tool for massively parallel computing. GPUs are specifically designed for throughput and face several design challenges, especially what are known as the Power and Memory Walls. In these devices, available resources should be used to enhance performance and throughput, as the performance […]
Jun, 14

SAGA: SystemC Acceleration on GPU Architectures

SystemC is a widespread language for HW/SW system simulation and design exploration, and thus a key development platform in embedded system design. However, the growing complexity of SoC designs is having an impact on simulation performance, leading to limited SoC exploration potential, which in turn affects development and verification schedules and time-to-market for new designs. […]
Jun, 14

Performance Gains in Conjugate Gradient Computation with Linearly Connected GPU Multiprocessors

Conjugate gradient is an important iterative method used for solving least squares problems. It is compute-bound and generally involves only simple matrix computations. One would expect that we could fully parallelize such computation on the GPU architecture with multiple Streaming Multiprocessors (SMs), each consisting of many SIMD processing units. While implementing a conjugate gradient method […]
Jun, 14

Exploiting Unexploited Computing Resources for Computational Logics

We present an investigation of the use of GPGPU techniques to parallelize the execution of a satisfiability solver, based on the traditional DPLL procedure – which, in spite of its simplicity, still represents the core of the most competitive solvers. The investigation tackles some interesting problems, including the use of a predominantly data-parallel architecture, like […]
Jun, 14

Parakeet: A Just-In-Time Parallel Accelerator for Python

High level productivity languages such as Python or Matlab enable the use of computational resources by nonexpert programmers. However, these languages often sacrifice program speed for ease of use. This paper proposes Parakeet, a library which provides a just-in-time (JIT) parallel accelerator for Python. Parakeet bridges the gap between the usability of Python and the […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
