high performance computing on graphics processing units: hgpu.org

A massively parallel algorithm for constructing the BWT of large string sets

Jacopo Pantaleoni

arXiv:1410.0562 [cs.DS], (2 Oct 2014)

We present a new scalable, lightweight algorithm to incrementally construct the BWT and FM-index of large string sets such as those produced by Next Generation Sequencing. The algorithm is designed for massive parallelism and can effectively exploit the combination of low-capacity, high-bandwidth memory and slower external system memory typical of GPU-accelerated systems. In particular, for a string set of n characters from an alphabet with sigma symbols, it uses a constant amount of high-bandwidth memory and at most 3n log(sigma) bits of system memory. Given that deep memory hierarchies are becoming a pervasive trait of high performance computing architectures, we believe this to be a relevant feature. The implementation can handle reads of arbitrary length and is up to 2 times faster than the state of the art for short genomic reads and up to 6.5 times faster for long genomic reads.
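
To make the object being constructed concrete, here is a minimal Python sketch of the BWT of a small string set, built naively by sorting every suffix of every string. It is not the paper's incremental, massively parallel algorithm, only a reference definition for tiny inputs. The function names, the convention that each string gets its own terminator (printed as `$`, ordered by string index, and smaller than every other symbol), and the reading of log as base 2 in the memory bound are assumptions made for illustration.

```python
import math


def bwt_of_string_set(strings):
    """Naive reference BWT of a string set (illustrative, not the paper's method).

    Assumption: string i is terminated by its own symbol $_i, with
    $_0 < $_1 < ... < every ordinary character. All terminators print as '$'.
    """
    suffixes = []
    for i, s in enumerate(strings):
        # Encode symbols as tuples so terminators sort before characters
        # and are ordered among themselves by string index.
        encoded = [(1, ord(c)) for c in s] + [(0, i)]
        for j in range(len(encoded)):
            # The BWT symbol is the one preceding the suffix; the symbol
            # preceding the whole string is (cyclically) its terminator.
            preceding = s[j - 1] if j > 0 else "$"
            suffixes.append((encoded[j:], preceding))
    # Distinct terminators make every suffix key unique.
    suffixes.sort(key=lambda item: item[0])
    return "".join(preceding for _, preceding in suffixes)


def system_memory_bound_bits(n, sigma):
    """Upper bound 3 n log(sigma) from the abstract, assuming log base 2."""
    return 3 * n * math.log2(sigma)


if __name__ == "__main__":
    print(bwt_of_string_set(["banana"]))
    print(bwt_of_string_set(["ac", "ab"]))
    # Hypothetical workload: one billion DNA bases over a 4-symbol alphabet.
    print(system_memory_bound_bits(10**9, 4) / 8 / 1e6, "MB")
```

Sorting all suffixes explicitly takes memory and time far beyond what is practical for sequencing data, which is the gap that lightweight constructions such as the one described above aim to close.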

Tags: Algorithms, Bioinformatics, Computer science, CUDA, nVidia, Tesla K40

October 4, 2014 by hgpu
