Binary Interval Search (BITS): A Scalable Algorithm for Counting Interval Intersections
Department of Computer Science, University of Virginia, Charlottesville, VA
arXiv:1208.3407v1 [q-bio.GN] (16 Aug 2012)
@article{2012arXiv1208.3407L,
author={{Layer}, R.~M. and {Skadron}, K. and {Robins}, G. and {Hall}, I.~M. and {Quinlan}, A.~R.},
title={{Binary Interval Search (BITS): A Scalable Algorithm for Counting Interval Intersections}},
journal={ArXiv e-prints},
archivePrefix={arXiv},
eprint={1208.3407},
primaryClass={q-bio.GN},
keywords={Quantitative Biology – Genomics},
year={2012},
month=aug,
adsurl={http://adsabs.harvard.edu/abs/2012arXiv1208.3407L},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
MOTIVATION: The comparison of diverse genomic datasets is fundamental to understanding genome biology. Researchers must explore many large datasets of genome intervals (e.g., genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect: that is, intervals that overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery.
RESULTS: We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures such as Graphics Processing Units (GPUs), and we illustrate its utility for efficient Monte Carlo simulations that measure the significance of relationships between sets of genomic intervals.
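The abstract does not spell out how the counting works, so the following is only a minimal sketch of one way to count interval intersections with binary searches alone. It assumes closed intervals given as (start, end) pairs with start <= end. The function name count_intersections and the sample data are hypothetical and are not taken from the paper. The idea is that a database interval misses a query only if it ends before the query starts or starts after the query ends. Both cases can be counted by binary search over independently sorted start and end coordinates.

```python
from bisect import bisect_left, bisect_right


def count_intersections(queries, database):
    """Count (query, database) pairs of closed intervals that overlap.

    Hypothetical helper for illustration; intervals are (start, end)
    tuples with start <= end.
    """
    # Sort start and end coordinates independently of each other.
    starts = sorted(s for s, _ in database)
    ends = sorted(e for _, e in database)
    n = len(database)

    total = 0
    for q_start, q_end in queries:
        # Database intervals that end strictly before the query starts.
        ends_before = bisect_left(ends, q_start)
        # Database intervals that start strictly after the query ends.
        starts_after = n - bisect_right(starts, q_end)
        # Every remaining interval must overlap the query.
        total += n - ends_before - starts_after
    return total


# Made-up sample data (assumption, not from the paper).
database = [(1, 5), (10, 15), (20, 25)]
queries = [(4, 11), (16, 19), (25, 30)]
print(count_intersections(queries, database))  # prints 3
```

Each query is handled independently of the others and needs only two binary searches, which gives a plausible intuition for why an approach of this kind could map well onto parallel hardware such as GPUs. Whether this matches the paper's implementation in detail is an assumption.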
August 17, 2012 by hgpu