14003

Using Butterfly-Patterned Partial Sums to Optimize GPU Memory Accesses for Drawing from Discrete Distributions

Guy L. Steele Jr. (Oracle Labs), Jean-Baptiste Tristan
Oracle Labs
arXiv:1505.03851 [cs.DC], (14 May 2015)

@article{.2015using,

   title={Using Butterfly-Patterned Partial Sums to Optimize GPU Memory Accesses for Drawing from Discrete Distributions},

   author={., Guy L. Steele Jr},

   year={2015},

   month={may},

   archivePrefix={"arXiv"},

   primaryClass={cs.DC}

}

Download Download (PDF)   View View   Source Source   

1868

views

We describe a technique for drawing values from discrete distributions, such as sampling from the random variables of a mixture model, that avoids computing a complete table of partial sums of the relative probabilities. A table of alternate ("butterfly-patterned") form is faster to compute, making better use of coalesced memory accesses. From this table, complete partial sums are computed on the fly during a binary search. Measurements using an NVIDIA Titan Black GPU show that for a sufficiently large number of clusters or topics (K > 200), this technique alone more than doubles the speed of a latent Dirichlet allocation (LDA) application already highly tuned for GPU execution.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: