5301

Parallel Fast Gauss Transform

Rahul S. Sampath, Hari Sundar, Shravan K. Veerapaneni
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’10, 2010

@inproceedings{sampath2010parallel,

   title={Parallel Fast Gauss Transform},

   author={Sampath, R.S. and Sundar, H. and Veerapaneni, S.K.},

   booktitle={Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis},

   pages={1–10},

   year={2010},

   organization={IEEE Computer Society}

}

Download Download (PDF)   View View   Source Source   

1340

views

We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take $O(N^2)$ time. The parallel time complexity estimates for our algorithms are $O(N/np)$ for uniform point distributions and $O(N/np log N/np + nplognp)$ for nonuniform distributions using np CPUs. We incorporate a planewave representation of the Gaussian kernel which permits "diagonal translation". We use parallel octrees and a new scheme for translating the plane-waves to efficiently handle nonuniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer at the Oak Ridge National Laboratory. Our implementation is kernel-independent and can handle other "Gaussian-type" kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: