Random Address Permute-Shift Technique for the Shared Memory on GPUs

Koji Nakano, Susumu Matsumae, Yasuaki Ito
Department of Information Engineering, Hiroshima University
International Conference on Parallel Processing Workshops, pp. 429-438, 2014

@article{nakano2014random,
   title={Random Address Permute-Shift Technique for the Shared Memory on GPUs},
   author={Nakano, Koji and Matsumae, Susumu and Ito, Yasuaki},
   year={2014}
}

The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of memory access to the shared memory of a streaming multiprocessor on CUDA-enabled GPUs. The DMM has w memory banks that constitute a shared memory, and the w threads in a warp try to access them at the same time. However, memory access requests destined for the same memory bank are processed sequentially. Hence, to develop efficient algorithms, it is very important to reduce the memory access congestion, that is, the maximum number of memory access requests destined for the same bank. The main contribution of this paper is a novel algorithmic technique called the random address permute-shift (RAP) technique, which reduces the memory access congestion. We show that the RAP reduces the memory access congestion to O(log w/log log w) for any memory access requests by a warp of w threads, including malicious ones. We can also guarantee that the congestion is 1 both for contiguous access and for stride access. The simulation results for w=32 show that the expected congestion for any memory access is only 3.53. Since malicious memory access requests that are all destined for the same bank have congestion 32, our RAP technique substantially reduces the memory access congestion. We have also applied the RAP technique to matrix transpose algorithms. The experimental results on a GeForce GTX TITAN show that the RAP technique is practical and can accelerate a direct matrix transpose algorithm by a factor of 10.
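
The congestion measure described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a w x w array stored row-major, so that element (i, j) lies in bank j, and it assumes that the RAP mapping rotates row i by sigma(i), where sigma is a random permutation of 0..w-1. The abstract does not spell out the mapping, and the names `congestion`, `raw_bank`, and `make_rap_bank` are hypothetical.

```python
import random
from collections import Counter

W = 32  # number of memory banks, equal to the number of threads in a warp


def congestion(banks):
    """Maximum number of requests that land in the same bank."""
    return max(Counter(banks).values())


def raw_bank(i, j, w=W):
    """Bank of element (i, j) in a row-major w x w array (address i*w + j)."""
    return (i * w + j) % w


def make_rap_bank(w=W, seed=0):
    """Assumed RAP mapping: row i is rotated by sigma[i], sigma a random permutation."""
    sigma = list(range(w))
    random.Random(seed).shuffle(sigma)

    def rap_bank(i, j):
        return (j + sigma[i]) % w

    return rap_bank


rap_bank = make_rap_bank()

# Contiguous access: the warp reads one row (fixed i, all j).
contiguous = [(0, j) for j in range(W)]
# Stride access: the warp reads one column (fixed j, all i).
stride = [(i, 0) for i in range(W)]

raw_contiguous = congestion([raw_bank(i, j) for i, j in contiguous])
raw_stride = congestion([raw_bank(i, j) for i, j in stride])
rap_contiguous = congestion([rap_bank(i, j) for i, j in contiguous])
rap_stride = congestion([rap_bank(i, j) for i, j in stride])

print(raw_contiguous, raw_stride, rap_contiguous, rap_stride)
```

Under these assumptions, the unmodified layout serializes all 32 requests of a column access into one bank, while the rotated layout spreads both the row and the column access over distinct banks, because a fixed row only shifts all columns by the same offset and a fixed column sees 32 distinct offsets.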
