Compiler-based Data Prefetching and Streaming Non-temporal Store Generation for the Intel Xeon Phi Coprocessor

Rakesh Krishnaiyer, Emre Kültürsay, Pankaj Chawla, Serguei Preis, Anatoly Zvezdin, Hideki Saito
Intel Corporation
Workshop on Multithreaded Architectures and Applications (MTAAP 2013), 2013


   title={Compiler-based Data Prefetching and Streaming Non-temporal Store Generation for the Intel{\textregistered} Xeon Phi{\texttrademark} Coprocessor},

   author={Krishnaiyer, Rakesh and K{\"u}lt{\"u}rsay, Emre and Chawla, Pankaj and Preis, Serguei and Zvezdin, Anatoly and Saito, Hideki},






The Intel Xeon Phi coprocessor has software prefetching instructions to hide memory latencies and special store instructions to save bandwidth on streaming non-temporal store operations. In this work, we provide details on compiler-based generation of these instructions and evaluate their impact on the performance of the Intel Xeon Phi coprocessor using a wide range of parallel applications with different characteristics. Our results show that the Intel Composer XE 2013 compiler can make effective use of these mechanisms to achieve significant performance improvements.
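As a rough illustration of what software prefetching looks like at the source level, the following is a minimal sketch and not the paper's compiler transformation. It assumes a GCC- or Clang-compatible compiler that provides the `__builtin_prefetch` builtin. The function name `scale_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical, and the distance of 64 elements is an arbitrary placeholder, not a value taken from the paper. Streaming non-temporal stores are not shown, since they require target-specific instructions.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements. The value is an
 * illustrative assumption; a compiler would choose it per loop. */
#define PREFETCH_DISTANCE 64

/* Writes src[i] * factor into dst[i] for i in [0, n), issuing a
 * read-prefetch hint for an element PREFETCH_DISTANCE ahead. */
void scale_with_prefetch(float *dst, const float *src, size_t n, float factor)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0 = prefetch for read;
             * third argument 3 = high temporal locality. */
            __builtin_prefetch(&src[i + PREFETCH_DISTANCE], 0, 3);
        }
        dst[i] = src[i] * factor;
    }
}
```

The prefetch is only a hint: it does not change the computed values, so the loop produces the same result with or without it. A compiler that generates such hints automatically spares the programmer from writing and tuning them by hand.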