LeftoverLocals: Listening to LLM Responses Through Leaked GPU Local Memory

Tyler Sorensen, Heidy Khlaaf
Trail of Bits, University of California, Santa Cruz
arXiv:2401.16603 [cs.CR] (29 Jan 2024)


This paper describes LeftoverLocals, a vulnerability that allows an attacker to recover data from GPU memory that was created by another process on Apple, Qualcomm, and AMD GPUs. LeftoverLocals impacts the security posture of GPU applications, with particular significance for LLMs and ML models that run on impacted GPUs. By recovering local memory, an optimized GPU memory region, we built a PoC in which an attacker can listen in on another user’s interactive LLM session (e.g., llama.cpp) across process or container boundaries.
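The idea of recovering leftover local memory can be illustrated with a small toy model. The sketch below is plain Python, not GPU code, and it is not the authors' PoC: the names `ToyComputeUnit`, `victim_kernel`, `listener_kernel`, and `leaked` are hypothetical, and the model simply assumes that a compute unit's local-memory scratchpad is reused by the next kernel without being cleared.

```python
# Toy model only (an assumption-laden sketch, not real GPU behavior):
# local memory is a scratchpad that is handed to each kernel in turn.

class ToyComputeUnit:
    """A single compute unit with a fixed-size local-memory scratchpad."""

    def __init__(self, local_words, clear_between_kernels=False):
        self.local = [0] * local_words
        self.clear_between_kernels = clear_between_kernels

    def run(self, kernel, *args):
        """Run a kernel, passing the scratchpad as its local memory."""
        if self.clear_between_kernels:
            # Hypothetical mitigation: zero local memory before each launch.
            self.local = [0] * len(self.local)
        return kernel(self.local, *args)


def victim_kernel(local, secret_words):
    """Victim stages data in local memory and does not clean up afterwards."""
    for i, word in enumerate(secret_words):
        local[i] = word


def listener_kernel(local):
    """Attacker reads local memory without initializing it and copies
    whatever is there out to ordinary memory."""
    return list(local)


def leaked(clear_between_kernels):
    """Run the victim, then the listener, on the same compute unit."""
    cu = ToyComputeUnit(local_words=8,
                        clear_between_kernels=clear_between_kernels)
    cu.run(victim_kernel, [0xC0FFEE, 0xBEEF, 0x1234])
    return cu.run(listener_kernel)
```

In this model, `leaked(False)` returns the victim's words at the start of the scratchpad, while `leaked(True)` returns only zeros, which shows why clearing local memory between kernels would close the leak in the toy setting.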
