A Survey of Cloud-Based GPU Threats and Their Impact on AI, HPC, and Cloud Computing
Numaan Huq, Philippe Lin, Roel Reyes, Charles Perine
Trend Micro Inc., 2024
@techreport{huq2024survey,
  title={A Survey of Cloud-Based GPU Threats and Their Impact on AI, HPC, and Cloud Computing},
  author={Huq, Numaan and Lin, Philippe and Reyes, Roel and Perine, Charles},
  institution={Trend Micro Inc.},
  year={2024}
}
Graphics processing units (GPUs) are the hardware engines driving the AI revolution. Large language model (LLM)-powered generative AI (GenAI) went mainstream with the public release of OpenAI’s ChatGPT, and AI adoption has since given rise to innovative AI-powered applications for business, productivity, image generation, video generation, data analysis, and social media, among others.

Powering these applications are GPUs: specialized microchips designed to accelerate computer graphics and image processing that are equally useful for non-graphics work, especially parallel processing problems. A GPU is built for parallelism and can run thousands of simple compute tasks simultaneously, whereas a central processing unit (CPU) is designed to handle a few complex tasks at a time. NVIDIA opened GPU programming to developers by introducing Compute Unified Device Architecture (CUDA), a C-like programming language (a minimal kernel sketch appears below). AMD GPUs fill the same role with OpenCL, an open standard maintained by the Khronos Group that aims to be vendor-independent across platforms.

ChatGPT is powered by an LLM, a class of AI model that generates human-like responses. LLMs are competent in many applications, such as language translation, content generation, sentiment analysis, data analysis, and chatbots. They are complex neural networks with vast numbers of interconnected nodes performing repeated calculations and adjustments, and GPUs excel at precisely the massive parallel computations that are fundamental to neural network training and inference. Training a complex LLM could take tens of years on CPUs; GPUs reduce that to a more manageable duration of months. Specialized NVIDIA GPUs equipped with Tensor Cores are designed for the matrix operations used extensively in deep learning and LLMs, and GPUs carry high-bandwidth device memory that is crucial for processing enormous training datasets.

GPUs are just as heavily used in high-performance computing (HPC). HPC tackles problems such as analyzing massive datasets and running complex simulations, tasks whose high degree of parallelism makes them well suited to GPUs. Simulations such as weather prediction, drug modeling, fluid dynamics, and protein folding require an immense number of calculations at each step, and GPU acceleration allows researchers to experiment rapidly, test hypotheses, and gain valuable insights more quickly. GPUs are also often more energy-efficient than CPUs for HPC tasks, allowing more computation within the same power envelope. Many HPC applications integrate machine learning and deep learning, since combining these techniques leads to new scientific insights and accelerated discovery; given how well GPUs handle artificial intelligence and machine learning (AI/ML) workloads, using them for tasks that span both HPC and AI/ML is a natural choice.
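To make the programming model above concrete, here is a minimal CUDA sketch, not taken from the paper: each GPU thread computes one element of a matrix product, the kind of operation that dominates deep learning workloads and that Tensor Cores accelerate. The kernel name and launch configuration are illustrative assumptions.

    // Minimal CUDA kernel: one lightweight GPU thread per output element.
    // A single launch spawns thousands of these threads in parallel, in
    // contrast to a CPU handling a few complex tasks at a time.
    __global__ void matmul(const float *A, const float *B, float *C, int N) {
        int row = blockIdx.y * blockDim.y + threadIdx.y;  // this thread's row
        int col = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's column
        if (row < N && col < N) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k)
                sum += A[row * N + k] * B[k * N + col];   // dot product of row and column
            C[row * N + col] = sum;
        }
    }

    // Hypothetical host-side launch: a 16x16 tile of threads per block,
    // with enough blocks to cover an N x N output matrix.
    // dim3 threads(16, 16);
    // dim3 blocks((N + 15) / 16, (N + 15) / 16);
    // matmul<<<blocks, threads>>>(d_A, d_B, d_C, N);

Production frameworks would call into tuned libraries such as cuBLAS rather than a hand-written kernel; the point here is only how naturally a matrix maps onto a grid of parallel threads.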
As AI training and inference and HPC applications become more important to businesses, many are switching from on-site setups to cloud-based GPUs. Cloud-based GPUs provide scalability and flexibility, which suits bursty workloads in which spikes in demand for massive compute power are followed by low-usage periods. There is no upfront investment in expensive hardware and maintenance, since cloud services operate on a pay-as-you-go model, and users get access to the latest GPU chips on the market. This is critical in fields like AI, where hardware advances greatly accelerate processing tasks. Another advantage is global access to shared GPU resources without the burden of hardware logistics.

Given this increasing reliance on GPUs for everyday business tasks, this paper explores the security threats GPUs face and the actions that can be taken to mitigate the risks. As reliance on cloud-based GPU instances grows, so does the importance of ensuring their security. The threats are multifaceted, ranging from data breaches and unauthorized access to more sophisticated attacks such as reading residual GPU memory (illustrated in the sketch below). By examining these security challenges, this research aims to provide insight into the current threat landscape for cloud-based GPUs, discussing both the vulnerabilities inherent in these systems and strategies for protecting them against cyberattacks.
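As a hedged illustration of the "reading GPU memory" threat class mentioned above, not code from the paper: cudaMalloc does not guarantee zero-initialized memory, so a freshly allocated device buffer can, on an unpatched or misconfigured system, still contain residue from an earlier workload. The buffer size is arbitrary, and modern drivers or cloud platforms may scrub memory between allocations, in which case this probe finds nothing.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    int main() {
        const size_t bytes = 64 * 1024 * 1024;       // arbitrary 64 MiB probe buffer
        unsigned char *d_buf = nullptr;
        cudaMalloc((void **)&d_buf, bytes);          // allocated, deliberately NOT initialized

        unsigned char *h_buf = (unsigned char *)malloc(bytes);
        cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);

        size_t nonzero = 0;                          // nonzero bytes hint at leftover data
        for (size_t i = 0; i < bytes; ++i)
            if (h_buf[i] != 0) ++nonzero;
        printf("nonzero bytes in uninitialized GPU buffer: %zu\n", nonzero);

        cudaFree(d_buf);
        free(h_buf);
        return 0;
    }

This behavior is why scrubbing device memory between workloads and isolating tenants matter in shared cloud GPU deployments.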
June 2, 2024 by hgpu