Cost-Performance Analysis: A Comparative Study of CPU-Based Serverless and GPU-Based Training Architectures
Amine Barrak, Fabio Petrillo, Fehmi Jaafar
Oakland University, Rochester, MI, USA
arXiv:2509.14920 [cs.DC], 18 Sep 2025
@misc{barrak2025costperformanceanalysiscomparativestudy,
  title={Cost-Performance Analysis: A Comparative Study of CPU-Based Serverless and GPU-Based Training Architectures},
  author={Amine Barrak and Fabio Petrillo and Fehmi Jaafar},
  year={2025},
  eprint={2509.14920},
  archivePrefix={arXiv},
  primaryClass={cs.DC},
  url={https://arxiv.org/abs/2509.14920}
}
The field of distributed machine learning (ML) faces increasing demands for scalable and cost-effective training solutions, particularly in the context of large, complex models. Serverless computing has emerged as a promising paradigm to address these challenges by offering dynamic scalability and resource-efficient execution. Building upon our previous work, which introduced the Serverless Peer Integrated for Robust Training (SPIRT) architecture, this paper presents a comparative analysis of several serverless distributed ML architectures. We examine SPIRT alongside established architectures like ScatterReduce, AllReduce, and MLLess, focusing on key metrics such as training time efficiency, cost-effectiveness, communication overhead, and fault tolerance capabilities. Our findings reveal that SPIRT provides significant improvements in reducing training times and communication overhead through strategies such as parallel batch processing and in-database operations facilitated by RedisAI. However, traditional architectures exhibit scalability challenges and varying degrees of vulnerability to faults and adversarial attacks. The cost analysis underscores the long-term economic benefits of SPIRT despite its higher initial setup costs. This study not only highlights the strengths and limitations of current serverless ML architectures but also sets the stage for future research aimed at developing new models that combine the most effective features of existing systems.
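The abstract attributes much of SPIRT's reduction in communication overhead to parallel batch processing and in-database operations via RedisAI. As a rough illustration of the in-database idea only, the Python sketch below averages two workers' gradients inside Redis using a registered TorchScript function, so raw gradients never leave the datastore. This is a minimal sketch, not the paper's implementation: the connection details, key names, tensor shapes, and two-worker setup are all assumptions for illustration.

    # Minimal sketch: in-database gradient averaging with RedisAI.
    # Assumes a Redis server with the RedisAI module loaded on localhost:6379;
    # key names and the two-worker setup are hypothetical illustrations.
    import numpy as np
    from redisai import Client

    con = Client(host="localhost", port=6379)

    # Each serverless worker writes its local gradient as a tensor.
    grad_worker_0 = np.random.randn(4).astype(np.float32)
    grad_worker_1 = np.random.randn(4).astype(np.float32)
    con.tensorset("grad:0", grad_worker_0)
    con.tensorset("grad:1", grad_worker_1)

    # A TorchScript function registered once; the averaging then runs
    # inside the database rather than in any worker's function runtime.
    script = """
    def average(a, b):
        return (a + b) / 2
    """
    con.scriptset("agg", "CPU", script)
    con.scriptrun("agg", "average", inputs=["grad:0", "grad:1"], outputs=["grad:avg"])

    avg = con.tensorget("grad:avg")
    print(avg)  # element-wise mean of the two worker gradients

In a setup along these lines, each worker issues one write and one read per step instead of exchanging gradients peer-to-peer, which is the kind of communication saving the abstract describes.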
September 28, 2025 by hgpu