
Improving Performance of Iterative Applications through Interleaved Execution of Approximated CUDA Kernels

Gabriel Freytag
Universidade Federal do Rio Grande do Sul, Instituto de Informática
Universidade Federal do Rio Grande do Sul, 2023

@article{freytag2023improving,
   title={Improving performance of iterative applications through interleaved execution of approximated CUDA kernels},
   author={Freytag, Gabriel},
   year={2023}
}


Approximate computing techniques, particularly those involving reduced and mixed precision, are widely studied in the literature as a way to accelerate applications and reduce energy consumption. Although many researchers analyze performance, accuracy loss, and energy consumption across a wide range of application domains, few evaluate approximate computing techniques in iterative applications. These applications use the results of previous iterations to compute subsequent ones, which makes them sensitive to precision errors that can propagate and grow throughout the execution. Monitoring accuracy loss during execution is also challenging: calculating it at runtime is computationally expensive and becomes infeasible for applications with a considerable volume of data. This thesis presents a methodology for generating interleaved execution configurations of multiple kernel versions for iterative applications on GPUs. The methodology involves sampling the accuracy loss profile, extracting performance and accuracy loss statistics, and generating offline the interleaved execution configurations of kernel versions for different accuracy loss thresholds. Experiments on three iterative physical simulation applications operating on three-dimensional data domains demonstrated that the methodology can extract performance and accuracy loss statistics and generate interleaved execution configurations of kernel versions with speedups of up to 2× and energy consumption reductions of up to 60%. For future work, we suggest studying different optimization strategies for generating interleaved execution configurations of kernel versions, such as neural networks and other machine learning techniques.
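
To make the idea of an interleaved execution configuration concrete, the following is a minimal Python sketch. It is not the algorithm from the thesis. It assumes two hypothetical kernel versions (an exact one and an approximated one), a hypothetical mean time and mean accuracy loss per iteration obtained from offline profiling, and that accuracy loss accumulates additively across iterations. That last assumption is a simplification, since the abstract notes that errors in iterative applications can propagate and grow. All names (`KernelVersion`, `interleave`, `estimated_speedup`) are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelVersion:
    name: str
    time_ms: float  # hypothetical mean time per iteration from profiling
    loss: float     # hypothetical mean accuracy loss added per iteration


def interleave(n_iters, exact, approx, max_loss):
    """Return one kernel name per iteration.

    Assumption: loss is additive, so the approximated kernel may run at
    most floor(max_loss / approx.loss) times within the threshold.
    """
    if approx.loss <= 0:
        k = n_iters
    else:
        k = min(n_iters, int(max_loss / approx.loss))

    schedule = []
    for i in range(n_iters):
        # Spread the k approximated iterations evenly over the run:
        # pick the approximated kernel whenever the running quota
        # (i * k / n_iters) crosses an integer boundary.
        use_approx = ((i + 1) * k) // n_iters > (i * k) // n_iters
        schedule.append(approx.name if use_approx else exact.name)
    return schedule


def estimated_speedup(schedule, exact, approx):
    """Estimated speedup of the schedule over running only the exact kernel."""
    total = sum(
        approx.time_ms if name == approx.name else exact.time_ms
        for name in schedule
    )
    return (exact.time_ms * len(schedule)) / total
```

For example, with `exact = KernelVersion("fp64", 2.0, 0.0)` and `approx = KernelVersion("fp32", 1.0, 0.25)`, calling `interleave(8, exact, approx, 1.0)` yields a schedule that alternates between the two versions. Calling it again with different `max_loss` values gives one configuration per accuracy loss threshold, which is the kind of output the offline generation step is described as producing.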
