high performance computing on graphics processing units: hgpu.org

SOL: Reducing the Maintenance Overhead for Integrating Hardware Support into AI Frameworks

Nicolas Weber

NEC Laboratories Europe

arXiv:2205.10357 [cs.LG], (19 May 2022)

DOI:10.48550/arXiv.2205.10357

The increased interest in Artificial Intelligence (AI) has raised the need for highly optimized and sophisticated AI frameworks. Starting with the Lua-based Torch, many frameworks have emerged over time, such as Theano, Caffe, Chainer, CNTK, MXNet, PyTorch, DL4J, and TensorFlow. All of these provide a high-level scripting API that allows users to easily design neural networks and run them on various kinds of hardware. What the user usually does not see is the considerable effort put into these frameworks to provide peak execution performance. While mainstream CPUs and GPUs have the "luxury" of a widespread user base in the open source community, vendors of less mainstream CPUs, GPUs, or accelerators need to invest heavily to get their hardware supported by these frameworks. This includes not only developing highly efficient compute libraries such as cuDNN, oneDNN, or VEDNN, but also supporting an ever-growing number of simpler compute operations such as summation and multiplication. Each of these frameworks now supports several hundred unique operations, with tensors of various sizes, shapes, and data types, which results in thousands of compute kernels required for each device type. And the number of operations keeps increasing. That is why NEC Laboratories Europe started developing the SOL AI Optimization project years ago: to deliver optimal performance to users while keeping the maintenance burden minimal.

Tags: AI, Artificial intelligence, Computer science, CUDA, Machine learning, Neural networks, nVidia, nVidia GeForce RTX 2080, Performance

May 29, 2022 by hgpu
