General-Purpose Computing on Tensor Processors
University of California, Riverside
University of California, Riverside, 2024
@phdthesis{hsu2024general,
  title  = {General-Purpose Computing on Tensor Processors},
  author = {Hsu, Kuan-Chieh},
  year   = {2024},
  school = {University of California, Riverside}
}
Modern computer systems have become heterogeneous, incorporating many emerging kinds of hardware accelerators as Dennard scaling ends. Such domain-specific hardware accelerators also fulfill the rapidly growing computing demands of applications including artificial intelligence (AI) and machine learning (ML). Beyond conventional components such as central processing units (CPUs) and memory, modern computers typically contain accelerators such as graphics processing units (GPUs), tensor processing units (TPUs), and neural processing units (NPUs). Although accelerators expose various programming interfaces and execution models, one group of them, tensor processors, improves system performance for any problem that takes matrices or tensors as inputs and/or outputs. Despite differences in their microarchitectural designs, tensor processors are essentially hardware accelerators that provide efficient matrix-based computation. In this dissertation, we envision a new programming paradigm that leverages tensor processors for general-purpose computing beyond their original AI and ML application domains. Such a framework should have the following characteristics. First, the programming interface for a heterogeneous system with tensor processors must be simple, easy to use, and able to maintain compatibility and portability across various systems. Second, the execution model of the framework should intelligently explore and exploit opportunities to use tensor processors, delivering better performance and extending the spectrum of application domains. Finally, the framework must be cost-effective and energy-efficient, and must accommodate the algorithm redesign and transformation needed to support broader usage. I proposed three research projects in response to this vision.
First, I proposed GPTPU, an open-source, open-architecture framework that allows users to explore opportunities to use tensor processors for general applications. Second, I proposed SHMT, a new programming and execution model that enables simultaneous parallel processing of the same function across heterogeneous processing units. Lastly, I proposed GSLD, a matrix computing library that accommodates either dense or sparse matrix inputs and more intelligently uses dense matrix processors and scalar cores.
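As a minimal sketch of the "matrix-based computation" idea the abstract describes (this example is illustrative and not taken from the dissertation): a classically scalar, non-ML problem such as counting triangles in a graph reduces entirely to the dense matrix multiplications that a tensor processor accelerates, using the identity that the trace of the cubed adjacency matrix counts each triangle six times.

```python
import numpy as np

# Hypothetical illustration (not code from the dissertation): a non-ML
# problem recast as dense matrix multiplication, the primitive that
# tensor processors accelerate. Triangle counting in an undirected graph
# uses the identity  triangles = trace(A @ A @ A) / 6.

def count_triangles(adj: np.ndarray) -> int:
    """Count triangles in an undirected graph given its 0/1 adjacency matrix."""
    a3 = adj @ adj @ adj            # three matmuls -> offloadable to a tensor processor
    return int(np.trace(a3) // 6)   # each triangle yields 6 closed length-3 walks

# A 4-node graph: one triangle (0,1,2) plus a pendant edge (2,3).
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
print(count_triangles(A))  # → 1
```

On a CPU this is ordinary NumPy; the point of the paradigm envisioned above is that the same matrix-product formulation could instead be dispatched to a TPU-like accelerator without changing the algorithm.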
October 27, 2024 by hgpu