high performance computing on graphics processing units: hgpu.org

SLATE port to AMD and Intel platforms

Ahmad Abdelfattah, Mohammed Al Farhan, Cade Brown, Mark Gates, Dalal Sukkari, Asim YarKhan, Jack Dongarra

Innovative Computing Laboratory

SLATE Working Notes, no. 16, ICL-UT-21-01, 2021

@article{abdelfattah2021slate,

title={{SLATE} port to {AMD} and {Intel} platforms},

author={Abdelfattah, Ahmad and Al Farhan, Mohammed and Brown, Cade and Gates, Mark},

year={2021},

publisher={Innovative Computing Laboratory, University of Tennessee}

}

Source codes

Package:

SLATE: Software for Linear Algebra Targeting Exascale project

SLATE implements GPU-accelerated linear algebra, relying primarily on vendor-provided GPU BLAS for performance, in particular batched BLAS routines. Initially, SLATE was written using NVIDIA’s CUDA and cuBLAS for GPU acceleration. At the time that the SLATE project was started, it was unclear what GPU technologies would exist for other platforms [1]. Since then, AMD has developed their ROCm platform, which includes the HIP portability layer, and Intel has developed their oneAPI platform based on SYCL, a C++ descendant of OpenCL. These software stacks will be used on upcoming exascale systems: Frontier, built by AMD for Oak Ridge National Laboratory, and Aurora, built by Intel for Argonne National Laboratory. Therefore, to support these new platforms, SLATE needed to be ported to both ROCm and Intel oneMKL.

Tags: Computer science, CUBLAS, CUDA, Linear Algebra, nVidia, OpenCL, Package, SYCL

April 18, 2021 by hgpu
