Supporting mixed-datatype matrix multiplication within the BLIS framework

Field G. Van Zee, Devangi N. Parikh, Robert A. van de Geijn
Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX
arXiv:1901.06015 [cs.MS], (17 Jan 2019)

@article{zee2019supporting,
   title={Supporting mixed-datatype matrix multiplication within the BLIS framework},
   author={Van Zee, Field G. and Parikh, Devangi N. and van de Geijn, Robert A.},
   year={2019},
   month={jan},
   eprint={1901.06015},
   archivePrefix={arXiv},
   primaryClass={cs.MS}
}

We approach the problem of implementing mixed-datatype support within the general matrix multiplication (GEMM) operation of the BLIS framework, whereby each matrix operand A, B, and C may be stored as single- or double-precision real or complex values. We also consider an additional dimension of complexity: allowing the computation to take place in a precision different from the storage precisions of A and B. We first break the problem into mostly orthogonal dimensions, considering the mixing of domains separately from the mixing of precisions. Support for all combinations of matrix operands stored in either the real or complex domain is mapped out by enumerating the cases and describing an implementation approach for each. Supporting all combinations of storage and computation precisions is handled by typecasting the matrices at key stages of the computation—during packing and/or accumulation, as needed. Several optional optimizations are also documented. Performance results gathered on a 56-core Marvell ThunderX2 and a 52-core Intel Xeon Platinum demonstrate that high performance is mostly preserved, with modest slowdowns incurred from unavoidable typecast instructions. The mixed-datatype implementation confirms that combinatorial intractability is avoided, with the framework relying on only two assembly microkernels to implement 128 datatype combinations.

* * *

HGPU group © 2010-2019 hgpu.org

All rights belong to the respective authors

Contact us: