high performance computing on graphics processing units: hgpu.org

Winograd Algorithm for AdderNet

Wenshuo Li, Hanting Chen, Mingqiang Huang, Xinghao Chen, Chunjing Xu, Yunhe Wang

Noah’s Ark Lab, Huawei Technologies

arXiv:2105.05530 [cs.LG], (12 May 2021)

@misc{li2021winograd,
  title={Winograd Algorithm for AdderNet},
  author={Wenshuo Li and Hanting Chen and Mingqiang Huang and Xinghao Chen and Chunjing Xu and Yunhe Wang},
  year={2021},
  eprint={2105.05530},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Adder neural network (AdderNet) is a new kind of deep model that replaces the massive number of multiplications in convolutions with additions while preserving high performance. Because additions have much lower hardware complexity than multiplications, overall energy consumption is reduced significantly. To further reduce the hardware overhead of AdderNet, this paper studies the Winograd algorithm, a widely used fast algorithm for accelerating convolution and reducing computational cost. Unfortunately, the conventional Winograd algorithm cannot be applied directly to AdderNets, because the distributive law that holds for multiplication is not valid for the l1-norm. Therefore, we replace the element-wise multiplication in the Winograd equation with additions and develop a new set of transform matrices that enhance the representation ability of the output features to maintain performance. Moreover, we propose an l2-to-l1 training strategy to mitigate the negative impact of the formal inconsistency. Experimental results on both FPGA and benchmarks show that the new method further reduces energy consumption without affecting the accuracy of the original AdderNet.
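
The claim that the conventional Winograd algorithm does not carry over to AdderNets can be checked on a small example. The sketch below is illustrative only and is not taken from the paper. It assumes the standard 1D Winograd F(2,3) transform matrices and the adder operation -|x - w| from the original AdderNet formulation. The helper names (`matvec`, `direct`, `winograd`) are hypothetical. It does not reproduce the paper's new transform matrices or its l2-to-l1 training strategy. It only shows that swapping the element-wise product for the adder operation, while keeping the standard transforms, no longer matches the direct sliding-window result.

```python
# Standard 1D Winograd F(2,3) transform matrices (assumed; not the paper's new ones).
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def direct(d, g, op):
    """Sliding-window reference: 2 outputs from a 4-element tile and a 3-tap filter."""
    return [sum(op(d[i + k], g[k]) for k in range(3)) for i in range(2)]


def winograd(d, g, op):
    """Winograd F(2,3) with a pluggable element-wise operation `op`."""
    u = matvec(G, g)    # transformed filter
    v = matvec(BT, d)   # transformed input tile
    m = [op(vi, ui) for vi, ui in zip(v, u)]
    return matvec(AT, m)


def mul(x, w):
    """Ordinary convolution term."""
    return x * w


def adder(x, w):
    """Adder term: negative l1 distance (assumed AdderNet formulation)."""
    return -abs(x - w)


d = [1.0, 2.0, 3.0, 4.0]   # input tile
g = [1.0, 0.0, -1.0]       # filter

conv_direct = direct(d, g, mul)
conv_wino = winograd(d, g, mul)
add_direct = direct(d, g, adder)
add_naive = winograd(d, g, adder)

print("multiplication:", conv_direct, conv_wino)  # the two agree
print("adder:", add_direct, add_naive)            # the two differ
```

For this tile, the multiplicative case yields the same two outputs either way, while the adder case yields [-6.0, -9.0] directly and [-9.0, -3.0] through the unmodified transforms. This mismatch is consistent with the abstract's motivation for designing new transform matrices.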

Tags: Algorithms, CNN, Computer science, Deep learning, FPGA, Machine learning, Neural networks, Performance

May 16, 2021 by hgpu
