AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
University of Wisconsin-Madison
arXiv:2102.01386 [cs.LG], 2 Feb 2021
@misc{liu2021autofreeze,
  title={AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning},
  author={Yuhan Liu and Saurabh Agarwal and Shivaram Venkataraman},
  year={2021},
  eprint={2102.01386},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
With the rapid adoption of machine learning (ML), a number of domains now use the approach of fine-tuning models pre-trained on a large corpus of data. However, our experiments show that even fine-tuning models like BERT can take many hours when using GPUs. While prior work proposes limiting the number of layers that are fine-tuned, e.g., freezing all layers but the last layer, we find that such static approaches lead to reduced accuracy. We propose AutoFreeze, a system that uses an adaptive approach to choose which layers are trained, and show how this can accelerate model fine-tuning while preserving accuracy. We also develop mechanisms to enable efficient caching of intermediate activations, which can reduce the forward computation time when performing fine-tuning. Our evaluation on four NLP tasks shows that AutoFreeze, with caching enabled, can improve fine-tuning performance by up to 2.55x.
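The adaptive-freezing idea can be illustrated with a short PyTorch sketch. The stopping criterion below (relative change in per-layer gradient norm across epochs) and the `maybe_freeze` helper are illustrative assumptions for this sketch, not the exact test described in the paper.

```python
# Minimal sketch of adaptive layer freezing for a BERT-style encoder.
# Assumption: layers are frozen front-to-back once their gradient norm
# stops changing much between epochs (a stand-in heuristic).
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
frozen = set()     # indices of encoder layers already frozen
prev_norms = {}    # per-layer gradient norms from the previous epoch

def layer_grad_norm(layer):
    # Sum of gradient norms over all parameters of one encoder layer.
    return sum(p.grad.norm().item()
               for p in layer.parameters() if p.grad is not None)

def maybe_freeze(model, threshold=0.05):
    """Call after the last backward pass of an epoch, before zero_grad()."""
    layers = model.bert.encoder.layer
    for i, layer in enumerate(layers):
        if i in frozen:
            continue
        norm = layer_grad_norm(layer)
        if i in prev_norms and prev_norms[i] > 0:
            rel_change = abs(norm - prev_norms[i]) / prev_norms[i]
            # Only freeze a contiguous prefix of layers: layer i may be
            # frozen only if every earlier layer is already frozen.
            if rel_change < threshold and all(j in frozen for j in range(i)):
                for p in layer.parameters():
                    p.requires_grad = False
                frozen.add(i)
        prev_norms[i] = norm
```

In such a setup, once a prefix of layers is frozen their outputs no longer change, so the activations they produce for each training example can be cached and reused, which is the intuition behind the forward-pass caching the abstract mentions.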
February 7, 2021 by hgpu