
word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement

Aliakbar Panahi, Seyran Saeedi, Tom Arodz
Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
arXiv:1911.04975 [cs.LG] (12 Nov 2019)

@misc{panahi2019word2ket,
   title={word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement},
   author={Aliakbar Panahi and Seyran Saeedi and Tom Arodz},
   year={2019},
   eprint={1911.04975},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}

Deep learning natural language processing models often use vector word embeddings, such as word2vec or GloVe, to represent words. A discrete sequence of words can be much more easily integrated with downstream neural layers if it is represented as a sequence of continuous vectors. Also, semantic relationships between words, learned from a text corpus, can be encoded in the relative configurations of the embedding vectors. However, storing and accessing embedding vectors for all words in a dictionary requires a large amount of space, and may strain systems with limited GPU memory. Here, we use approaches inspired by quantum computing to propose two related methods, word2ket and word2ketXS, for storing the word embedding matrix during training and inference in a highly efficient way. Our approach achieves a hundred-fold or more reduction in the space required to store the embeddings with almost no relative drop in accuracy in practical natural language processing tasks.
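
The paper's layered word2ket/word2ketXS constructions are more involved, but the core idea of the quantum-entanglement-inspired representation is to store each embedding vector as a sum of Kronecker products of small factor vectors. The PyTorch sketch below illustrates that idea only; the class name, the hyperparameters q, order, and rank, and the initialization are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn


class TensorProductEmbedding(nn.Module):
    """Minimal sketch: each d = q**order dimensional embedding is represented as a
    sum of `rank` Kronecker products of small q-dimensional factors, so only
    vocab_size * rank * order * q parameters are stored instead of vocab_size * q**order."""

    def __init__(self, vocab_size, q=4, order=4, rank=2):
        super().__init__()
        self.q, self.order, self.rank = q, order, rank
        # factors[w, k, j] is the j-th q-dimensional factor of the k-th summand for word w
        self.factors = nn.Parameter(0.1 * torch.randn(vocab_size, rank, order, q))

    def forward(self, token_ids):
        f = self.factors[token_ids]            # (..., rank, order, q)
        out = f[..., 0, :]                     # first factor of each summand
        for j in range(1, self.order):
            # Kronecker product with the next small factor
            out = torch.einsum('...a,...b->...ab', out, f[..., j, :])
            out = out.reshape(*out.shape[:-2], -1)
        return out.sum(dim=-2)                 # sum over rank -> (..., q**order)


# Usage: a 256-dimensional embedding (q**order = 4**4) for a 30,000-word vocabulary
emb = TensorProductEmbedding(vocab_size=30000, q=4, order=4, rank=2)
vectors = emb(torch.tensor([3, 17, 42]))       # shape (3, 256)

In this toy configuration the table holds 30,000 * 2 * 4 * 4 = 960,000 parameters instead of the 30,000 * 256 = 7,680,000 of a dense embedding matrix, an eight-fold saving; the hundred-fold-plus reductions reported in the paper come from its larger, layered word2ketXS construction.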
