https://hgpu.org/?p=16781
Parallelizing Word2Vec in Multi-Core and Many-Core Architectures