Distributed Negative Sampling for Word Embeddings

Stergios Stergiou; Zygimantas Straznickas; Rolina Wu; Kostas Tsioutsiouliklis

doi:10.1609/aaai.v31i1.10931

Authors

Stergios Stergiou, Yahoo Research
Zygimantas Straznickas, Massachusetts Institute of Technology
Rolina Wu, University of Waterloo
Kostas Tsioutsiouliklis, Yahoo Research

DOI:

https://doi.org/10.1609/aaai.v31i1.10931

Keywords:

negative sampling, word embeddings, word2vec

Abstract

Word2Vec recently popularized dense vector word representations as fixed-length features for machine learning algorithms and is in widespread use today. In this paper we investigate one of its core components, Negative Sampling, and propose efficient distributed algorithms that allow us to scale to vocabulary sizes of more than 1 billion unique words and corpus sizes of more than 1 trillion words.

Published

2017-02-13

How to Cite

Stergiou, S., Straznickas, Z., Wu, R., & Tsioutsiouliklis, K. (2017). Distributed Negative Sampling for Word Embeddings. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10931

Issue

Vol. 31 No. 1 (2017): Thirty-First AAAI Conference on Artificial Intelligence

Section

Machine Learning Methods
