d532: FastText for text representations and text classifiers

FastText: a library for efficient text classification and representation learning

Download pre-trained fasttext word vectors

Pre-trained word vectors learned on different sources can be downloaded below:

wiki-news-300d-1M.vec.zip: 1 million word vectors trained on Wikipedia 2017, UMBC webbase corpus and statmt.org news dataset (16B tokens).

wiki-news-300d-1M-subword.vec.zip: 1 million word vectors trained with subword information on Wikipedia 2017, UMBC webbase corpus and statmt.org news dataset (16B tokens).

crawl-300d-2M.vec.zip: 2 million word vectors trained on Common Crawl (600B tokens).

These models achieve state-of-the-art performance on several benchmarks (up to 88% accuracy on the popular word analogy dataset). https://www.facebook.com/groups/1174547215919768/permalink/1631075090266976/
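Once unzipped, each .vec file is plain text: a header line giving the number of words and the vector dimension, followed by one line per word with its space-separated values. Below is a minimal sketch of reading such a file in Python. The function name `load_vectors` and its `limit` parameter are illustrative choices, not part of the fastText library, and the sketch assumes that tokens contain no spaces.

```python
import io


def load_vectors(path, limit=None):
    """Read a fastText .vec text file into a dict mapping word -> list of floats.

    `limit` (hypothetical convenience parameter) stops after that many words,
    which helps when the full file does not fit comfortably in memory.
    """
    vectors = {}
    with io.open(path, "r", encoding="utf-8", newline="\n", errors="ignore") as f:
        # Header line: "<number of words> <dimension>"
        n_words, dim = map(int, f.readline().split())
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            # Each remaining line: the word, then `dim` float values.
            tokens = line.rstrip().split(" ")
            vectors[tokens[0]] = [float(x) for x in tokens[1:]]
    return vectors, dim
```

For example, after unzipping wiki-news-300d-1M.vec.zip, calling `load_vectors("wiki-news-300d-1M.vec", limit=100000)` would load only the first 100,000 entries of the file. Storing vectors as Python lists keeps the sketch dependency-free; for real workloads a NumPy array is likely a better fit.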

