Natural language processing tools: Chinese word2vec open-source projects, tutorials, and datasets

word2vec

word2vec/glove/swivel binary files trained on a Chinese corpus

word2vec: https://code.google.com/p/word2vec/

glove: http://nlp.stanford.edu/projects/glove/

swivel: https://github.com/tensorflow/models/tree/master/swivel
http://arxiv.org/abs/1602.02215

Open-source projects

wordvectors

Pre-trained word vectors of 30+ languages

https://github.com/Kyubyong/wordvectors

chinese-word2vec

word2vec/glove/swivel binary files trained on a Chinese corpus

https://github.com/to-shimo/chinese-word2vec
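The binary files mentioned above can be inspected without extra libraries. The following is a minimal sketch, assuming the file follows the layout written by the original word2vec C tool (an ASCII header line with the vocabulary size and dimension, then each word followed by a space and its float32 values, assumed little-endian). The function name `read_word2vec_binary` and the `limit` parameter are hypothetical helpers for illustration, not part of any of the projects listed here.

```python
import struct


def read_word2vec_binary(path, limit=None):
    """Read vectors from a file in the word2vec C tool's binary layout.

    Assumption: float32 values are little-endian (true for files
    written on x86 machines). Returns a dict mapping word -> tuple.
    """
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <dim>\n" in ASCII.
        vocab_size, dim = (int(x) for x in f.readline().split())
        count = vocab_size if limit is None else min(limit, vocab_size)
        for _ in range(count):
            # Each word is terminated by a single space.
            chars = bytearray()
            while True:
                ch = f.read(1)
                if not ch:
                    raise EOFError("file ended while reading a word")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline left after the previous vector
                    chars.extend(ch)
            word = chars.decode("utf-8", errors="replace")
            raw = f.read(4 * dim)
            if len(raw) != 4 * dim:
                raise EOFError("file ended while reading a vector")
            vectors[word] = struct.unpack("<%df" % dim, raw)
    return vectors
```

Passing a small `limit` (for example 1000) is a cheap way to check that a downloaded file decodes to readable Chinese words before loading all of it.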

Tutorials

Exploring word similarity in the Wikipedia corpus

http://www.52nlp.cn/tag/gensim
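Word similarity with word vectors is usually measured by cosine similarity. The sketch below shows the idea in plain Python on a toy dictionary of vectors. It is not the code from the linked tutorial, and the names `cosine_similarity` and `most_similar` as well as the toy vectors are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=5):
    """Rank the other words in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy 2-dimensional vectors, invented for the example.
toy = {"北京": (1.0, 0.0), "上海": (0.9, 0.1), "苹果": (0.0, 1.0)}
print(most_similar("北京", toy, topn=2))
```

A linear scan like this is fine for a few thousand words. For a full vocabulary, a vectorized library is the more practical choice.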

Clustering keywords with word2vec

http://blog.csdn.net/zhaoxinfan/article/details/11069485

Training Word2Vec Model on English Wikipedia by Gensim

http://textminingonline.com/training-word2vec-model-on-english-wikipedia-by-gensim

Datasets

wiki

https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2
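The dump is a bz2-compressed MediaWiki XML file that is large once uncompressed, so it is usually read as a stream rather than loaded at once. The following is a minimal sketch using only the Python standard library. The function name `iter_wiki_articles` is hypothetical, and the yielded text is raw wiki markup that would still need cleaning and Chinese word segmentation before training.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_wiki_articles(dump_path):
    """Yield (title, text) pairs from a bz2-compressed MediaWiki XML dump."""
    with bz2.open(dump_path, "rb") as f:
        title, text = None, ""
        for _event, elem in ET.iterparse(f, events=("end",)):
            # Tags look like "{namespace}page"; keep only the local name.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title":
                title = elem.text
            elif tag == "text":
                text = elem.text or ""
            elif tag == "page":
                yield title, text
                title, text = None, ""
                elem.clear()  # drop the page's children to limit memory use


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py zhwiki-latest-pages-articles.xml.bz2
    for i, (title, text) in enumerate(iter_wiki_articles(sys.argv[1])):
        print(title, len(text))
        if i >= 4:
            break
```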

sogou

http://www.sogou.com/labs/resource/list_news.php

More machine learning tutorials: http://www.tensorflownews.com/
