Today's summary covers the last part of the Word Cloud series: building a word cloud with Matlab. Matlab R2018b already provides a wordcloud function that generates word clouds directly.
1) Prepare the text.
Enough said. Being lazy, I will keep using the Word Cloud History.txt text from last time.
2) Read and clean the text data.
    % read the txt file as a string
    text = string(fileread('C:\Users\yuki\Desktop\WordCloudHistory.txt'));
    % delete punctuation
    punctuationCharacters = ["." "?" "!" "," ";" ":"];
    text = replace(text,punctuationCharacters," ");
    % convert the string to an array of words
    words = split(join(text));
    % delete words with fewer than 5 characters, which are probably stop words
    words(strlength(words)<5) = [];
    % change all words to lowercase
    words = lower(words);
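To try the cleaning steps without the file on disk, here is a minimal sketch that runs the same pipeline on an inline string. The sample sentence is made up for illustration and is not taken from Word Cloud History.txt.

```matlab
% Hypothetical sample text, used here instead of reading a file
text = "Word clouds are simple. They show frequent words: bigger means MORE frequent!";

% Replace punctuation with spaces
punctuationCharacters = ["." "?" "!" "," ";" ":"];
text = replace(text, punctuationCharacters, " ");

% Split on whitespace into a column of words
words = split(text);

% Drop words shorter than 5 characters; this also removes any empty
% strings that consecutive spaces may leave behind
words(strlength(words) < 5) = [];

% Normalise case so "MORE" and "more" would count as the same word
words = lower(words);

disp(words')
```

The length filter is a rough stand-in for a real stop-word list: it drops short content words such as "word" along with "are" and "they", so it may be worth tuning the threshold for your own text.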
3) Count the word frequencies and build the arrays.
    % calculate the frequency of every word
    [numOccurrences,uniqueWords] = histcounts(categorical(words));
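To check what the counting step returns, here is a small sketch on a made-up word list. It uses the same histcounts call and then sorts the result so the most frequent words come first. The sample words and the names sortedWords and sortedCounts are only for illustration.

```matlab
% Hypothetical sample words standing in for the cleaned text
words = ["cloud"; "words"; "cloud"; "visual"; "cloud"; "words"];

% Same call as above: one count per category, plus the category names
[numOccurrences, uniqueWords] = histcounts(categorical(words));

% Sort by frequency, most common first
[sortedCounts, order] = sort(numOccurrences, 'descend');
sortedWords = string(uniqueWords(order));

% Show the result as a table for a quick sanity check
disp(table(sortedWords(:), sortedCounts(:), 'VariableNames', {'Word', 'Count'}))
```

Looking at the top of this table before plotting is a quick way to spot leftover noise words that the length filter missed.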
4) Generate the word cloud.
    figure
    % set properties for the word cloud
    wordcloud(uniqueWords,numOccurrences,'Shape',"rectangle",'MaxDisplayWords',200);
    title("Word Cloud History")
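The wordcloud function also accepts a table, and the resulting figure can be written to an image file. The sketch below assumes made-up words and counts, and the output name sample_wordcloud.png is hypothetical. The file is written to the current folder.

```matlab
% Hypothetical sample data standing in for uniqueWords / numOccurrences
sampleWords  = ["cloud"; "words"; "visual"; "weight"];
sampleCounts = [12; 9; 5; 3];
tbl = table(sampleWords, sampleCounts, 'VariableNames', {'Word', 'Count'});

% Create the figure without showing it, then draw the word cloud from the table
fig = figure('Visible', 'off');
wc = wordcloud(tbl, 'Word', 'Count', 'Shape', 'oval', 'MaxDisplayWords', 50);
wc.Title = "Sample Word Cloud";

% Save the figure as a PNG (hypothetical file name)
saveas(fig, 'sample_wordcloud.png');
```

Passing a table keeps each word and its count in one place, which can be handy if you want to filter or sort the rows before plotting.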
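1) Matlab also has an add-on that can generate word clouds directly. It is easy to use and needs no programming, haha.
2) Since I have now covered the various ways to create a word cloud, here is a quick summary of which ones are easy to use, convenient, and free.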
Tool | Ease of Use | Free | Needs Script
---|---|---|---
Python | Clear documentation, powerful text mining library | Yes | Yes
JavaScript | You need to extract the word array yourself and find a way to save the image | Yes | Yes
R | Clear documentation, powerful text mining library | Yes | Yes
Matlab | Clear documentation, interactive interface | No | Optional
download here