pykit

A Python toolkit for mining Chinese text (currently) and other data (planned).

## Visual text mining features (module vistext)

- Extract keywords with TF-IDF and TextRank (based on https://github.com/fxsjy/jieba)
- Extract key phrases, using code adapted from https://github.com/letiantian/TextRank4ZH/tree/master/textrank4zh
- Generate word clouds with WordCloud (https://github.com/amueller/word_cloud)
- Generate word frequencies and a co-occurrence network (adapted from https://github.com/ipython/talks/blob/master/parallel/text_analysis.py)
- Create a word2vec model with gensim
- Generate a dendrogram of keywords from their word vectors
- Cluster keywords with k-means
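
To give a feel for the word frequency and co-occurrence feature, here is a minimal standard-library sketch. It is not the code in this repository: the function names `word_frequencies` and `cooccurrence_edges`, the `min_count` parameter, and the sample documents are all hypothetical, and it assumes the text has already been segmented into tokens (for example by jieba).

```python
from collections import Counter
from itertools import combinations


def word_frequencies(docs):
    """Count how often each token appears across all documents."""
    return Counter(token for doc in docs for token in doc)


def cooccurrence_edges(docs, min_count=1):
    """Count unordered token pairs that appear in the same document.

    Returns a dict mapping (token_a, token_b) to the number of documents
    containing both tokens, keeping only pairs seen at least min_count times.
    """
    pairs = Counter()
    for doc in docs:
        # Sort the unique tokens so (a, b) and (b, a) share one key.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return {pair: n for pair, n in pairs.items() if n >= min_count}


# Hypothetical pre-segmented input, one token list per document.
docs = [
    ["文本", "挖掘", "工具"],
    ["文本", "挖掘", "关键词"],
    ["关键词", "聚类"],
]

freqs = word_frequencies(docs)
edges = cooccurrence_edges(docs, min_count=2)

print(freqs.most_common(3))
print(edges)
```

The resulting edge dictionary can be treated as a weighted edge list for a graph library. Counting co-occurrence per document, rather than within a sliding window, is a simplification chosen here to keep the sketch short.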

## License

All materials in this repository are licensed CC-BY, and I encourage reuse!
