This repository has been archived by the owner on Jun 14, 2018. It is now read-only.

Isolate single language processing library #14

Open
msarahan opened this issue Sep 16, 2015 · 4 comments

Comments

@msarahan
Contributor

  • Gensim
  • NLTK

Reimplement internally

@msarahan added this to the 0.3 Release milestone on Oct 1, 2015
@msarahan
Contributor Author

msarahan commented Nov 6, 2015

Current usages:

Gensim: LDA (it was also used for stopwords, but we don't really need that)
NLTK: collocation tokenizer
TextBlob: entities tokenizer (depends in turn on NLTK)

@gpfreitas

@msarahan, did you guys consider spaCy? If so, any thoughts on it?

spaCy has a big problem though (at least for now): it only supports English text. :/

@msarahan
Contributor Author

No, we didn't. Sorry, other than being aware of it by name, I don't know anything about it.

@gpfreitas

No worries. I am trying to find out more about this library. Very few people seem to know it well. Thanks, Mike.


2 participants