Optimizations to tfidf backend training #335

Merged on Oct 4, 2019 (10 commits)

Commits on Sep 26, 2019

  1. a4687d4

Commits on Oct 3, 2019

  1. c01c88d
  2. Pre-transform document corpus to subject corpus before vectorizing; for reasons I don't quite understand, this brings a small performance boost
    osma committed Oct 3, 2019
    466dd5f
  3. ea6dbeb

Commits on Oct 4, 2019

  1. be2c44c
  2. Move subject vectorizer handling inside tfidf backend, as no other backend needs it and it is unlikely other backends will need it in the future
    osma committed Oct 4, 2019
    65033fa
  3. Use TfidfVectorizer.fit_transform as it is more efficient than separate fit and transform steps
    osma committed Oct 4, 2019
    b81f0c7
  4. 6f49b21
  5. b6dbbcd
  6. Split up TFIDFBackend.train
    osma committed Oct 4, 2019
    0c6ee3b
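
For readers unfamiliar with the "document corpus to subject corpus" step mentioned in commit 466dd5f, here is a minimal sketch of what such a pre-transformation might look like. It is not the code from this pull request. It assumes that a subject corpus means one combined text per subject, built from every document tagged with that subject. The function name `documents_to_subject_corpus` and the `(text, subject_ids)` input shape are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def documents_to_subject_corpus(documents):
    """Group document texts by subject (hypothetical helper, not the PR's code).

    `documents` is an iterable of (text, subject_ids) pairs. The result maps
    each subject id to a single string made of the texts of all documents
    tagged with that subject, joined by spaces in input order.
    """
    texts_by_subject = defaultdict(list)
    for text, subject_ids in documents:
        for subject_id in subject_ids:
            texts_by_subject[subject_id].append(text)
    return {sid: " ".join(texts) for sid, texts in texts_by_subject.items()}


# Tiny made-up corpus: each document carries one or more subject labels.
documents = [
    ("cats and dogs", ["animals", "pets"]),
    ("wild wolves", ["animals"]),
    ("dog training", ["pets"]),
]
subject_corpus = documents_to_subject_corpus(documents)
```

Under these assumptions, the values of `subject_corpus` could then be passed in a single call to scikit-learn's `TfidfVectorizer.fit_transform`, which is the combined fit-and-transform step that commit b81f0c7 refers to.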