change BagOfWordsTransformer to CountTransformer #20

pazzo83 · 2022-02-07T17:14:53Z

This renames the transformer for clarity. All three of the transformers we have right now are based on the "bag of words" concept: TF-IDF and BM25 apply additional weighting, but both are derived from the document-term matrix (DTM), which is just a count of each word in each document. The most basic form is therefore the raw DTM itself, which we can call the CountTransformer (the sklearn equivalent is CountVectorizer).
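To illustrate the relationship, here is a minimal sketch in plain Python. It is not MLJText code: the function names `document_term_matrix` and `tfidf_from_dtm` are hypothetical, tokenization is a naive whitespace split, and the TF-IDF variant is a simple unsmoothed one chosen only for illustration. It shows that the raw counts are the base representation and that a weighted scheme is computed from them.

```python
from collections import Counter
import math


def document_term_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term vocab[j] in docs[i]."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tfidf_from_dtm(matrix):
    """Reweight raw counts as count * log(N / df), an unsmoothed TF-IDF variant."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [
        [row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for row in matrix
    ]


docs = ["the cat sat", "the cat ate the fish"]
vocab, dtm = document_term_matrix(docs)   # raw counts: the "count" representation
weighted = tfidf_from_dtm(dtm)            # weighting derived from the same counts
```

The weighting step takes only the count matrix as input, which is the sense in which the count-based transformer is the most basic of the three.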

I think this is technically a breaking change, since we are changing the name of one of the models.

ablaom

@pazzo83 Thanks for your continued support of this package.

This change makes sense. I agree it should be marked as breaking.

Could you please open an issue at MLJModels to update the registry after this is merged?
An issue at MLJ to update the "list of models" in the docs would be good too.

change bagofwords transformer to count transformer

62e04d3

pazzo83 requested review from ablaom and storopoli February 7, 2022 17:14

ablaom approved these changes Feb 7, 2022

pazzo83 merged commit 43d9d6d into dev Feb 7, 2022

This was referenced Feb 8, 2022

For a 0.2.0 release #21

Merged

Issue to trigger new releases #14

Closed

JuliaRegistrator mentioned this pull request Feb 8, 2022

New version: MLJText v0.2.0 JuliaRegistries/General#54155

Merged

This was referenced Feb 8, 2022

Update registry for renamed MLJText model JuliaAI/MLJModels.jl#430

Closed

Update list of models in docs for MLJText model change JuliaAI/MLJ.jl#900

Closed
