Commit
Merge pull request #39 from cardmagic/add-remove_category
Add remove_category functionality to Bayes classifier
cardmagic authored Jul 31, 2024
2 parents a537321 + ea32aea commit 17143ab
Showing 3 changed files with 71 additions and 1 deletion.
2 changes: 1 addition & 1 deletion classifier.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'classifier'
-s.version = '1.4.0'
+s.version = '1.4.1'
s.summary = 'A general classifier module to allow Bayesian and other types of classifications.'
s.description = 'A general classifier module to allow Bayesian and other types of classifications.'
s.author = 'Lucas Carlson'
18 changes: 18 additions & 0 deletions lib/classifier/bayes.rb
@@ -139,5 +139,23 @@ def add_category(category)
end

alias append_category add_category

#
# Allows you to remove categories from the classifier.
# For example:
# b.remove_category "Spam"
#
# WARNING: Removing categories from a trained classifier will
# result in the loss of all training data for that category.
# Make sure you really want to do this before calling this method.
def remove_category(category)
category = category.prepare_category_name
raise StandardError, "No such category: #{category}" unless @categories.key?(category)

@categories.delete(category)
@category_counts.delete(category)
@category_word_count.delete(category)
@total_words -= @category_word_count[category].to_i

dpetruha Jul 31, 2024

@cardmagic, shouldn't we subtract the word count before we call @category_word_count.delete(category)?
That is, at this point (line 158) @category_word_count no longer includes category, because it was dropped at line 157, so the subtraction has no effect.

cardmagic Jul 31, 2024

Author Owner

@dpetruha thank you for the good eye on that, fixed in release 1.4.2 here: #40

end
end
end
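
The ordering issue raised in the review comment above can be sketched in isolation. This is a minimal illustration, not the gem's code and not necessarily the fix that shipped in 1.4.2: `CategoryStore` and `add_words` are hypothetical stand-ins, and only the instance variable names are taken from the diff. The point is that the per-category count is read and subtracted while it is still in the hash, and the entry is deleted afterwards.

```ruby
# Hypothetical stand-in for the classifier's bookkeeping (not the gem's class).
class CategoryStore
  attr_reader :total_words

  def initialize
    @categories = {}
    @category_counts = Hash.new(0)
    @category_word_count = Hash.new(0)
    @total_words = 0
  end

  # Hypothetical helper that records `count` words for a category.
  def add_words(category, count)
    @categories[category] ||= {}
    @category_word_count[category] += count
    @total_words += count
  end

  def remove_category(category)
    raise StandardError, "No such category: #{category}" unless @categories.key?(category)

    # Subtract first, while the per-category count is still present.
    @total_words -= @category_word_count[category].to_i

    # Only then drop the category's entries.
    @categories.delete(category)
    @category_counts.delete(category)
    @category_word_count.delete(category)
  end
end

store = CategoryStore.new
store.add_words(:Spam, 3)
store.add_words(:Ham, 2)
store.remove_category(:Spam)
puts store.total_words
```

With the delete placed before the subtraction, as in the diff above, the lookup would find nothing for the removed category and `@total_words` would stay at its old value.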
52 changes: 52 additions & 0 deletions test/bayes/bayesian_test.rb
@@ -42,4 +42,56 @@ def test_safari_animals
assert_equal 'Lion', bayes.classify('lion')
assert_equal 'Elephant', bayes.classify('elephant')
end

def test_remove_category
@classifier.train_interesting 'This is interesting content'
@classifier.train_uninteresting 'This is uninteresting content'

assert_equal %w[Interesting Uninteresting].sort, @classifier.categories.sort

@classifier.remove_category 'Uninteresting'

assert_equal ['Interesting'], @classifier.categories
end

def test_remove_nonexistent_category
assert_raises(StandardError) do
@classifier.remove_category 'NonexistentCategory'
end
end

def test_remove_category_affects_classification
@classifier.train_interesting 'This is interesting content'
@classifier.train_uninteresting 'This is uninteresting content'

assert_equal 'Uninteresting', @classifier.classify('This is uninteresting')

@classifier.remove_category 'Uninteresting'

assert_equal 'Interesting', @classifier.classify('This is uninteresting')
end

def test_remove_all_categories
@classifier.remove_category 'Interesting'
@classifier.remove_category 'Uninteresting'

assert_empty @classifier.categories
end

def test_remove_and_add_category
@classifier.remove_category 'Uninteresting'
@classifier.add_category 'Neutral'

assert_equal %w[Interesting Neutral].sort, @classifier.categories.sort
end

def test_remove_category_preserves_other_category_data
@classifier.train_interesting 'This is interesting content'
@classifier.train_uninteresting 'This is uninteresting content'

interesting_classification = @classifier.classify('This is interesting')
@classifier.remove_category 'Uninteresting'

assert_equal interesting_classification, @classifier.classify('This is interesting')
end
end
