Filter out pre-trained entities before CRF entity training #898
rasa_nlu/extractors/__init__.py
Outdated
extractor = ent.get("extractor")
if not extractor or extractor == self.name:
    entities.append(ent)
message.set("entities", entities)
I don't think this will work. It removes all entities from the message that do not belong to the CRF extractor, which is fine in itself. The issue is that it modifies the message object directly, so the entities stay removed for every component that runs after the CRF component. If any later component needs those entity annotations, they will be missing.
good point 👍
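The concern in this thread can be illustrated with a minimal sketch. The `Message` class below is a hypothetical stand-in, not the rasa_nlu class, and the helper names `filter_in_place` and `filter_copy` as well as the extractor names in the sample data are assumptions made for illustration only. It contrasts overwriting the entities on the shared message with building a filtered copy, which is one possible way to address the review comment, not necessarily the one the PR ended up using.

```python
class Message:
    """Hypothetical stand-in for a training example (not the rasa_nlu class)."""

    def __init__(self, text, data=None):
        self.text = text
        self.data = data if data is not None else {}

    def get(self, prop, default=None):
        return self.data.get(prop, default)

    def set(self, prop, value):
        self.data[prop] = value


def _trainable(entities, name):
    # Keep entities with no extractor set, or with this extractor's name.
    return [e for e in entities
            if not e.get("extractor") or e.get("extractor") == name]


def filter_in_place(message, name):
    """Approach under review: overwrites the entities on the shared message."""
    message.set("entities", _trainable(message.get("entities", []), name))
    return message


def filter_copy(message, name):
    """Alternative: returns a new message and leaves the original untouched."""
    data = dict(message.data)  # shallow copy of the property dict
    data["entities"] = _trainable(message.get("entities", []), name)
    return Message(message.text, data)


def make_example():
    # Illustrative data; the extractor names are assumptions.
    return Message("fly to Berlin tomorrow", {"entities": [
        {"entity": "city", "value": "Berlin"},
        {"entity": "time", "value": "tomorrow", "extractor": "ner_duckling"},
    ]})


original = make_example()
filtered = filter_copy(original, "ner_crf")

shared = make_example()
filter_in_place(shared, "ner_crf")
```

With the copying variant, `original` still carries both entities for any component that runs later, while `filtered` carries only the one without a foreign extractor. With the in-place variant, `shared` has lost the second entity for everything downstream.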
rasa_nlu/extractors/__init__.py
Outdated
@@ -25,3 +26,20 @@ def add_processor_name(self, entity):
        else:
            entity["processors"] = [self.name]
        return entity

    def filter_trainable_entities(self, entity_examples):
please add a comment
Nice work in implementing that filtering, thanks Rick 👍
We should reuse an existing context logger when running in a test context. This allows a test to set up and act with a null logger and then assert on log messages. Co-authored-by: Markus Wolf <markus.wolf@new-work.se>
Proposed changes:
- Extractor objects have the option to filter out pretrained entity examples
- CRFEntityExtractor
Status (please check what you already did):