Add a lookup entity extractor #5957

Closed
tabergma opened this issue Jun 4, 2020 · 4 comments · Fixed by #6214
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@tabergma
Contributor

tabergma commented Jun 4, 2020

Description of Problem:
Lookup tables are used to create additional features for our entity extractors. In order for the features (aka "the word is part of lookup table xyz, so it is likely that it has the entity xyz") to work, you need to have some examples in your training data with entries from the lookup table.

However, many users think that all entries in the lookup table are extracted as an entity right away, i.e. that as soon as an entry from the lookup table is detected, it is extracted as an entity. Many users on the forum are struggling with this.
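To illustrate the distinction described above, here is a simplified sketch of what "lookup table as a feature" means. It is not how Rasa computes these features internally; the function name `lookup_features` and the sample data are hypothetical. The output is only a per-token flag, which a trained model may or may not turn into an entity.

```python
from typing import Dict, List


def lookup_features(
    tokens: List[str], lookup_tables: Dict[str, List[str]]
) -> List[List[int]]:
    """Return one 0/1 flag per token and lookup table (hypothetical helper).

    A flag of 1 only says "this token appears in table xyz". It is an input
    to the entity extractor, not an extracted entity.
    """
    table_names = sorted(lookup_tables)
    entry_sets = {
        name: {entry.lower() for entry in lookup_tables[name]}
        for name in table_names
    }
    return [
        [1 if token.lower() in entry_sets[name] else 0 for name in table_names]
        for token in tokens
    ]


features = lookup_features(
    ["book", "a", "flight", "to", "berlin"],
    {"city": ["Berlin", "Paris"]},
)
print(features)
```

Without annotated training examples that contain lookup entries, the model has nothing to learn from such a flag, which is why the entries are not extracted automatically.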

Overview of the Solution:

  • Create a new component LookupEntityExtractor that takes a lookup table. It will extract exact matches of entries in the lookup table and mark them as a certain entity. (What about plural/singular/etc. of words? Should we also consider those?)
  • We should update the docs to clarify the difference between lookup tables and lookup entity extractor.
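The proposed behaviour could be sketched roughly as follows. This is a standalone illustration under my own assumptions, not the component's actual implementation: the function name `extract_lookup_entities`, the result keys, and the case-insensitive whole-word matching are all hypothetical choices.

```python
import re
from typing import Dict, List


def extract_lookup_entities(
    text: str, lookup_tables: Dict[str, List[str]]
) -> List[dict]:
    """Mark every exact, whole-word match of a lookup entry as an entity."""
    entities = []
    for entity_name, entries in lookup_tables.items():
        for entry in entries:
            # \b keeps "pear" from matching inside "pears" or "pearl".
            pattern = r"\b" + re.escape(entry) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                entities.append(
                    {
                        "entity": entity_name,
                        "value": match.group(0),
                        "start": match.start(),
                        "end": match.end(),
                    }
                )
    return sorted(entities, key=lambda e: e["start"])


found = extract_lookup_entities(
    "I want an apple and two pears", {"fruit": ["apple", "pear"]}
)
print(found)
```

In this sketch "apple" is extracted but "pears" is not, which is exactly the plural/singular question raised above: exact matching would need either extra lookup entries or some normalisation step to cover such variants.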
@tabergma tabergma added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Jun 4, 2020
@erohmensing
Contributor

@tabergma this is an interesting idea -- I definitely think users are confused about this. Do you think a RegexEntityExtractor would similarly be helpful? I think some users have implemented it themselves already.

@tabergma
Contributor Author

tabergma commented Jun 8, 2020

@erohmensing Good point 👍 RegexFeaturizer could cause a similar issue, so it might make sense to also add an extractor for it.
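For comparison, a regex-based extractor along the lines suggested here might look like the sketch below. It is only an assumption about what such a component could do; `extract_regex_entities`, the `zipcode` pattern, and the result keys are hypothetical and not taken from Rasa.

```python
import re
from typing import Dict, List


def extract_regex_entities(text: str, patterns: Dict[str, str]) -> List[dict]:
    """Mark every match of a named regex pattern as an entity."""
    entities = []
    for entity_name, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            entities.append(
                {
                    "entity": entity_name,
                    "value": match.group(0),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return entities


zip_entities = extract_regex_entities(
    "Ship it to 10115 please", {"zipcode": r"\b\d{5}\b"}
)
print(zip_entities)
```

As with the lookup case, the difference from a featurizer is that every match is returned as an entity directly, with no training examples required.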

@koaning
Contributor

koaning commented Jul 6, 2020

@JiteshGaikwad in case you're interested, we'd be interested in a PR relating to your work here.

@ghost ghost mentioned this issue Jul 10, 2020
4 tasks
@ghost

ghost commented Jul 10, 2020

Hey @koaning @tabergma @erohmensing, I have submitted the PR.

@tabergma tabergma mentioned this issue Jul 15, 2020
4 tasks
@tabergma tabergma self-assigned this Jul 21, 2020