-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use response selector keys for confusion matrix labelling #7423
Use response selector keys for confusion matrix labelling #7423
Conversation
Thanks for submitting a pull request 🚀 @tttthomasssss will take a look at it as soon as possible ✨ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this on 1.10.x
branch 👍 I suggested a few changes and based on them there are a few more places where you will have to make changes. For example, line 412-414(return actual and predicted labels instead of actual and predicted texts), line 1488(use actual and predicted labels instead of actual and predicted texts to compute metrics), line 241(not filter response examples based on empty texts).
All of these changes are actually implemented on master
already if you would like to view them for reference.
rasa/nlu/test.py
Outdated
if isinstance(response_prediction_full_intent, str): | ||
response_prediction_full_intent = response_prediction_full_intent.split( | ||
"/" | ||
)[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest not splitting the predicted retrieval_intent/sub_intent
because you could have same sub_intent
under multiple retrieval intents. So, let's compare for e.g. faq/ask_name
to faq/ask_weather
and not ask_name
to ask_weather
.
rasa/nlu/test.py
Outdated
response_target = example.get("response", "") | ||
response_key = example.get(RESPONSE_KEY_ATTRIBUTE, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following from the above comment, this would then change to example.get_combined_intent_response_key()
rasa/nlu/test.py
Outdated
@@ -62,7 +63,7 @@ | |||
|
|||
ResponseSelectionEvaluationResult = namedtuple( | |||
"ResponseSelectionEvaluationResult", | |||
"intent_target " "response_target " "response_prediction " "message " "confidence", | |||
"intent_target response_key response_target response_prediction_full_intent response_prediction message confidence", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like with this change we can also get rid of intent_target
, response_target
and response_prediction
. Can you scrub those off as well? There will be some more places where you will have to make changes.
@dakshvar22 Thanks for the feedback. I applied your suggestions. Please see commit 021bcb1 |
@dakshvar22 Is anything else required for this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this and addressing all my comments. ✨
@dakshvar22 To my dismay I discovered that my refactoring reintroduced a bug in my original PR. Unfortunately, it was captured by any of the unit the tests. I came across it when running a new response selector evaluation today. Please see this commit for a fix: hotzenklotz@fd037be In short, the default value for a response selector key was wrong. It needs to default to cc @tmbo |
Proposed changes:
response_selection_report.json
uses the same labels leading to better comprehension.Examples
Before
After
Status (please check what you already did):
black
(please check Readme for instructions)