Correctly calibrate confidence values for non-ML policies #7969
Comments
dakshvar22 commented: From discussions we decided to do the following:

- We create a distinction between policies where confidence calculation is part of the algorithm (like …) and rule-based policies where it is not.
- In addition, we move the logic of fallback action prediction outside of ….

With this refactor in place we plan to follow these steps to pick the best action prediction: …

This removes the need for ranking policy predictions from rule-based policies based on confidences. The only exception to the steps above is when …
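The steps themselves are elided above, but the stated goal (stop ranking rule-based predictions by confidence) can be illustrated with a rough sketch. The `PolicyPrediction` class and field names below are hypothetical stand-ins for illustration, not Rasa's actual API:

```python
# Hypothetical illustration of the proposed selection logic; names are not Rasa's actual API.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PolicyPrediction:
    policy_name: str
    action_name: str
    confidence: float
    is_rule_based: bool = False  # would be set by deterministic/rule-based policies


def pick_best_prediction(predictions: List[PolicyPrediction]) -> Optional[PolicyPrediction]:
    # If any rule-based policy made a prediction, prefer it outright, so its
    # confidence never has to be ranked against ML confidences.
    rule_predictions = [p for p in predictions if p.is_rule_based]
    if rule_predictions:
        return rule_predictions[0]
    # Otherwise fall back to the ML policy with the highest confidence.
    ml_predictions = [p for p in predictions if not p.is_rule_based]
    return max(ml_predictions, key=lambda p: p.confidence, default=None)
```

Under a scheme like this, a rule-based confidence only needs to signal match vs. no match, which is what the rest of the thread discusses.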
dakshvar22 commented: This is no longer a bug for releases post ….
alopez commented: @dakshvar22 is there a definition of done for this? It seems like it would really benefit from having policy evaluation in place first.
dakshvar22 commented: What kind of policy evaluation do you mean? Anything other than what …?
dakshvar22 commented: The issue should be considered done when this is the state of how an action is picked in the policy ensemble.
alopez commented: I mean an evaluation which gives more insight into how policies in the ensemble interact, one of the directions for measuring success.
dakshvar22 commented: I don't have a hard opinion, but I don't see these as related. This issue's purpose was to avoid the usage of "confidences" for rule-based policies in our code, so that the code reflects our understanding of how the ensemble works "currently". Once that is done, trying different confidence measures (like unbounded dot product similarities) will become possible.
dakshvar22 commented: I just mean that it's not necessary for the two to be linked. The upcoming model regression tests for Core are a good enough evaluation for it.
Maxime Verger commented: 💡 Heads up! We're moving issues to Jira: https://rasa-open-source.atlassian.net/browse/OSS. From now on, this Jira board is the place where you can browse (without an account) and create issues (you'll need a free Jira account for that). This GitHub issue has already been migrated to Jira and will be closed on January 9th, 2023. Do not forget to subscribe to the corresponding Jira issue! ➡️ More information in the forum: https://forum.rasa.com/t/migration-of-rasa-oss-issues-to-jira/56569.
#7616 introduced a bug with `model_confidence` set to `cosine` and `inner` in `TEDPolicy`. Since those confidences are not guaranteed to stay in the range [0, 1], while other non-ML policies like `FormPolicy`, `MappingPolicy`, etc. still predict a confidence value in the range [0, 1], this messes up the logic of picking the best prediction from different policies. We see two solutions:

1. Changing the confidence values of non-ML policies to be either +infinity or -infinity for a match vs. no match.
2. Changing the confidence values of non-ML policies based on the `model_confidence` set in `TEDPolicy`, i.e. [0, 1] for `softmax`, [-1, 1] for `cosine`, and [-infinity, infinity] for `inner`.

Technically, option (1) seems better because it keeps things simple and logical (confidences should not have a finite value for deterministic policies), but then users may see potentially confusing debug messages like:

`DEBUG rasa.core.processor - Predicted next action 'action_listen' with confidence +infinity.`

Could `+infinity` be replaced with something more logical?
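A minimal sketch of why mixed confidence scales break the ensemble's pick, and how option (1) would change it. The numbers and the `pick_best` helper are made up for illustration; this is not Rasa's actual ensemble code:

```python
# Minimal sketch of the ensemble problem described above (not Rasa's actual code).
# Each policy reports (policy_name, action, confidence); the ensemble keeps the argmax.

def pick_best(predictions):
    """Return the (policy, action, confidence) triple with the highest confidence."""
    return max(predictions, key=lambda p: p[2])

# With model_confidence="inner", TEDPolicy similarities are unbounded dot products,
# so they can dwarf the [0, 1] confidences of deterministic policies.
predictions = [
    ("MappingPolicy", "action_restart", 1.0),  # deterministic match, capped at 1.0
    ("TEDPolicy", "utter_greet", 7.3),         # unbounded inner-product score (made up)
]
print(pick_best(predictions))  # TEDPolicy wins even though MappingPolicy had a hard match

# Option (1): give deterministic policies an infinite confidence for a match,
# so they always outrank similarity-based scores.
predictions = [
    ("MappingPolicy", "action_restart", float("inf")),
    ("TEDPolicy", "utter_greet", 7.3),
]
print(pick_best(predictions))  # MappingPolicy wins, but logs would show "+infinity"
```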