Avoid retraining Core model if only templates section is changed #4251
@msamogh what is missing to move this PR forward?
tests/core/conftest.py:239: in trained_model
return train_model(project)
tests/core/conftest.py:232: in train_model
rasa.train(domain, config, training_files, output)
rasa/train.py:40: in train
kwargs=kwargs,
uvloop/loop.pyx:1417: in uvloop.loop.Loop.run_until_complete
???
rasa/train.py:87: in train_async
kwargs,
rasa/train.py:170: in _train_async_internal
kwargs=kwargs,
rasa/train.py:216: in _do_training
kwargs=kwargs,
rasa/train.py:346: in _train_core_with_validated_data
kwargs=kwargs,
rasa/core/train.py:71: in train
agent.persist(output_path, dump_stories, replace_templates_only)
rasa/core/agent.py:799: in persist
self.domain.persist(os.path.join(model_path, DEFAULT_DOMAIN_PATH))
rasa/core/domain.py:638: in persist
utils.dump_obj_as_yaml_to_file(filename, domain_data)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
filename = '/var/folders/87/pnn5hwg11cb8qkgp_q4dm8280000gp/T/tmpl_ucro1v/core/domain.yml'
obj = {'actions': ['utter_cheer_up', 'utter_did_that_help', 'utter_goodbye', 'utter_greet', 'utter_happy'], 'config': {'store_entities_as_slots': True}, 'entities': [], 'forms': [], ...}
def dump_obj_as_yaml_to_file(filename: Union[Text, Path], obj: Dict) -> None:
"""Writes data (python dict) to the filename in yaml repr."""
> with open(str(filename), "w", encoding="utf-8") as output:
E FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/87/pnn5hwg11cb8qkgp_q4dm8280000gp/T/tmpl_ucro1v/core/domain.yml'
rasa/core/utils.py:227: FileNotFoundError
I suspect that it could be because a temp directory somewhere is being cleaned up prematurely.
I think I found the cause: I had switched the order of the persist calls.
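For readers following the traceback, here is a minimal sketch of this failure mode. It assumes that the `core` directory is normally created by an earlier persist step, which the thread does not spell out. The helper names `dump_text_to_file`, `persist_domain`, and `write_fails_without_directory` are hypothetical and are not part of the Rasa code base.

```python
import os
import tempfile


def dump_text_to_file(filename: str, text: str) -> None:
    """Write text to filename; the parent directory must already exist."""
    with open(filename, "w", encoding="utf-8") as output:
        output.write(text)


def persist_domain(model_path: str, domain_text: str) -> str:
    """Hypothetical helper: create the target directory, then write the domain."""
    core_dir = os.path.join(model_path, "core")
    os.makedirs(core_dir, exist_ok=True)  # no error if the directory exists
    domain_path = os.path.join(core_dir, "domain.yml")
    dump_text_to_file(domain_path, domain_text)
    return domain_path


def write_fails_without_directory(model_path: str) -> bool:
    """Return True if writing into a missing "core" directory raises."""
    try:
        dump_text_to_file(os.path.join(model_path, "core", "domain.yml"), "x")
    except FileNotFoundError:
        return True
    return False
```

Creating the directory inside the write helper makes the result independent of the order in which the persist steps run. Whether that is preferable to fixing the call order is a design choice the thread leaves open.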
Travis job link - https://travis-ci.com/RasaHQ/rasa/builds/125875507
@tmbo Looks good to go.
Great work ⭐ As far as I can see, this can go into a patch release. Will release it as part of 1.3.1.
I really like the idea, but I'd like to avoid adding the template flag to the retrain method. I'd rather have the method that checks the fingerprints call the domain's persist directly.
rasa/model.py
Outdated
target_path = os.path.join(train_path, "core")
retrain_core = not merge_model(old_core, target_path)

if not nlu_fingerprint_changed(last_fingerprint, new_fingerprint):
else:
I don't think the logic is correct here, as this else will be run if core needs to be retrained.
I'm not sure I understood your concern here correctly. Perhaps you can mention a specific combination of values for retrain_core and retrain_nlg that you think this will fail for?
I think Tom means: why do we need to merge core in case retrain_core=True?
In general I don't like that should_retrain already does some modifications. I think that's unexpected behavior when I call a function should_xy_be_done. And I also think that's part of the reason why this seems so complicated (to me 😄).
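One way to read this concern is that the decision should be a pure comparison, with any directory merging done by the caller afterwards. The sketch below illustrates that separation under stated assumptions: `RetrainDecision`, `decide_retrain`, and the fingerprint keys `"core"`, `"nlu"`, and `"nlg"` are hypothetical names chosen for illustration, not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RetrainDecision:
    """Hypothetical result type: one flag per part of the model."""

    core: bool
    nlu: bool
    nlg: bool


def decide_retrain(old: Dict[str, str], new: Dict[str, str]) -> RetrainDecision:
    """Compare fingerprints only; this function never touches the file system."""
    return RetrainDecision(
        core=old.get("core") != new.get("core"),
        nlu=old.get("nlu") != new.get("nlu"),
        nlg=old.get("nlg") != new.get("nlg"),
    )
```

Because the function has no side effects, it can be called repeatedly and tested without temporary directories.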
The docstring also indicates a totally different behavior of the function
So the objectives are as follows:
- should_retrain.nlg should return a value that reflects only whether the NLG section has changed, regardless of the value of should_retrain.core. This is to ensure the returned values make logical sense.
- Given that its value should be independent of should_retrain.core, we also want to check whether it's possible to merge the directories, to handle the case where Core needn't be retrained but merging the directories fails.
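The first objective can be sketched by fingerprinting the templates section separately from the rest of the domain. This is only an illustration: the function names `domain_fingerprints` and `only_templates_changed`, the dictionary layout, and the choice of SHA-256 over sorted JSON are assumptions, not what the PR implements.

```python
import hashlib
import json
from typing import Any, Dict


def _stable_hash(data: Any) -> str:
    """Hash JSON-serializable data; keys are sorted so dict order is irrelevant."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def domain_fingerprints(domain: Dict[str, Any]) -> Dict[str, str]:
    """Return separate fingerprints for the templates and for everything else."""
    templates = domain.get("templates", {})
    rest = {key: value for key, value in domain.items() if key != "templates"}
    return {"domain": _stable_hash(rest), "nlg": _stable_hash(templates)}


def only_templates_changed(old: Dict[str, str], new: Dict[str, str]) -> bool:
    """True when the templates differ but the rest of the domain is identical."""
    return old["domain"] == new["domain"] and old["nlg"] != new["nlg"]
```

With two fingerprints, the NLG flag depends only on the templates section, so it stays meaningful whatever the Core flag turns out to be.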
But have updated the condition now to make it cleaner and less confusing.
2. Short-circuit flag before reaching the actual train method
The functionality seems great. However, I think the check for the NLG section is the cherry on top of some technical debt we have here, and we should untangle some of these things first to make the code easier to understand and maintain.
Co-Authored-By: Tobias Wochinger <t.wochinger@rasa.com>
@tabergma I took this over from Amogh. I actually think this should not go into |
Looks good 👍 Added a few minor comments.
Yes, let's merge into master.
rasa/model.py
Outdated
retrain.core = core_merge_failed

if not retrain.nlg:
    retrain.nlg = core_merge_failed
Can you add a comment here why we overwrite retrain.nlg with core_merge_failed?
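The thread does not record the answer, so the following is only one plausible rationale, written as a sketch: if the old Core model cannot be reused, nothing from the previous model sits in the target directory, so the templates have to be written again as well. `Retrain`, `try_copy_old_core`, and `reuse_or_retrain` are hypothetical names, and `shutil.copytree` stands in for whatever merge step the real code uses.

```python
import os
import shutil
import tempfile
from dataclasses import dataclass


@dataclass
class Retrain:
    """Hypothetical flags: which parts still need to be produced."""

    core: bool
    nlg: bool


def try_copy_old_core(old_core: str, target: str) -> bool:
    """Copy the old Core directory to target; return False if it is missing."""
    if not os.path.isdir(old_core):
        return False
    shutil.copytree(old_core, target)  # target must not exist yet
    return True


def reuse_or_retrain(retrain: Retrain, old_core: str, target: str) -> Retrain:
    """Try to reuse the old Core model when Core itself is unchanged."""
    if retrain.core:
        return retrain  # Core will be retrained anyway, nothing to reuse
    core_merge_failed = not try_copy_old_core(old_core, target)
    # Assumption: a failed merge leaves no old domain in the target, so both
    # Core and the templates must be regenerated even if neither changed.
    return Retrain(core=core_merge_failed, nlg=retrain.nlg or core_merge_failed)
```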
also changed the code to make it a bit easier to read
🚀 good to go!
Co-Authored-By: Tanja <tabergma@gmail.com>
…anular-fingerprint