max_iters as optional arg in Engine run #1300

Closed
vfdev-5 opened this issue Sep 17, 2020 · 8 comments · Fixed by #1381

Comments

@vfdev-5
Collaborator

vfdev-5 commented Sep 17, 2020

🚀 Feature

The idea is to support the following usage:

trainer.run(data, max_iters=1234)
# here we still have epoch length as len(data) if defined

This can currently be approximated as

max_iters = 1234
trainer.run(data, epoch_length=max_iters)
# here we run 1 epoch of size max_iters

or with something like

max_iters = 1234
epoch_length = ...
max_epochs = max_iters // epoch_length + 1

@trainer.on(Events.ITERATION_COMPLETED(once=max_iters))
def stop():
    trainer.terminate()

trainer.run(data, max_epochs=max_epochs)

The max_iters argument is mutually exclusive with max_epochs.
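
As a rough illustration of that constraint, here is a minimal sketch of how the two arguments could be validated and normalized. The helper name `resolve_run_limits` and its signature are hypothetical and not part of ignite; the sketch only assumes that `max_epochs` can be derived from `max_iters` when `epoch_length` is known.

```python
import math
from typing import Optional, Tuple


def resolve_run_limits(
    max_epochs: Optional[int] = None,
    max_iters: Optional[int] = None,
    epoch_length: Optional[int] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical helper (not ignite API): validate and normalize run limits."""
    # The two limits are mutually exclusive, as proposed in the issue.
    if max_epochs is not None and max_iters is not None:
        raise ValueError("max_epochs and max_iters are mutually exclusive; pass only one")
    if max_iters is not None:
        if max_iters < 1:
            raise ValueError("max_iters must be a positive integer")
        if epoch_length is not None:
            if epoch_length < 1:
                raise ValueError("epoch_length must be a positive integer")
            # Smallest number of epochs that covers max_iters iterations.
            max_epochs = math.ceil(max_iters / epoch_length)
    return max_epochs, max_iters
```

Note that `max_iters // epoch_length + 1` in the workaround above gives one more epoch than the ceiling when `max_iters` is an exact multiple of `epoch_length`; that is harmless there because the handler terminates the run first.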

@sdesrozis
Contributor

@vfdev-5 I really like this! It could help a lot when designing training runs!

@vfdev-5
Collaborator Author

vfdev-5 commented Oct 6, 2020

@sdesrozis depending on your schedule, let's label this issue for Hacktoberfest, and if it is still unsolved afterwards, you can give it a shot.

@sdesrozis
Contributor

Ok !

@thescripted
Contributor

I'd like to spend a few days & try to resolve this issue. Perhaps it's a quick fix but the extra time will help me get familiarized with more of the codebase as well

@vfdev-5
Collaborator Author

vfdev-5 commented Oct 7, 2020

@thescripted sure, I can assign it to you if you wish.

Perhaps it's a quick fix but the extra time will help me get familiarized with more of the codebase as well

I'm not sure it will be a quick fix; there are several things to add. I hope you will have the time and motivation to discuss and code :)
An approximate plan is:

  1. add max_iters to State (but I'm not quite sure about that)
  2. compute max_epochs somehow:
    a) if epoch_length is not None => max_epochs = ceil(max_iters / epoch_length)
    b) otherwise, we have to invent something ...
  3. add additional checks to exit the run once max_iters is reached, alongside conditions such as:
    if self.state.epoch_length is not None and iter_counter == self.state.epoch_length:

    and
    if self.should_terminate:
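
The plan above can be sketched with a toy run loop. This is only an illustration in plain Python and not ignite's Engine implementation: the function `run_with_max_iters`, the dict-based state, and the choice to fall back to `len(data)` when `epoch_length` is not given are all assumptions made for the example (case 2b is left open in the plan).

```python
import math
from typing import Any, Dict, Optional, Sequence


def run_with_max_iters(
    data: Sequence[Any], max_iters: int, epoch_length: Optional[int] = None
) -> Dict[str, int]:
    """Toy stand-in for an engine run loop (NOT ignite's implementation)."""
    if epoch_length is None:
        # Assumption for this sketch only: require sized data instead of solving 2b.
        epoch_length = len(data)
    if max_iters < 1 or epoch_length < 1:
        raise ValueError("max_iters and epoch_length must be positive")

    # Step 2a: derive max_epochs from max_iters.
    max_epochs = math.ceil(max_iters / epoch_length)
    # Step 1: keep max_iters in the state next to the other counters.
    state = {"iteration": 0, "epoch": 0, "max_iters": max_iters,
             "max_epochs": max_epochs, "epoch_length": epoch_length}

    should_terminate = False
    while state["epoch"] < max_epochs and not should_terminate:
        state["epoch"] += 1
        iter_counter = 0
        for _batch in data:
            state["iteration"] += 1
            iter_counter += 1
            # Step 3: extra exit condition once max_iters is reached.
            if state["iteration"] >= max_iters:
                should_terminate = True
                break
            # Existing-style check: the epoch ends after epoch_length iterations.
            if iter_counter == epoch_length:
                break
    return state
```

For example, `run_with_max_iters(list(range(10)), max_iters=25)` stops in the third epoch after exactly 25 iterations.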

@thescripted
Contributor

thescripted commented Oct 8, 2020

Got it, yeah I'd be happy to put in the time and motivation to work through this, and I'll be quick to notify and un-assign myself if I feel incapable.

@sdesrozis
Contributor

Side remark: could this help to solve #1357 and maybe #1371?

@vfdev-5
Collaborator Author

vfdev-5 commented Oct 8, 2020

Side remark: could this help to solve #1357 and maybe #1371?

@sdesrozis I think those issues should be fixed in other PRs

vfdev-5 added the PyDataGlobal (PyData Global 2020 Sprint) label and removed the Hacktoberfest and PyDataGlobal (PyData Global 2020 Sprint) labels on Oct 31, 2020