WIP: Fix non-tensor scalar result aggregation #3540

Closed
wants to merge 6 commits into from

s-rog
Contributor

@s-rog s-rog commented Sep 18, 2020

What does this PR do?

Fixes #3276
Resolves #2143

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

Todo

ddp sync for converted tensors?
convert in log or __set_meta?
write test(s)
fix tests
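
To illustrate the "convert in log" option from the todo, here is a minimal sketch, not the code in this PR. `to_tensor` and `StepLog` are hypothetical names made up for the example; `StepLog` is a toy stand-in for a per-step result store and is not Lightning's `Result` class. The idea is that plain Python numbers are cast to 0-dim tensors as soon as they are logged, so that epoch-end aggregation with `torch.stack`, which expects tensors, sees a uniform list.

```python
import numbers

import torch


def to_tensor(value, device=None):
    """Hypothetical helper: cast plain Python numbers to 0-dim float tensors."""
    if torch.is_tensor(value):
        return value
    if isinstance(value, numbers.Number):
        return torch.tensor(value, dtype=torch.float32, device=device)
    # Anything else (strings, dicts, ...) is left untouched in this sketch.
    return value


class StepLog:
    """Toy per-step store; an assumption for illustration only."""

    def __init__(self):
        self.values = []

    def log(self, value):
        # Convert at log time so every stored entry is already a tensor.
        self.values.append(to_tensor(value))

    def epoch_mean(self):
        # torch.stack needs tensors, which log() now guarantees for numbers.
        return torch.stack(self.values).mean()


log = StepLog()
log.log(torch.tensor(1.0))  # tensor metric
log.log(2.0)                # Python float
log.log(3)                  # Python int
epoch_mean = log.epoch_mean()
```

Converting in `log` rather than later keeps the aggregation code free of type checks, at the cost of fixing the dtype (here `float32`, an arbitrary choice) at the moment of logging.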

@s-rog
Contributor Author

s-rog commented Sep 18, 2020

@justusschock any thoughts on the first two entries in my todo above?

@justusschock
Member

@s-rog I'd say convert ASAP and maybe the ddp_sync should come automatically with that
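
One way to read "the ddp_sync should come automatically": once a logged value is a tensor, it can go through the same reduction path as any other tensor metric. The sketch below is an assumption about what such a sync could look like, not Lightning's implementation. `sync_mean` is a hypothetical helper; it averages over processes with `torch.distributed.all_reduce` when a process group is initialized and otherwise returns the value unchanged.

```python
import torch
import torch.distributed as dist


def sync_mean(value: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: average a tensor across DDP processes if possible."""
    if not (dist.is_available() and dist.is_initialized()):
        # Single-process run: nothing to sync.
        return value
    synced = value.clone()  # all_reduce works in place, so copy first
    dist.all_reduce(synced, op=dist.ReduceOp.SUM)
    return synced / dist.get_world_size()


# Without an initialized process group this is a no-op.
loss = sync_mean(torch.tensor(0.25))
```

A plain Python float cannot be passed to `all_reduce`, which is why the conversion has to happen before any sync step.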

@s-rog
Contributor Author

s-rog commented Sep 18, 2020

@justusschock gotcha, I just put it there for now to maintain current (no) syncing behavior, update to come

@codecov

codecov bot commented Sep 18, 2020

Codecov Report

Merging #3540 into master will decrease coverage by 0%.
The diff coverage is 67%.

@@          Coverage Diff           @@
##           master   #3540   +/-   ##
======================================
- Coverage      91%     91%   -0%     
======================================
  Files         109     109           
  Lines        8043    8045    +2     
======================================
+ Hits         7303    7304    +1     
- Misses        740     741    +1     

@Borda Borda added the bug Something isn't working label Sep 18, 2020
@s-rog s-rog marked this pull request as ready for review September 21, 2020 00:44
@mergify mergify bot requested a review from a team September 21, 2020 00:45
@s-rog s-rog changed the title Draft: Fix non-tensor scalar result aggregation WIP: Fix non-tensor scalar result aggregation Sep 21, 2020
@s-rog
Contributor Author

s-rog commented Sep 21, 2020

I'm trying to figure out why the code path for results differs between tests and examples: the changes fix the issue correctly in examples but seem to have no effect in tests (no casting occurs).

Edit:
I'm occupied by a project atm, will continue working on this as soon as I can.

@edenlightning
Contributor

@williamFalcon any advice?

@Borda Borda self-assigned this Oct 5, 2020
@Borda
Member

Borda commented Oct 5, 2020

fixed in #3855

@Borda Borda closed this Oct 5, 2020
@Borda Borda added this to the 0.10.0 milestone Oct 7, 2020
Successfully merging this pull request may close these issues.

Logging non-tensor scalar with result breaks subsequent epoch aggregation
Fix checkpoint warning for floats
4 participants