
Update documentation for training to be accurate #425

Closed
TheKanter opened this issue Nov 1, 2020 · 6 comments
Labels
general applies to all benchmarks

Comments

@TheKanter
Contributor

TheKanter commented Nov 1, 2020

https://github.com/mlperf/training/blob/master/README.md seems stale.

Should we list the current training benchmarks/datasets/accuracy?

@johntran-nv johntran-nv added the general applies to all benchmarks label Nov 8, 2022
@johntran-nv
Contributor

I agree that this README is not useful at the moment. I think most of the useful information is in the training_policies repo, which raises the question: why does that need to be a separate repo? At a minimum, we should point people to the training rules and the contributing guidelines. But maybe we could consider merging the repos as well? Does anyone have history/context on why they need to be separate?

@TheKanter
Contributor Author

@petermattson originally set it up so that:

  1. Training rules
  2. Submission rules
  3. Training code

were all separate.

The submission rules are used by other benchmarks (e.g., inference, HPC).

The training rules are used by other benchmarks (e.g., HPC).

So we have an inheritance scheme that makes things somewhat convoluted and hard to understand.

Additionally, it is difficult in GitHub to enforce cross-repo checks (e.g., if we wanted a checker that would ensure training code and rules are consistent).
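For context on why this is hard: GitHub CI checks are scoped to a single repository, so a consistency checker would have to clone both repos itself. Below is a minimal sketch of what such a checker could look like, assuming benchmark implementations live in top-level directories of mlperf/training and that the rules live in training_policies/training_rules.adoc; both the layout and the filename are illustrative assumptions, not a description of the actual repos.

```python
# Hypothetical cross-repo consistency check: verify that every benchmark
# directory in mlperf/training is at least mentioned in the training rules
# document in mlperf/training_policies. Repo layout and the rules filename
# are assumptions for illustration.
import pathlib
import subprocess
import sys


def clone(repo_url: str, dest: str) -> pathlib.Path:
    """Shallow-clone a repo so the check can run quickly in CI."""
    subprocess.run(["git", "clone", "--depth", "1", repo_url, dest], check=True)
    return pathlib.Path(dest)


def main() -> int:
    training = clone("https://github.com/mlperf/training.git", "training")
    policies = clone("https://github.com/mlperf/training_policies.git", "training_policies")

    # Assumption: each benchmark implementation is a top-level directory.
    benchmarks = {
        p.name for p in training.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    }

    # Assumption: the rules are a single AsciiDoc file in the policies repo.
    rules_text = (policies / "training_rules.adoc").read_text()

    missing = [b for b in sorted(benchmarks) if b not in rules_text]
    if missing:
        print(f"Benchmarks not mentioned in the rules: {missing}")
        return 1
    print("Training code and rules look consistent.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

A script like this could run on a schedule in either repo, but because it cannot gate pull requests in the *other* repo, the two can still drift between runs, which is the enforcement gap described above.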

I think this is worth revisiting, but it would definitely impact all benchmarks and require a big refactoring. It could also significantly improve understandability.

I understand the appeal of having a single place to change things, but that conceptually favors writes (changing rules) over reads (understanding rules).

@petermattson
Contributor

petermattson commented Nov 28, 2022 via email

@peladodigital

In an effort to clean up the Git repo so we can maintain it better going forward, the MLPerf Training working group is closing out issues older than two years, since much has changed in the benchmark suite. If you think this issue is still relevant, please feel free to reopen it. Even better, please come to the working group meeting to discuss your issue.

@TheKanter
Contributor Author

This needs to be fixed. Please have @johntran-nv or @erichan1 put it on the agenda.

@hiwotadese
Contributor

Closing this because the README now lists the current benchmarks and datasets. If there is something specific that is not listed in the README, please create a new issue.
