-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrating mcore export #10238
base: main
Are you sure you want to change the base?
Integrating mcore export #10238
Conversation
Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
…10234) Signed-off-by: Hemil Desai <hemild@nvidia.com>
…nd offsets in manifest (#10198) * Add tests for LazyNeMoIterator and fix case with manifest_only=True and offsets in manifest Signed-off-by: Piotr Żelasko <petezor@gmail.com> * Address code review Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix tests Signed-off-by: Piotr Żelasko <petezor@gmail.com> * fix tests Signed-off-by: Piotr Żelasko <petezor@gmail.com> --------- Signed-off-by: Piotr Żelasko <petezor@gmail.com>
…ckpoints (#9939) * perfor serialization using relative paths to allow users to move checkpoints after they're saved Signed-off-by: ashors1 <ashors@nvidia.com> * Apply isort and black reformatting Signed-off-by: ashors1 <ashors1@users.noreply.github.com> * remove unused import Signed-off-by: ashors1 <ashors@nvidia.com> * fix artifact load Signed-off-by: ashors1 <ashors@nvidia.com> * fix path artifact Signed-off-by: ashors1 <ashors@nvidia.com> * remove unused import Signed-off-by: ashors1 <ashors@nvidia.com> --------- Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: ashors1 <ashors1@users.noreply.github.com> Co-authored-by: ashors1 <ashors1@users.noreply.github.com>
* Add MemoryProfileCallback Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com> * Apply isort and black reformatting Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com> * Remove reference cycles, save snapshot on specific ranks Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com> * Remove unnecessary imports Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com> * Apply isort and black reformatting Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com> * Update docstring Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com> --------- Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com> Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com> Signed-off-by: Shriya Rishab <69161273+ShriyaPalsamudram@users.noreply.github.com> Co-authored-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com> Co-authored-by: Dong Hyuk Chang <donghyukc@nvidia.com>
…rocessing (#10052) Flow matching generative model with SSL pretraining framework Signed-off-by: Pin-Jui Ku <pku@nvidia.com> Co-authored-by: Kuray107 <Kuray107@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Move nemotron transformers + tokenizer imports inline to reduce number of required deps Signed-off-by: Marc Romeyn <mromeijn@nvidia.com> * Apply isort and black reformatting Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com> --------- Signed-off-by: Marc Romeyn <mromeijn@nvidia.com> Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com> Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>
* Wrap CPU model init with megatron_lazy_init_context Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Cleanup checkpoint-dir if saving fails Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
* [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 124bcff ! Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix bert flags Signed-off-by: Oliver Koenig <okoenig@nvidia.com> --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Oliver Koenig <okoenig@nvidia.com> Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
Signed-off-by: slyne deng <slyned@nvidia.com> Co-authored-by: slyne deng <slyned@nvidia.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: oliver könig <okoenig@nvidia.com> Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
* Load model in the target export precision by default Signed-off-by: Jan Lasek <janek.lasek@gmail.com> * Enable megatron_amp_O2=true to actually use half-precision Signed-off-by: Jan Lasek <jlasek@nvidia.com> --------- Signed-off-by: Jan Lasek <janek.lasek@gmail.com> Signed-off-by: Jan Lasek <jlasek@nvidia.com>
…n.plugins (#10223) * Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> * Remove duplicate Signed-off-by: Hemil Desai <hemild@nvidia.com> * Add entity to wandb logger Signed-off-by: Hemil Desai <hemild@nvidia.com> * Add documentation Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> * Add warning Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> * PR feedback Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> * Add comments Signed-off-by: Hemil Desai <hemild@nvidia.com> * Apply isort and black reformatting Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> --------- Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: hemildesai <hemildesai@users.noreply.github.com> Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
* handle absolute and relative logger directories Signed-off-by: Anna Shors <ashors@nvidia.com> * merge lines Signed-off-by: ashors1 <ashors@nvidia.com> --------- Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: ashors1 <ashors@nvidia.com>
* Add sdxl notebook Signed-off-by: mingyuanm <mingyuanm@nvidia.com> * Rename Signed-off-by: mingyuanm <mingyuanm@nvidia.com> * final Update SDXL notebook Signed-off-by: mingyuanm <mingyuanm@nvidia.com> --------- Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
jenkins |
Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
jenkins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
jenkins |
We might also want to bump MCore commit here to 45bf4c1821c2a87ca02e8a0377d61097bac92d07 so that it's clear how to use newly integrated code |
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
[🤖]: Hi @shanmugamr1992 👋, I just wanted to let you know that, you know, a CICD pipeline for this PR just finished successfully ✨ So it might be time to merge this PR or like to get some approvals 🚀 But I'm just a 🤖 so I'll leave it you what to do next. Have a great day! //cc @ko3n1g |
[🤖]: Hi @shanmugamr1992 👋, I just wanted to let you know that, you know, a CICD pipeline for this PR just finished successfully ✨ So it might be time to merge this PR or like to get some approvals 🚀 But I'm just a 🤖 so I'll leave it you what to do next. Have a great day! //cc @ko3n1g |
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information