Integrating mcore export #10238

Merged on Oct 17, 2024 (47 commits)

Commits on Aug 23, 2024

  1. Integrating mcore export

    Shanmugam Ramasamy committed Aug 23, 2024
    7e7eb7f
  2. Integrating mcore export

    Shanmugam Ramasamy committed Aug 23, 2024
    d6351bb
  3. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Aug 23, 2024
    996ea05

Commits on Aug 27, 2024

  1. Move trt imports in nemo.collections.llm inside respective functions (#10234)
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    hemildesai authored and Shanmugam Ramasamy committed Aug 27, 2024
    7c0584a
  2. Add tests for LazyNeMoIterator and fix case with metadata_only=True and offsets in manifest (#10198)
    
    * Add tests for LazyNeMoIterator and fix case with manifest_only=True and offsets in manifest
    
    Signed-off-by: Piotr Żelasko <petezor@gmail.com>
    
    * Address code review
    
    Signed-off-by: Piotr Żelasko <petezor@gmail.com>
    
    * fix tests
    
    Signed-off-by: Piotr Żelasko <petezor@gmail.com>
    
    * fix tests
    
    Signed-off-by: Piotr Żelasko <petezor@gmail.com>
    
    ---------
    
    Signed-off-by: Piotr Żelasko <petezor@gmail.com>
    pzelasko authored and Shanmugam Ramasamy committed Aug 27, 2024
    c34d29a
  3. [NeMo-UX] Fix a serialization bug that prevents users from moving checkpoints (#9939)
    
    * perform serialization using relative paths to allow users to move checkpoints after they're saved
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ashors1 <ashors1@users.noreply.github.com>
    
    * remove unused import
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    * fix artifact load
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    * fix path artifact
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    * remove unused import
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    ---------
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: ashors1 <ashors1@users.noreply.github.com>
    Co-authored-by: ashors1 <ashors1@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    3aa1e5c
  4. Add MemoryProfileCallback (#10166)

    * Add MemoryProfileCallback
    
    Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
    
    * Remove reference cycles, save snapshot on specific ranks
    
    Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
    
    * Remove unnecessary imports
    
    Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
    
    * Update docstring
    
    Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
    
    ---------
    
    Signed-off-by: Shriya Palsamudram <spalsamudram@nvidia.com>
    Signed-off-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
    Signed-off-by: Shriya Rishab <69161273+ShriyaPalsamudram@users.noreply.github.com>
    Co-authored-by: ShriyaPalsamudram <ShriyaPalsamudram@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    57de288
  5. Lower bound transformers to support nemotron (#10240)

    Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
    Co-authored-by: Dong Hyuk Chang <donghyukc@nvidia.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    9214a4e
  6. [Audio] SSL Pretraining framework for flow-matching model for audio processing (#10052)
    
    Flow matching generative model with SSL pretraining framework
    
    Signed-off-by: Pin-Jui Ku <pku@nvidia.com>
    Co-authored-by: Kuray107 <Kuray107@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    c690c4f
  7. Revert torchrun fix for model import (#10251)

    Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
    akoumpa authored and Shanmugam Ramasamy committed Aug 27, 2024
    04ca831
  8. [NeMo-UX] Move nemotron imports inline (#10255)

    * Move nemotron transformers + tokenizer imports inline to reduce number of required deps
    
    Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
    
    ---------
    
    Signed-off-by: Marc Romeyn <mromeijn@nvidia.com>
    Signed-off-by: marcromeyn <marcromeyn@users.noreply.github.com>
    Co-authored-by: marcromeyn <marcromeyn@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    7a8c0e8
  9. Wrap CPU model init with megatron_lazy_init_context (#10219)

    * Wrap CPU model init with megatron_lazy_init_context
    
    Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
    
    * Cleanup checkpoint-dir if saving fails
    
    Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
    
    ---------
    
    Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
    Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
    Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    ac5cb06
  10. Bump Dockerfile.ci (2024-08-22) (#10227)

    * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 124bcff !
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    
    * fix bert flags
    
    Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
    
    ---------
    
    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
    Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    076f9ea
  11. salm export trtllm (#10245)

    Signed-off-by: slyne deng <slyned@nvidia.com>
    Co-authored-by: slyne deng <slyned@nvidia.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    5964387
  12. [🤠]: Howdy folks, let's bump Dockerfile.ci to ef85bc9 ! (#10250)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    8524596
  13. [🤠]: Howdy folks, let's bump Dockerfile.ci to 01ca03f ! (#10266)

    Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Signed-off-by: oliver könig <okoenig@nvidia.com>
    Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    Configuration menu
    Copy the full SHA
    f1f145a View commit details
    Browse the repository at this point in the history
  14. Load model in the target export precision by default in PTQ (#10267)

    * Load model in the target export precision by default
    
    Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
    
    * Enable megatron_amp_O2=true to actually use half-precision
    
    Signed-off-by: Jan Lasek <jlasek@nvidia.com>
    
    ---------
    
    Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
    Signed-off-by: Jan Lasek <jlasek@nvidia.com>
    janekl authored and Shanmugam Ramasamy committed Aug 27, 2024
    0d1e460
  15. Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins (#10223)
    
    * Add WandbPlugin, NsysPlugin and PreemptionPlugin to nemo.lightning.run.plugins
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    
    * Remove duplicate
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Add entity to wandb logger
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Add documentation
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    
    * Add warning
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    
    * PR feedback
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    
    * Add comments
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    
    * Apply isort and black reformatting
    
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    
    ---------
    
    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
    Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
    2 people authored and Shanmugam Ramasamy committed Aug 27, 2024
    f131db2
  16. [NeMo-UX] Handle absolute logger directories in nemo_logger (#10259)

    * handle absolute and relative logger directories
    
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    
    * merge lines
    
    Signed-off-by: ashors1 <ashors@nvidia.com>
    
    ---------
    
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    ashors1 authored and Shanmugam Ramasamy committed Aug 27, 2024
    86dcd99
  17. Add sdxl notebook (#10139)

    * Add sdxl notebook
    
    Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
    
    * Rename
    
    Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
    
    * final Update SDXL notebook
    
    Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
    
    ---------
    
    Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
    Victor49152 authored and Shanmugam Ramasamy committed Aug 27, 2024
    97ce34a
  18. Updating some comments

    Shanmugam Ramasamy committed Aug 27, 2024
    c52a0a4
  19. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Aug 27, 2024
    ed26d89
  20. 046a6ed
  21. Updating some comments

    Shanmugam Ramasamy committed Aug 27, 2024
    e3c5283
  22. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Aug 27, 2024
    1b07bd1
  23. Updating some comments

    Shanmugam Ramasamy committed Aug 27, 2024
    3c1e2c1

Commits on Aug 28, 2024

  1. 0c13c83

Commits on Sep 16, 2024

  1. Small change

    Shanmugam Ramasamy committed Sep 16, 2024
    25b0e95
  2. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Sep 16, 2024
    f70c1da

Commits on Sep 23, 2024

  1. Rebase and integrate latest mcore changes

    Shanmugam Ramasamy committed Sep 23, 2024
    57bb895
  2. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Sep 23, 2024
    822ec5b
  3. 8600d31

Commits on Sep 24, 2024

  1. a691e55

Commits on Sep 25, 2024

  1. Add support for layernorm1p

    Shanmugam Ramasamy committed Sep 25, 2024
    e05fe2c
  2. Apply isort and black reformatting

    Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 committed Sep 25, 2024
    28a0eb5

Commits on Sep 26, 2024

  1. 6d20aed
  2. aaa4a09

Commits on Sep 27, 2024

  1. 370945e
  2. Update Dockerfile.ci

    Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 authored Sep 27, 2024
    68c635e
  3. 15bc02f

Commits on Sep 30, 2024

  1. b7d40da

Commits on Oct 1, 2024

  1. Update Dockerfile.ci

    Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 authored Oct 1, 2024
    7ee86bf
  2. 56654b0

Commits on Oct 2, 2024

  1. f945350

Commits on Oct 3, 2024

  1. Merge branch 'main' into integrate_mcore_export

    Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 authored Oct 3, 2024
    e12bf8f
  2. Update Dockerfile.ci

    Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
    shanmugamr1992 authored Oct 3, 2024
    1d3a5ad

Commits on Oct 8, 2024

  1. Merge branch 'main' into integrate_mcore_export

    Shanmugam Ramasamy committed Oct 8, 2024
    cf7db10