ImportError: cannot import name 'MoESubmodules' from 'megatron.core.transformer.moe.moe_layer' #483

Open
Cppowboy opened this issue on Jan 16, 2025 · 0 comments
Labels: bug


I'm attempting to upgrade to the newest version of NeMo-Aligner, but I've run into the following error when restoring a model from a checkpoint.

Traceback (most recent call last):
  File "/user/panyinxu/workspace/NeMo-Aligner/examples/nlp/minicpm/train_minicpm3_4b_longrope_dpo.py", line 164, in <module>
    main()
  File "/user/panyinxu/workspace/NeMo/nemo/core/config/hydra_runner.py", line 129, in wrapper
    _run_hydra(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/user/panyinxu/workspace/NeMo-Aligner/examples/nlp/minicpm/train_minicpm3_4b_longrope_dpo.py", line 56, in main
    ptl_model = load_from_nemo(
  File "/user/panyinxu/workspace/NeMo-Aligner/nemo_aligner/utils/utils.py", line 118, in load_from_nemo
    model = cls.restore_from(
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/nlp_model.py", line 493, in restore_from
    return super().restore_from(
  File "/user/panyinxu/workspace/NeMo/nemo/core/classes/modelPT.py", line 474, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/user/panyinxu/workspace/NeMo-Aligner/nemo_aligner/utils/utils.py", line 56, in restore_from
    return super().restore_from(*args, replace_sharded_tensor_key=self.__replace_sharded_tensor_key, **kwargs)
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/parts/nlp_overrides.py", line 1304, in restore_from
    loaded_params = super().load_config_and_state_dict(
  File "/user/panyinxu/workspace/NeMo/nemo/core/connectors/save_restore_connector.py", line 182, in load_config_and_state_dict
    instance = calling_cls.from_config_dict(config=conf, trainer=trainer)
  File "/user/panyinxu/workspace/NeMo/nemo/core/classes/common.py", line 530, in from_config_dict
    raise e
  File "/user/panyinxu/workspace/NeMo/nemo/core/classes/common.py", line 522, in from_config_dict
    instance = cls(cfg=config, trainer=trainer)
  File "/user/panyinxu/workspace/NeMo-Aligner/nemo_aligner/models/nlp/gpt/megatron_gpt_dpo_model.py", line 54, in __init__
    super().__init__(cfg, trainer=trainer)
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/parts/mixins/nlp_adapter_mixins.py", line 88, in __init__
    super().__init__(*args, **kwargs)
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py", line 401, in __init__
    self.model = build_model(
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/modules/common/megatron/build_model.py", line 90, in build_model
    model = model_provider_func(
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py", line 483, in model_provider_func
    transformer_layer_spec=get_specs(
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py", line 172, in get_specs
    "modelopt": get_gpt_layer_modelopt_spec(num_experts),
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/language_modeling/megatron/gpt_layer_modelopt_spec.py", line 50, in get_gpt_layer_modelopt_spec
    raise IMPORT_ERROR
  File "/user/panyinxu/workspace/NeMo/nemo/collections/nlp/models/language_modeling/megatron/gpt_layer_modelopt_spec.py", line 23, in <module>
    from megatron.core.transformer.moe.moe_layer import MoELayer, MoESubmodules
ImportError: cannot import name 'MoESubmodules' from 'megatron.core.transformer.moe.moe_layer' (/user/panyinxu/workspace/Megatron-LM/megatron/core/transformer/moe/moe_layer.py)
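
For context on the traceback's shape: the failing NeMo module catches the ImportError at import time and re-raises it only when get_gpt_layer_modelopt_spec() is invoked, which is why the error surfaces deep inside model restoration. A minimal sketch of that deferred-import pattern, reconstructed from the traceback rather than copied from the NeMo source:

```python
# Deferred-import pattern inferred from the traceback (illustrative, not
# the verbatim NeMo source): the ImportError is captured when
# gpt_layer_modelopt_spec.py loads ("line 23, in <module>") and only
# re-raised when the spec function is actually called ("line 50").
try:
    from megatron.core.transformer.moe.moe_layer import MoELayer, MoESubmodules

    IMPORT_ERROR = None
except ImportError as e:
    IMPORT_ERROR = e


def get_gpt_layer_modelopt_spec(num_experts=None):
    if IMPORT_ERROR is not None:
        raise IMPORT_ERROR  # corresponds to "raise IMPORT_ERROR" at line 50 above
    ...  # build and return the layer spec using MoELayer / MoESubmodules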

Other information:

- NeMo-Aligner: 0.6.0
- NeMo: 2.1.0
- Megatron-LM: 0.9.0
- Docker: nemo 24.12
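
Since the Megatron-LM 0.9.0 code at the reported path evidently predates MoESubmodules (that is what the ImportError shows), a quick way to confirm which megatron-core the interpreter is actually picking up is a snippet like the one below. The error path points at a local checkout, /user/panyinxu/workspace/Megatron-LM, which presumably shadows whatever the nemo 24.12 container ships; that shadowing is an inference from the path, not something stated in the report, and the pip distribution name "megatron-core" is an assumption.

```python
# Diagnostic sketch: confirm which megatron-core the interpreter resolves
# and whether it provides MoESubmodules. The pip distribution name
# "megatron-core" is an assumption; a source checkout on PYTHONPATH
# (as the path in the error suggests) will not be pip-visible at all.
import importlib.metadata as md

import megatron.core.transformer.moe.moe_layer as moe_layer

print("moe_layer resolved from:", moe_layer.__file__)
print("has MoESubmodules:", hasattr(moe_layer, "MoESubmodules"))
try:
    print("pip-installed megatron-core:", md.version("megatron-core"))
except md.PackageNotFoundError:
    print("megatron-core is not pip-installed (likely a source checkout)")
```

If the local checkout is the copy being imported, updating it to a Megatron-LM release that actually defines MoESubmodules, or removing it from PYTHONPATH so the container's packaged copy is used, should avoid this error; which release first added the class is not confirmed here.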