[BUG] TypeError: 'NoneType' object is not a mapping when running dp --pt train in devel branch #3911

Closed
Chengqian-Zhang opened this issue Jun 26, 2024 · 0 comments · Fixed by #3912
@Chengqian-Zhang
Collaborator

Bug summary

When I run examples/water/dpa2 with dp --pt train input_torch.json, the following error occurs:
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-06-26 07:53:43,325] DEEPMD INFO DeepMD version: 2.2.0b1.dev892+g73dab63f.d20240612
[2024-06-26 07:53:43,325] DEEPMD INFO Configuration path: input_torch.json
Traceback (most recent call last):
  File "/home/data/zhangcq/conda_env/deepmd-pt-1026/bin/dp", line 8, in <module>
    sys.exit(main())
  File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/main.py", line 842, in main
    deepmd_main(args)
  File "/home/data/zhangcq/conda_env/deepmd-pt-1026/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py", line 384, in main
    train(FLAGS)
  File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py", line 223, in train
    SummaryPrinter()()
  File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/utils/summary.py", line 62, in __call__
    build_info.update(self.get_backend_info())
  File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py", line 213, in get_backend_info
    return {
TypeError: 'NoneType' object is not a mapping

I have found the cause: this bug was introduced by PR #3895.
[Screenshot: 20240626-160240]
When op_info is None, {**op_info} raises the TypeError above, because None cannot be unpacked as a mapping. I think changing op_info = None to op_info = {} will solve the issue. I will open a PR to fix it.
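
The failure and the proposed fix can be reproduced in isolation. The snippet below is only a minimal sketch, not the actual DeePMD-kit code: `backend_info_sketch`, `describe`, and the "Backend" key are hypothetical stand-ins for whatever get_backend_info really builds. It assumes only that the returned dict unpacks op_info with `**`.

```python
def backend_info_sketch(op_info):
    """Hypothetical stand-in for get_backend_info (not the real function).

    Builds a dict and unpacks op_info into it, which is the pattern
    the issue identifies as the source of the error.
    """
    return {
        "Backend": "PyTorch",  # hypothetical key, for illustration only
        **op_info,             # fails if op_info is None
    }


def describe(op_info):
    """Return the resulting dict, or the error message if unpacking fails."""
    try:
        return backend_info_sketch(op_info)
    except TypeError as exc:
        return f"TypeError: {exc}"


# op_info = None reproduces the reported failure.
print(describe(None))

# op_info = {} (the proposed fix) unpacks to nothing and succeeds.
print(describe({}))
```

Initializing op_info to an empty dict keeps the unpacking site unchanged. An alternative, also only a sketch, would be to write `**(op_info or {})` at the point of use.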

DeePMD-kit Version

newest devel

Backend and its version

pytorch

How did you download the software?

Offline packages

Input Files, Running Commands, Error Log, etc.

See above

Steps to Reproduce

See above

Further Information, Files, and Links

See above

github-merge-queue bot pushed a commit that referenced this issue Jun 26, 2024
…fo = None` to `op_info = {}` (#3912)

Solve issue #3911 

When I run `examples/water/dpa2` with `dp --pt train input_torch.json`,
the following error occurs:
To get the best performance, it is recommended to adjust the number of
threads by setting the environment variables OMP_NUM_THREADS,
DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS.
See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-06-26 07:53:43,325] DEEPMD INFO DeepMD version:
2.2.0b1.dev892+g73dab63f.d20240612
[2024-06-26 07:53:43,325] DEEPMD INFO Configuration path:
input_torch.json
Traceback (most recent call last):
File "/home/data/zhangcq/conda_env/deepmd-pt-1026/bin/dp", line 8, in
<module>
    sys.exit(main())
File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/main.py", line 842,
in main
    deepmd_main(args)
File
"/home/data/zhangcq/conda_env/deepmd-pt-1026/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py",
line 346, in wrapper
    return f(*args, **kwargs)
File
"/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py",
line 384, in main
    train(FLAGS)
File
"/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py",
line 223, in train
    SummaryPrinter()()
File "/home/data/zcq/deepmd-source/deepmd-kit/deepmd/utils/summary.py",
line 62, in __call__
    build_info.update(self.get_backend_info())
File
"/home/data/zcq/deepmd-source/deepmd-kit/deepmd/pt/entrypoints/main.py",
line 213, in get_backend_info
    return {
TypeError: 'NoneType' object is not a mapping

This bug was introduced by PR #3895.

![20240626-160240](https://github.com/deepmodeling/deepmd-kit/assets/100290172/92008b01-1e3d-437d-a09e-cc74b2da6412)
When `op_info` is `None`, `{**op_info}` raises a `TypeError`. Changing
`op_info = None` to `op_info = {}` solves the issue.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
  - Improved system stability by initializing `op_info` as an empty
    dictionary instead of `None`, preventing potential runtime errors.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@njzjz closed this as completed Jun 26, 2024
mtaillefumier pushed a commit to mtaillefumier/deepmd-kit that referenced this issue Sep 18, 2024
…fo = None` to `op_info = {}` (deepmodeling#3912)
