
Training ends shortly after entering play #5999

Closed
popcron opened this issue Oct 19, 2023 · 4 comments
Labels
bug Issue describes a potential bug in ml-agents.

Comments

@popcron

popcron commented Oct 19, 2023

Describe the bug
When running mlagents-learn (with and without --force) and entering play mode, training exits almost immediately and prints roughly 33 Debug.Log messages.

To Reproduce
Steps to reproduce the behavior:

  1. Start training with the command and press play
  2. Observe that it closes, with traceback errors in the console output

Console logs / stack traces

PS C:\repos\the big game\Saw and UFO> mlagents-learn --force
[W ..\torch\csrc\utils\tensor_numpy.cpp:77] Warning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xe (function operator ())

            ┐  ╖
        ╓╖╬│╡  ││╬╖╖
    ╓╖╬│││││┘  ╬│││││╬╖
 ╖╬│││││╬╜        ╙╬│││││╖╖                               ╗╗╗
 ╬╬╬╬╖││╦╖        ╖╬││╗╣╣╣╬      ╟╣╣╬    ╟╣╣╣             ╜╜╜  ╟╣╣
 ╬╬╬╬╬╬╬╬╖│╬╖╖╓╬╪│╓╣╣╣╣╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╒╣╣╖╗╣╣╣╗   ╣╣╣ ╣╣╣╣╣╣ ╟╣╣╖   ╣╣╣
 ╬╬╬╬┐  ╙╬╬╬╬│╓╣╣╣╝╜  ╫╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╟╣╣╣╙ ╙╣╣╣  ╣╣╣ ╙╟╣╣╜╙  ╫╣╣  ╟╣╣
 ╬╬╬╬┐     ╙╬╬╣╣      ╫╣╣╣╬      ╟╣╣╬    ╟╣╣╣ ╟╣╣╬   ╣╣╣  ╣╣╣  ╟╣╣     ╣╣╣┌╣╣╜
 ╬╬╬╜       ╬╬╣╣      ╙╝╣╣╬      ╙╣╣╣╗╖╓╗╣╣╣╜ ╟╣╣╬   ╣╣╣  ╣╣╣  ╟╣╣╦╓    ╣╣╣╣╣
 ╙   ╓╦╖    ╬╬╣╣   ╓╗╗╖            ╙╝╣╣╣╣╝╜   ╘╝╝╜   ╝╝╝  ╝╝╝   ╙╣╣╣    ╟╣╣╣
   ╩╬╬╬╬╬╬╦╦╬╬╣╣╗╣╣╣╣╣╣╣╝                                             ╫╣╣╣╣
      ╙╬╬╬╬╬╬╬╣╣╣╣╣╣╝╜
          ╙╬╬╬╣╣╣╜
             ╙

 Version information:
  ml-agents: 0.30.0,
  ml-agents-envs: 0.30.0,
  Communicator API: 1.5.0,
  PyTorch: 1.13.1+cpu
[W ..\torch\csrc\utils\tensor_numpy.cpp:77] Warning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xe (function operator ())
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
[INFO] Connected to Unity environment with package version 3.0.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: Cell?team=0
[WARNING] Behavior name Cell does not match any behaviors specified in the trainer configuration file. A default configuration will be used.
[WARNING] Deleting TensorBoard data events.out.tfevents.1697750225.pop.26916.0 that was left over from a previous run.
[INFO] Hyperparameters for behavior name Cell:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   1024
          buffer_size:  10240
          learning_rate:        0.0003
          beta: 0.005
          epsilon:      0.2
          lambd:        0.95
          num_epoch:    3
          shared_critic:        False
          learning_rate_schedule:       linear
          beta_schedule:        linear
          epsilon_schedule:     linear
        network_settings:
          normalize:    False
          hidden_units: 128
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
          goal_conditioning_type:       hyper
          deterministic:        False
        reward_signals:
          extrinsic:
            gamma:      0.99
            strength:   1.0
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
              deterministic:    False
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      500000
        time_horizon:   64
        summary_freq:   50000
        threaded:       False
        self_play:      None
        behavioral_cloning:     None
[INFO] Exported results\ppo\Cell\Cell-0.onnx
[INFO] Copied results\ppo\Cell\Cell-0.onnx to results\ppo\Cell.onnx.
Traceback (most recent call last):
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\learn.py", line 264, in main
    run_cli(parse_command_line())
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\learn.py", line 260, in run_cli
    run_training(run_seed, options, num_areas)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\learn.py", line 136, in run_training
    tc.start_learning(env_manager)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\trainer_controller.py", line 175, in start_learning
    n_steps = self.advance(env_manager)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\trainer_controller.py", line 233, in advance
    new_step_infos = env_manager.get_steps()
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\env_manager.py", line 124, in get_steps
    new_step_infos = self._step()
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 408, in _step
    self._queue_steps()
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 302, in _queue_steps
    env_action_info = self._take_step(env_worker.previous_step)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 543, in _take_step
    all_action_info[brain_name] = self.policies[brain_name].get_action(
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 130, in get_action
    run_out = self.evaluate(decision_requests, global_agent_ids)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 93, in evaluate
    masks = self._extract_masks(decision_requests)
  File "C:\Users\phill\AppData\Local\Programs\Python\Python310\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 77, in _extract_masks
    mask = torch.as_tensor(
RuntimeError: Could not infer dtype of numpy.int32

Environment (please complete the following information):

  • Unity 2023.3.0a10
  • Windows 11, Torch 1.13.1+cpu, Python 3.10.0, numpy 1.12.1
  • Package sources for mlagents and mlagents.extensions are from the develop branch (as UPM references)
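For context on what the final traceback line means: `_extract_masks` ends up handing torch a numpy `int32` array via `torch.as_tensor`, and a torch wheel built against a mismatched numpy C API cannot infer its dtype. The numpy-only sketch below (variable names are illustrative, not taken from ml-agents source) shows the kind of data involved and why converting to plain Python ints sidesteps dtype inference; the real fix discussed later in this thread is aligning the numpy/torch versions.

```python
import numpy as np

# Sketch only: ml-agents builds action masks as numpy integer arrays
# before calling torch.as_tensor(...). With a torch wheel compiled
# against a different numpy C API version, that call raises
# "RuntimeError: Could not infer dtype of numpy.int32".
action_mask = np.ones(6, dtype=np.int32)

# Plain Python ints avoid numpy dtype inference entirely, since
# torch.as_tensor can always handle a list of ints.
as_python_ints = action_mask.tolist()
print(as_python_ints)  # [1, 1, 1, 1, 1, 1]
```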
popcron added the bug label on Oct 19, 2023
@popcron
Author

popcron commented Oct 19, 2023

This is technically a separate issue, but I also had to pip install protobuf==3.20.1 to get past some of the errors mentioned here. I'm not sure whether that actually stopped training from running, but any mlagents-learn command would have thrown errors otherwise.

The packages I originally tested with weren't from the repo; installing the two packages from the repo in the correct order instead gives the same result.

@PatrickM92

I'm getting this same issue after following the install instructions here.

I just started a new venv and ran only these commands:

I ran the 3DBall test from the intro docs, and it errors out after I hit Play in Unity with "[W ..\torch\csrc\utils\tensor_numpy.cpp:77] Warning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xe (function operator ())"

torch: 1.13.1+cu117
numpy: 1.21.2
mlagents: 1.0.0 (pulled from release21)
mlagents-envs: 1.0.0 (pulled from release21)
Unity 2022.3.4f1

@xyz2022
Contributor

xyz2022 commented Oct 28, 2023

Solution here: #6002
TL;DR:
numpy 1.21.x and 1.22.x aren't compatible with the install guide you followed; numpy 1.23.1 works fine. You need to edit ./ml-agents-dev/setup.py and ./ml-agents/setup.py accordingly.

OP: I didn't test Python 3.10.0, but I anticipate it may be a problem. In addition to updating numpy, you may need to update your Python version; my test used Python 3.10.13 (from conda).
The final binary release of Python 3.10 is https://www.python.org/downloads/release/python-31011/,
but that might also be a problem (the guide says version 3.10.12+ is required).
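Before relaunching training, the pins mentioned in this thread can be sanity-checked from Python. A small sketch using the standard library's `importlib.metadata`; note the pin values below are just what commenters here reported working, not an official compatibility matrix:

```python
from importlib.metadata import PackageNotFoundError, version

# Pins reported in this thread; treat them as anecdotal, not canonical.
PINS = {"numpy": "1.23.1", "protobuf": "3.20.1"}

def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for pkg, wanted in pins.items():
        try:
            installed = version(pkg)
        except PackageNotFoundError:
            installed = None  # package not installed in this environment
        report[pkg] = (installed, installed == wanted)
    return report

if __name__ == "__main__":
    for pkg, (installed, ok) in check_pins(PINS).items():
        print(f"{pkg}: installed={installed} matches_pin={ok}")
```

Running this inside the venv used for mlagents-learn makes it easy to spot a stray numpy 1.21.x/1.22.x before the Unity side ever connects.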

@miguelalonsojr
Collaborator

Installing directly from the develop branch of the repo should resolve this issue; it was taken care of in this PR: #5997
