The "serial_pipeline" method has a small bug #592

Closed
Tracked by #548
youhu868 opened this issue Feb 24, 2023 · 2 comments
Assignees: PaParaZz1
Labels: bug (Something isn't working)

Comments

@youhu868

I ran the gym_hybrid_pdqn_config demo after adding multi_pass=True, action_mask=[[1, 0], [0, 1], [0, 0]] to the config for MPDQN, and it threw an AttributeError. The log is:

[02-24 16:42:20] INFO                                                                                                                  interaction_serial_evaluator.py:279
                        +-------+---------------+--------------------------+---------------+---------------+                                                            
                        | Name  | train_iter    | ckpt_name                | episode_count | envstep_count |                                                            
                        +-------+---------------+--------------------------+---------------+---------------+                                                            
                        | Value | 562000.000000 | iteration_562000.pth.tar | 5.000000      | 961.000000    |                                                            
                        +-------+---------------+--------------------------+---------------+---------------+                                                            
                        +-------+-------------------------+---------------+---------------------+----------------------+                                                
                        | Name  | avg_envstep_per_episode | evaluate_time | avg_envstep_per_sec | avg_time_per_episode |                                                
                        +-------+-------------------------+---------------+---------------------+----------------------+                                                
                        | Value | 192.200000              | 0.271575      | 3538.612792         | 18.411097            |                                                
                        +-------+-------------------------+---------------+---------------------+----------------------+                                                
                        +-------+-------------+------------+------------+------------+                                                                                  
                        | Name  | reward_mean | reward_std | reward_max | reward_min |                                                                                  
                        +-------+-------------+------------+------------+------------+                                                                                  
                        | Value | 1.808769    | 0.387593   | 2.428377   | 1.418769   |                                                                                  
                        +-------+-------------+------------+------------+------------+                                                                                  
                                                                                                                                                                        
                                                                                                                                                                        
[02-24 16:42:20] INFO     [RANK0]: learner save ckpt in ./gym_hybrid_pdqn_seed0_230224_150649/ckpt/ckpt_best.pth.tar                                   base_learner.py:338
[02-24 16:42:20] INFO     [DI-engine serial pipeline] Current episode_return: 1.8088 is greater than stop_value: 1.8, so your RL agent interaction_serial_evaluator.py:303
                        is converged, you can refer to 'log/evaluator/evaluator_logger.txt' for details.                                                                
[02-24 16:42:20] INFO     [RANK0]: learner save ckpt in ./gym_hybrid_pdqn_seed0_230224_150649/ckpt/iteration_562000.pth.tar                            base_learner.py:338
/home/xxx/anaconda3/envs/py310/lib/python3.10/site-packages/numpy/core/_methods.py:164: FutureWarning: The input object of type 'Tensor' is an array-like implementing one of the corresponding protocols (`__array__`, `__array_interface__` or `__array_struct__`); but not a sequence (or 0-D). In the future, this object will be coerced as if it was first converted using `np.array(obj)`. To retain the old behaviour, you have to either modify the type 'Tensor', or assign to an empty array created with `np.empty(correct_shape, dtype=object)`.
arr = asanyarray(a)
Traceback (most recent call last):
File "/home/xxx/PycharmProjects/DI-engine/dizoo/gym_hybrid/config/gym_hybrid_pdqn_config.py", line 84, in <module>
  serial_pipeline([main_config, create_config], seed=0, max_env_step=int(1e7))
File "/home/xxx/PycharmProjects/DI-engine/ding/entry/serial_entry.py", line 129, in serial_pipeline
  'eval_value': np.mean(eval_value_raw),
File "<__array_function__ internals>", line 180, in mean
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 3432, in mean
  return _methods._mean(a, axis=axis, dtype=dtype,
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/site-packages/numpy/core/_methods.py", line 190, in _mean
  ret = ret.dtype.type(ret / rcount)
AttributeError: 'torch.dtype' object has no attribute 'type'
Exception ignored in: <function SampleSerialCollector.__del__ at 0x7f3180e07d90>
Traceback (most recent call last):
File "/home/xxx/PycharmProjects/DI-engine/ding/worker/collector/sample_serial_collector.py", line 195, in __del__
  self.close()
File "/home/xxx/PycharmProjects/DI-engine/ding/worker/collector/sample_serial_collector.py", line 185, in close
  self._env.close()
File "/home/xxx/PycharmProjects/DI-engine/ding/envs/env_manager/subprocess_env_manager.py", line 634, in close
  p.send(['close', None, None])
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 211, in send
  self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 416, in _send_bytes
  self._send(header + buf)
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 373, in _send
  n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <function InteractionSerialEvaluator.__del__ at 0x7f3180c364d0>
Traceback (most recent call last):
File "/home/xxx/PycharmProjects/DI-engine/ding/worker/collector/interaction_serial_evaluator.py", line 158, in __del__
  self.close()
File "/home/xxx/PycharmProjects/DI-engine/ding/worker/collector/interaction_serial_evaluator.py", line 147, in close
  self._env.close()
File "/home/xxx/PycharmProjects/DI-engine/ding/envs/env_manager/subprocess_env_manager.py", line 634, in close
  p.send(['close', None, None])
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 211, in send
  self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 416, in _send_bytes
  self._send(header + buf)
File "/home/xxx/anaconda3/envs/py310/lib/python3.10/multiprocessing/connection.py", line 373, in _send
  n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

====================================================================================================
This bug occurs because the elements of the eval_value_raw list are torch.Tensor objects, while np.mean() here expects plain floats. So we can fix it by changing

eval_value_raw = [d['eval_episode_return'] for d in eval_info]

to

eval_value_raw = [d['eval_episode_return'].item() for d in eval_info]
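
For context, here is a minimal reproduction of the failure mode, independent of DI-engine. Whether np.mean() raises this exact AttributeError depends on the installed numpy/torch versions; on some combinations it only emits the FutureWarning shown in the log above, so the buggy call is commented out and the snippet always runs.

import numpy as np
import torch

# Episode returns as 0-d tensors, mimicking eval_value_raw above.
eval_value_raw = [torch.tensor(1.8), torch.tensor(2.4), torch.tensor(1.4)]

# Buggy path: numpy coerces the Tensor elements into an object array,
# and on some numpy/torch combinations _mean() then calls
# ret.dtype.type on a torch.dtype, which has no `type` attribute.
# np.mean(eval_value_raw)  # AttributeError: 'torch.dtype' object has no attribute 'type'

# Fixed path: convert each 0-d Tensor to a plain Python float first.
eval_value = np.mean([v.item() for v in eval_value_raw])
print(eval_value)  # ~1.8667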
@PaParaZz1 PaParaZz1 added the bug Something isn't working label Feb 24, 2023
@PaParaZz1 PaParaZz1 self-assigned this Feb 24, 2023
@PaParaZz1 (Member)

I think it would be better to modify the type of the return info in the evaluator; I have fixed this problem in the above-mentioned PR.
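
For reference, the evaluator-side fix could look roughly like the sketch below. This is only a sketch under assumptions: the helper names to_python_scalar and normalize_return_info and the list-of-dicts shape of eval_info are hypothetical; only the eval_episode_return key and the return_info name come from this thread, and the actual patch may differ.

import torch

def to_python_scalar(value):
    # Convert a 0-d torch.Tensor to a plain Python number;
    # pass other values through unchanged.
    if isinstance(value, torch.Tensor):
        return value.item()
    return value

# Hypothetical helper: normalize each episode's info dict before the
# evaluator returns it, so every downstream consumer (not just the
# np.mean() call in serial_pipeline) receives plain floats.
def normalize_return_info(eval_info):
    return [{k: to_python_scalar(v) for k, v in info.items()} for info in eval_info]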

@youhu868 (Author)

OK, that is better. At first, I wasn't sure whether the evaluator's return_info was used elsewhere.
