
Tensorboard files not saving when using SubprocVecEnv #1205

Closed
atapley opened this issue Dec 7, 2022 · 19 comments
Labels: custom gym env (Issue related to Custom Gym Env), question (Further information is requested)

Comments

@atapley commented Dec 7, 2022

🐛 Bug

When I train my model with a normal Monitor-wrapped env, I get the output TensorBoard files as expected, but when I use a SubprocVecEnv with multiple parallel environments, nothing seems to get logged to the TensorBoard file. Is this expected when using SubprocVecEnv, given the multiple environments?

Code example

My custom environment wraps an external modeler, so I can't provide a runnable code sample below. I make a few changes to the DQN algorithm (adding an intrinsic curiosity module), but the TensorBoard files save fine when running in a single environment, so I don't think that is the issue. I've added the reset and step methods below in case they help, but they likely won't make sense without the modeler.

import copy
from typing import List

import gym
import numpy as np

from stable_baselines3 import DQN
from stable_baselines3.common.env_checker import check_env

# Note: `Simulation` is provided by the external modeler and is not importable here.


class CustomEnv(gym.Env):

  def __init__(
        self,
        simulation: Simulation,
        movements: List[str],
        interactions: List[str],
        attributes: List[str],
        normalized_attributes: List[str],
        deterministic: bool = False
    ) -> None:
        self.simulation = simulation
        self.movements = copy.deepcopy(movements)
        self.interactions = copy.deepcopy(interactions)
        self.attributes = attributes
        self.normalized_attributes = normalized_attributes
        self.deterministic = deterministic
        

        # ------------------

        if not set(self.normalized_attributes).issubset(self.attributes):
            raise AssertionError(
                f"All normalized attributes ({str(self.normalized_attributes)}) must be "
                f"in attributes ({str(self.attributes)})!"
            )

        # ------------------
        self.sim_agent_id = len(movements) + len(interactions) + len(interactions) + 2
        sim_attributes = self.simulation.get_attribute_data()
        sim_actions = self.simulation.get_actions()

        if not set(self.interactions).issubset(list(sim_actions.keys())):
            raise AssertionError(
                f"All interactions ({str(self.interactions)}) must be "
                f"in the simulator's actions ({str(list(sim_actions.keys()))})!"
            )
            
        self.interactions.insert(0, "none")
        self.movements.insert(0, "none")

        self._separate_sim_nonsim(sim_attributes)
        self.harness_to_sim, self.sim_to_harness = self._sim_harness_conv(sim_actions)
        self.min_maxes = self._get_min_maxes()

        # ------------------

        channel_lows = np.array(
            [[[self.min_maxes[channel]["min"]]] for channel in self.attributes]
        )
        channel_highs = np.array(
            [[[self.min_maxes[channel]["max"]]] for channel in self.attributes]
        )

        self.low = np.repeat(
            np.repeat(channel_lows, self.simulation.config.area.screen_size, axis=1),
            self.simulation.config.area.screen_size,
            axis=2,
        )

        self.high = np.repeat(
            np.repeat(channel_highs, self.simulation.config.area.screen_size, axis=1),
            self.simulation.config.area.screen_size,
            axis=2,
        )

        self.observation_space = gym.spaces.Box(
            np.float32(self.low),
            np.float32(self.high),
            shape=(
                len(self.attributes),
                self.simulation.config.area.screen_size,
                self.simulation.config.area.screen_size,
            ),
            dtype=np.float64,
        )

        action_shape = len(self.movements) * len(self.interactions)
        self.action_space = gym.spaces.Discrete(action_shape)

  def reset(self):
        self.num_burned = 0
        if not self.deterministic:
            fire_init_seed = self.simulation.get_seeds()["fire_initial_position"]
            elevation_seed = self.simulation.get_seeds()["elevation"]
            seed_dict = {"fire_initial_position": fire_init_seed + 1,
                         "elevation": elevation_seed + 1}
            self.simulation.set_seeds(seed_dict)

        self.simulation.reset()
        sim_observations = self._select_from_dict(
            self.simulation.get_attribute_data(), self.sim_attributes
        )
        nonsim_observations = self._select_from_dict(
            self.get_nonsim_attribute_data(), self.nonsim_attributes
        )

        if len(nonsim_observations) != len(self.nonsim_attributes):
            raise AssertionError(
                f"Data for {len(nonsim_observations)} nonsim attributes were given but "
                f"there are {len(self.nonsim_attributes)} nonsim attributes."
            )

        observations = self._normalize_obs({**sim_observations, **nonsim_observations})

        obs = []
        for attribute in self.attributes:
            obs.append(observations[attribute])

        self.state = np.stack(obs, axis=0).astype(np.float64)

        output = self.state

        point = [self.agent_pos[1], self.agent_pos[0], 0]
        self.simulation.update_agent_positions([point])
        self.num_agent_steps = 0
        return output

  def step(self, action):
        movement = (action % len(self.movements))
        movement_str = self.movements[movement]
        
        interaction = int(action / len(self.movements))
        interaction_str = self.interactions[interaction]
        
        reward = 0.0

        pos_placeholder = self.agent_pos.copy()
        screen_size = self.simulation.config.area.screen_size

        if movement_str == "none":
            pass
        elif movement_str == "up" and not self.agent_pos[0] == 0:
            pos_placeholder[0] -= 1
        elif movement_str == "down" and not self.agent_pos[0] == screen_size - 1:
            pos_placeholder[0] += 1
        elif movement_str == "left" and not self.agent_pos[1] == 0:
            pos_placeholder[1] -= 1
        elif movement_str == "right" and not self.agent_pos[1] == screen_size - 1:
            pos_placeholder[1] += 1
        else:
            pass

        self.agent_pos = pos_placeholder

        fire_map_idx = self.attributes.index("fire_map")
        is_empty = self.state[fire_map_idx][self.agent_pos[0]][self.agent_pos[1]] == 0

        if is_empty and not interaction_str == 'none':
            sim_interaction = self.harness_to_sim[interaction]
            mitigation_update = (self.agent_pos[1], self.agent_pos[0], sim_interaction)
            self.simulation.update_mitigation([mitigation_update])

        point = [self.agent_pos[1], self.agent_pos[0], 0]
        self.simulation.update_agent_positions([point])

        if self.num_agent_steps % self.agent_speed == 0:
            sim_fire_map, sim_active = self.simulation.run(1)
            fire_map = np.copy(sim_fire_map)
            fire_map[self.agent_pos[0]][self.agent_pos[1]] = self.sim_agent_id
            reward += self._calculate_reward(fire_map)
        else:
            sim_active = True
            sim_fire_map = self.simulation.fire_map
            fire_map = np.copy(sim_fire_map)
            fire_map[self.agent_pos[0]][self.agent_pos[1]] = self.sim_agent_id

        self.state[fire_map_idx] = fire_map

        if not sim_active:
            reward += 10

        self.num_agent_steps += 1
        return self.state, reward, not sim_active, {}

env = CustomEnv(...)  # constructor arguments omitted; they come from the external modeler
check_env(env)

model = DQN("CnnPolicy", env, verbose=1).learn(1000)

Relevant log output / Error message

No response

System Info

OS: Linux-5.4.0-80-generic-x86_64-with-glibc2.27 #90~18.04.1-Ubuntu SMP Tue Jul 13 19:40:02 UTC 2021
Python: 3.9.15
Stable-Baselines3: 1.6.2
PyTorch: 1.12.1+cu102
GPU Enabled: False
Numpy: 1.22.4
Gym: 0.21.0


Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal working example to reproduce the bug
  • I have checked my env using the env checker
  • I've used the markdown code blocks for both code and stack traces.
@atapley added the custom gym env and question labels on Dec 7, 2022

@qgallouedec (Collaborator)

model = DQN("CnnPolicy", env, verbose=1).learn(1000)

Reading the code you gave, I don't see any instruction for TensorBoard, nor any use of SubprocVecEnv, so I have trouble understanding your problem. Can you rephrase it?

The following works well for me:

# test_1205.py
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv

if __name__ == "__main__":
    venv = make_vec_env("CartPole-v1", vec_env_cls=SubprocVecEnv, n_envs=2)
    model = DQN("MlpPolicy", venv, tensorboard_log="./tensorboard")
    model.learn(1000)
$ python test_1205.py
$ ls tensorboard
DQN_1

@atapley (Author) commented Dec 7, 2022

Ah, I forgot to change that part. The SubprocVecEnv is instantiated like this:

env = make_vec_env(<ENV>, env_kwargs=<ENV_KWARGS>, n_envs=2, seed=0, vec_env_cls=SubprocVecEnv)

There is nothing for TensorBoard because I don't directly do anything with it. I believe it is handled by the stable-baselines3 code within DQN_ICM's parent class's (OffPolicyAlgorithm) learn method. The TensorBoard logging is all internal to stable-baselines, which is why I'm confused about why it isn't working with the SubprocVecEnv.

If I use the env below, I do get the correct TensorBoard output.

env = Monitor(gym.make(<ENV>, <ENV_KWARGS>))

As in your case, the TensorBoard file itself does get saved. However, when I open TensorBoard, no metrics appear to have been written to the file.

[screenshot: TensorBoard run with no logged metrics]
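A quick way to check whether an event file actually contains scalar data (a minimal sketch; the ./tensorboard/DQN_1 run directory is an assumption based on the default run name, not taken from the issue) is to read it back with TensorBoard's EventAccumulator:

# Sketch: inspect a TensorBoard event file for logged scalars.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("./tensorboard/DQN_1")  # assumed run directory
acc.Reload()                                   # parse the event file(s) in the directory
print(acc.Tags()["scalars"])                   # an empty list means nothing was logged for this run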

@qgallouedec (Collaborator)

Another thing I can't figure out: does the problem occur only with your custom environment, or also with standard environments?

@atapley (Author) commented Dec 7, 2022

I just ran the sample code you posted and it works as expected: I can see the TensorBoard metrics in the file. So this appears to be an issue with just the CustomEnv, but I have not made any changes to the TensorBoard code, and the issue only appears when using SubprocVecEnv; I don't have the issue in the single-env case.

One thing that might be relevant: after training finishes, my code hangs and does not exit when using the SubprocVecEnv; it looks like a process or thread doesn't close properly. Surprisingly, when I run through the debugger I don't have that issue. Maybe because it hangs it doesn't save to TensorBoard properly? Although it creates the files without issue, so I don't know if that's it.
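If the hang really is caused by the SubprocVecEnv worker processes not shutting down, one thing worth trying (a minimal sketch, assuming the venv and model objects from the snippets above) is closing the vectorized env explicitly after training:

# Sketch: explicitly terminate the SubprocVecEnv workers once training is done.
model.learn(1000)
venv.close()  # sends a close command to each worker process and joins them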

@qgallouedec (Collaborator) commented Dec 7, 2022

So if I sum up simply, your problem is that this:

import gym

from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv

class CustomEnv(gym.Env): ...

if __name__ == "__main__":
    venv = make_vec_env(CustomEnv, vec_env_cls=SubprocVecEnv, n_envs=2)
    model = DQN("MlpPolicy", venv, tensorboard_log="./tensorboard")
    model.learn(1000)

doesn't output anything on your tensorboard, right?

@atapley (Author) commented Dec 7, 2022

Yes, more or less. The above does not output anything to tensorboard but

import gym

from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv

class CustomEnv(gym.Env): ...

if __name__ == "__main__":
    model = DQN("MlpPolicy", CustomEnv, tensorboard_log="./tensorboard")
    model.learn(1000)

does.

(Only difference is I use my modified DQN_ICM algorithm in both cases instead of base DQN)

@qgallouedec (Collaborator)

    model = DQN("MlpPolicy", CustomEnv, tensorboard_log="./tensorboard")

Ok, then the problem does not come from SubprocVecEnv since you do not use it. Correct?

@atapley (Author) commented Dec 7, 2022

Incorrect, I'm just saying that it does work when I do

model = DQN("MlpPolicy", CustomEnv, tensorboard_log="./tensorboard")

but does not when I do

venv = make_vec_env(CustomEnv, vec_env_cls=SubprocVecEnv, n_envs=2)
model = DQN("MlpPolicy", venv, tensorboard_log="./tensorboard")

And the SubprocVecEnv is what I am currently trying to work with

@qgallouedec (Collaborator)

Incorrect, I'm just saying that it does work when I do

Sorry I read the opposite.

(Only difference is I use my modified DQN_ICM algorithm in both cases instead of base DQN)

Does using DQN instead of DQN_ICM solve the problem?

@atapley (Author) commented Dec 7, 2022

I just gave it a try, and it looks like normal DQN does not work either.

@qgallouedec (Collaborator)

Looking quickly at your code, I don't see anything that could explain this. In order for us to help you, we need to be able to reproduce the "error", so you need to provide minimal code that does so.
From what we've just discussed, it should look just like the one in #1205 (comment)

@atapley (Author) commented Dec 7, 2022

Okay, let me try to minimize the current code into an MVP and link the open-source modeler used in the environment.

@qgallouedec (Collaborator)

link the open-source modeler used in the environment.

Make sure that the modeler is required for the bug. Most likely it is not.

@atapley (Author) commented Dec 7, 2022

I got an MVP working and, surprisingly, I am able to view the TensorBoard metrics in the MVP but not in the full codebase. I guess that means something in the full codebase is causing the issue; this helps me narrow it down at least! I'll keep adding until it breaks.

@qgallouedec (Collaborator)

I advise you to do the opposite: keep removing until the problem disappears. Otherwise you won't get a minimal example.

@atapley (Author) commented Dec 8, 2022

Was able to get it working! I switched up the logger, added a Monitor to the vec_env, and added some callbacks. Not sure which was the winner, but the files have logging data in them now. It seems like it wasn't an issue with the environment or stable-baselines, just that some things were missing. Thanks for helping out with this! Closing the issue.
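For anyone landing here later, here is a hedged sketch of the kind of setup described above; the exact combination that fixed it is not stated in the thread, and the logger reconfiguration and VecMonitor placement are assumptions:

# Sketch only: logger reconfiguration plus episode monitoring on the vectorized env,
# assuming the CustomEnv class from the issue body (constructor arguments omitted).
from stable_baselines3 import DQN
from stable_baselines3.common.logger import configure
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor

def make_env():
    return CustomEnv(...)  # arguments come from the external modeler, as in the issue body

if __name__ == "__main__":
    venv = VecMonitor(SubprocVecEnv([make_env for _ in range(2)]))
    model = DQN("MlpPolicy", venv, tensorboard_log="./tensorboard")
    model.set_logger(configure("./tensorboard", ["stdout", "tensorboard"]))
    model.learn(1000)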

@atapley closed this as completed on Dec 8, 2022
@qgallouedec (Collaborator) commented Dec 8, 2022

Was able to get it working!

That's good to hear.

Not sure which was the winner,

Unfortunately, no one else will benefit from it. If you ever figure out what was missing, please post it here.

@atapley (Author) commented Dec 8, 2022

It looks like it came down to the EvalCallback: removing the EvalCallback results in a TensorBoard file with no data in it. Nothing seems to get logged to the TensorBoard file unless the callback gets called.
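For completeness, a sketch of attaching an EvalCallback; the evaluation env construction, frequencies, and paths below are assumptions, not taken from the issue:

# Sketch: periodic evaluation via EvalCallback, which also writes eval/* metrics
# to the model's logger (and therefore to TensorBoard).
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.monitor import Monitor

eval_env = Monitor(CustomEnv(...))  # constructor arguments omitted, as above
eval_callback = EvalCallback(
    eval_env,
    eval_freq=500,                  # assumed value
    n_eval_episodes=5,              # assumed value
    log_path="./eval_logs",
    best_model_save_path="./best_model",
)
model.learn(1000, callback=eval_callback)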
