Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]? AEC sequence generates unexpected behavior #713

Closed
doesburg11 opened this issue May 31, 2022 · 7 comments
Closed

[Bug]? AEC sequence generates unexpected behavior #713

doesburg11 opened this issue May 31, 2022 · 7 comments

Comments

@doesburg11
Copy link

Description bug
What is going right:
In a PettingZoo so called Agent Environment Cycle, when an agents sets itself to "done" in "step()", this agent is rightly removed in the next iteration and the cycle resumes as expected to the next agent in "agents".

What is going wrong:
However the next agents after this next iteration gets unexpectedly skipped during this cycle.

Is there a way out to orderly resume the cycle? Is this a bug or is this a coding error of myself?

NB: this skipping only occurs in the same cycle after the removal of the done agent. In subsequent later cycles the sequences are ok.

Code example
https://github.com/doesburg11/pz-pred-prey/blob/master/issue/main.py

Output of example
iteration 0: agent: agent_0
iteration 1: agent: agent_1
iteration 2: agent: agent_2
iteration 3: agent: agent_3
iteration 4: agent: agent_4
iteration 5: agent: agent_5
iteration 6: agent: agent_0
iteration 7: agent: agent_1
iteration 8: agent: agent_2
agent_2 HAS NO ENERGY LEFT AND IS DONE
iteration 9: agent: agent_2 ### 2 gets removed in "step()"
iteration 10: agent: agent_3 ### resumption sequence ok
iteration 11: agent: agent_5 ### agent 4 gets unexpectedly skipped!
iteration 12: agent: agent_0
iteration 13: agent: agent_1
iteration 14: agent: agent_3
iteration 15: agent: agent_4 ### agent 4 is back in the cycle!
iteration 16: agent: agent_5
iteration 17: agent: agent_0
iteration 18: agent: agent_1
iteration 19: agent: agent_3
iteration 20: agent: agent_4
iteration 21: agent: agent_5

So in effect I am trying to fix iteration 11, where agent 4 would be "served" instead of being skipped.

System Info

  • Conda environment.
  • Linux Mint 20.3 Cinnamon
  • Python version 3.10
  • IDE: DataSpell 2022.1

Additional context
Implementing a multi-agent predator-prey PettingZoo AEC environment, with agents being "done" through either themselves or through other agents (the latter works fine). My guess it has something to do with the "_current_agent" counter (in the "agent_selector class" in ../pettingzoo/utils/agent_selector.py) in combination with the recommended first code in "step()", being:

        if self.dones[self.agent_selection]:
            self._was_done_step(action)
            return

But I have not been able to figure this out unfortunately. I have checked for similar issues in https://github.com/Farama-Foundation/PettingZoo/issues but to no avail.

@doesburg11 doesburg11 changed the title [Bug] AEC sequence generates unexpected behavior [Bug?] AEC sequence generates unexpected behavior May 31, 2022
@doesburg11 doesburg11 changed the title [Bug?] AEC sequence generates unexpected behavior [Bug]? AEC sequence generates unexpected behavior May 31, 2022
@jjshoots
Copy link
Member

jjshoots commented Jun 1, 2022

Hi @doesburg11, thanks for the issue report. Handling AEC cycles is actually a fairly tricky thing, and it's easy to miss the little details.

In your case, you may have missed a reinit function, some examples of which can be seen here and here. Without this line, some fairly undocumented behaviour can occur. I wouldn't blame you for this mistake though, many environments actually end when any agent is done so this detail is rarely discussed.

@doesburg11
Copy link
Author

Thank you @jjshoots for your suggestion. However implementing the reinit function did not solve my problem unfortunately. If I understand it right reinit restores the AEC sequence at the end of the cycle but not somewhere in the middle of the cycle.

As per the example I used, during the cycle the second next agent gets still wrongfully skipped during the current cycle using reinit and if_last(). Of course after this cycle everything works fine but that was already the case.

I've dug l little deeper in this issue and it seems to me that the agent_selector.next() does not resume the current cycle properly if the current agent sets another agent to done=True AND if that agent already has been been iterated over in the current cycle_. In that case the agent_selection is not calibrated properly anymore with` _current_selection, in my understanding

In other words: if the agents_selection has current_position N in the agent_order. The current cycle is not iterating properly over the rest of this cycle (it unwantedly skips the second next agent in the iteration) IF the agent being done, caused by agent N while this done agent has a position of 1,..., N+1 in agent_order. Al other positions (N+2...etc) are working fine by the way. Also new future cycles are working fine in any case. So maybe a small temporary anomaly in the full iteration, but it nevertheless can result in annoying unexpected results in my view.

My temporary workaround for this unwanted skips of agents in the AEC at the moment is to add the following at the start of step():

        if self.dones[self.agent_selection]:
            self._was_done_step(action)
            self._agent_selector._current_agent = self.agents.index(self._agent_selector.selected_agent) + 1
            return

The added line after self._was_done_step(action) re-calibrates the _current_agent properly to the new agent_order if an agent with position N causes "doing" agents in the range 1,...N+1. I admit this is a blunt solution, especially the index function is not very efficient, but it works for me for now and as far as I'm concerned. I guess my issue is done, but if you do not agree i'm very happy to discuss further Thank you again!

@jjshoots
Copy link
Member

jjshoots commented Jun 2, 2022

@benblack769 @RedTachyon Is this something that we should look into? Or is this intended behaviour. I suppose the intended documented use case is for dones to only be set at the end of the cycle, but not being able to set dones during a cycle and causing weird looping issues seems concerning to me.

@benblack769
Copy link
Contributor

AEC is tricky when removing agents. I don't think we've run across any cases yet like yours, where it is not cyclic, and you want agents to be removed.

Here is a slightly more efficient solution than yours:

        def step(self, action):
            if self.dones[self.agent_selection]:
                self._was_done_step(action)
                self.agent_selection = self._agent_selector.next()
                return
            agent_name = self.agent_selection
            agent_instance = self.instance(agent_name)
            agent_instance.energy_level -= 1
            if agent_instance.energy_level <= 0:
                self.dones[agent_name] = True
                print(agent_name + " HAS NO ENERGY LEFT AND IS DONE")
            else:
                self.agent_selection = self._agent_selector.next()
            self.agent_selection = self._dones_step_first()

And if you want something that the parallel env can understand, please take a look at this really confusing logic in KAZ https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/butterfly/knights_archers_zombies/knights_archers_zombies.py#L554

That is needed to put the terminal step of the agent in the correct place in the cycle for parallel evns.

@doesburg11
Copy link
Author

Thanks very much for your suggestions and willingness to think along with my apparently unusual case. Much appreciated. I do now realize the trickiness of removing agents in AEC.

As far as I have been able to find out, the sequence of living agents can give rise to unexpected behavior in the following cases of "done" and subsequent removal:

1) the selected agent causes another agent to "done" which has already been selected earlier in the agent_order of the same cycle,
2) the selected agent causes itself to "done" (see earlier example)
3) the selected agents causes its direct successor in the agent_order to "done".

I'm not sure if this covers all anomalies but I have a strong feeling it does, after a lot of experimentation.

FYI: I have came up with (in the words of the KAZ author) a ‘wacky’ fix which at least handles the three aforementioned anomalies. (I must admit that a little trial and error was involved, and the fix lacks any elegance...). But at least it works as far as I know.

    def step(self, action):
        if self.dones[self.agent_selection]:
            # to fix a runtime error when self._skip_agent_selection is already being "done"
            # and deleted by the selected agent: see 3)
            if self.agent_selection == self._skip_agent_selection:
                self._skip_agent_selection = self._agent_selector.next()
                return
            self._was_done_step(action)
            # to fix changing array indexes when the selected agent
            # deletes an earlier agent in the agent order (agents): see 1)
            # and to fix when the selected agent deltes itself: see 2)
            if self.agents[self._agent_selector._current_agent - 1] != self.agent_selection:
                self._agent_selector._current_agent -= 1
            return
        # main contents of step

NB: the suggested adaptation of @benblack769 did bring the missing agent_4 back in my example, but unfortunately skipped the next agent_3 in the cycle after agent_2 was being done and deleted:

....
iteration 7: agent: agent_1
iteration 8: agent: agent_2
agent_2 HAS NO ENERGY LEFT AND IS DONE
iteration 9: agent: agent_2
# unexpectedly skips agent 3 in the current cycle
iteration 10: agent: agent_4
....

@benblack769
Copy link
Contributor

Sorry that I missed that it skipped agent 3 in my solution.

As for your solution, I would watch out whenever modifying self._agent_selector._current_agent and I would make sure to wrap it with (self._agent_selector._current_agent-1)%len(self._agent_selector.agent_order)

Here is a fixed solution that does not seem to skip agent 3.

        def reset(self, seed=None):
            for _ in range(6):
                self.create_agent()
            self._agent_selector = agent_selector(list(self.agents))
            self.agent_selection = self._agent_selector.next()

        def step(self, action):
            if self.dones[self.agent_selection]:
                self._was_done_step(action)
                return
            agent_name = self.agent_selection
            agent_instance = self.instance(agent_name)
            agent_instance.energy_level -= 1
            if agent_instance.energy_level <= 0:
                self.dones[agent_name] = True
                print(agent_name + " HAS NO ENERGY LEFT AND IS DONE")
                self._agent_selector.agent_order.remove(agent_name)
                self._agent_selector._current_agent -= 1
                self._agent_selector._current_agent %= len(self._agent_selector.agent_order)

            self.agent_selection = self._agent_selector.next()
            self._dones_step_first()

Note that I think that it could get a lot simpler.

@doesburg11
Copy link
Author

Thanks again @benblack769. Your solution works and iterates the cycle in the right order! Thanks for your warning, it is most likely not wise to change self._agent_selector_current_agent outside of the agent_selector class as a protected member at all. But I will keep on searching for a more elegant solution.

@jjshoots jjshoots closed this as completed Jun 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants