[Bug]? AEC sequence generates unexpected behavior #713

doesburg11 · 2022-05-31T20:18:52Z

Description bug
What is going right:
In a PettingZoo so called Agent Environment Cycle, when an agents sets itself to "done" in "step()", this agent is rightly removed in the next iteration and the cycle resumes as expected to the next agent in "agents".

What is going wrong:
However the next agents after this next iteration gets unexpectedly skipped during this cycle.

Is there a way out to orderly resume the cycle? Is this a bug or is this a coding error of myself?

NB: this skipping only occurs in the same cycle after the removal of the done agent. In subsequent later cycles the sequences are ok.

Code example
https://github.com/doesburg11/pz-pred-prey/blob/master/issue/main.py

Output of example
iteration 0: agent: agent_0
iteration 1: agent: agent_1
iteration 2: agent: agent_2
iteration 3: agent: agent_3
iteration 4: agent: agent_4
iteration 5: agent: agent_5
iteration 6: agent: agent_0
iteration 7: agent: agent_1
iteration 8: agent: agent_2
agent_2 HAS NO ENERGY LEFT AND IS DONE
iteration 9: agent: agent_2 ### 2 gets removed in "step()"
iteration 10: agent: agent_3 ### resumption sequence ok
iteration 11: agent: agent_5 ### agent 4 gets unexpectedly skipped!
iteration 12: agent: agent_0
iteration 13: agent: agent_1
iteration 14: agent: agent_3
iteration 15: agent: agent_4 ### agent 4 is back in the cycle!
iteration 16: agent: agent_5
iteration 17: agent: agent_0
iteration 18: agent: agent_1
iteration 19: agent: agent_3
iteration 20: agent: agent_4
iteration 21: agent: agent_5

So in effect I am trying to fix iteration 11, where agent 4 would be "served" instead of being skipped.

System Info

Conda environment.
Linux Mint 20.3 Cinnamon
Python version 3.10
IDE: DataSpell 2022.1

Additional context
Implementing a multi-agent predator-prey PettingZoo AEC environment, with agents being "done" through either themselves or through other agents (the latter works fine). My guess it has something to do with the "_current_agent" counter (in the "agent_selector class" in ../pettingzoo/utils/agent_selector.py) in combination with the recommended first code in "step()", being:

        if self.dones[self.agent_selection]:
            self._was_done_step(action)
            return

But I have not been able to figure this out unfortunately. I have checked for similar issues in https://github.com/Farama-Foundation/PettingZoo/issues but to no avail.

The text was updated successfully, but these errors were encountered:

jjshoots · 2022-06-01T12:34:42Z

Hi @doesburg11, thanks for the issue report. Handling AEC cycles is actually a fairly tricky thing, and it's easy to miss the little details.

In your case, you may have missed a reinit function, some examples of which can be seen here and here. Without this line, some fairly undocumented behaviour can occur. I wouldn't blame you for this mistake though, many environments actually end when any agent is done so this detail is rarely discussed.

doesburg11 · 2022-06-02T12:24:49Z

Thank you @jjshoots for your suggestion. However implementing the reinit function did not solve my problem unfortunately. If I understand it right reinit restores the AEC sequence at the end of the cycle but not somewhere in the middle of the cycle.

As per the example I used, during the cycle the second next agent gets still wrongfully skipped during the current cycle using reinit and if_last(). Of course after this cycle everything works fine but that was already the case.

I've dug l little deeper in this issue and it seems to me that the agent_selector.next() does not resume the current cycle properly if the current agent sets another agent to done=True AND if that agent already has been been iterated over in the current cycle_. In that case the agent_selection is not calibrated properly anymore with` _current_selection, in my understanding

In other words: if the agents_selection has current_position N in the agent_order. The current cycle is not iterating properly over the rest of this cycle (it unwantedly skips the second next agent in the iteration) IF the agent being done, caused by agent N while this done agent has a position of 1,..., N+1 in agent_order. Al other positions (N+2...etc) are working fine by the way. Also new future cycles are working fine in any case. So maybe a small temporary anomaly in the full iteration, but it nevertheless can result in annoying unexpected results in my view.

My temporary workaround for this unwanted skips of agents in the AEC at the moment is to add the following at the start of step():

        if self.dones[self.agent_selection]:
            self._was_done_step(action)
            self._agent_selector._current_agent = self.agents.index(self._agent_selector.selected_agent) + 1
            return

The added line after self._was_done_step(action) re-calibrates the _current_agent properly to the new agent_order if an agent with position N causes "doing" agents in the range 1,...N+1. I admit this is a blunt solution, especially the index function is not very efficient, but it works for me for now and as far as I'm concerned. I guess my issue is done, but if you do not agree i'm very happy to discuss further Thank you again!

jjshoots · 2022-06-02T14:10:01Z

@benblack769 @RedTachyon Is this something that we should look into? Or is this intended behaviour. I suppose the intended documented use case is for dones to only be set at the end of the cycle, but not being able to set dones during a cycle and causing weird looping issues seems concerning to me.

benblack769 · 2022-06-04T13:25:36Z

AEC is tricky when removing agents. I don't think we've run across any cases yet like yours, where it is not cyclic, and you want agents to be removed.

Here is a slightly more efficient solution than yours:

        def step(self, action):
            if self.dones[self.agent_selection]:
                self._was_done_step(action)
                self.agent_selection = self._agent_selector.next()
                return
            agent_name = self.agent_selection
            agent_instance = self.instance(agent_name)
            agent_instance.energy_level -= 1
            if agent_instance.energy_level <= 0:
                self.dones[agent_name] = True
                print(agent_name + " HAS NO ENERGY LEFT AND IS DONE")
            else:
                self.agent_selection = self._agent_selector.next()
            self.agent_selection = self._dones_step_first()

And if you want something that the parallel env can understand, please take a look at this really confusing logic in KAZ https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/butterfly/knights_archers_zombies/knights_archers_zombies.py#L554

That is needed to put the terminal step of the agent in the correct place in the cycle for parallel evns.

doesburg11 · 2022-06-05T21:00:20Z

Thanks very much for your suggestions and willingness to think along with my apparently unusual case. Much appreciated. I do now realize the trickiness of removing agents in AEC.

As far as I have been able to find out, the sequence of living agents can give rise to unexpected behavior in the following cases of "done" and subsequent removal:

1) the selected agent causes another agent to "done" which has already been selected earlier in the agent_order of the same cycle,
2) the selected agent causes itself to "done" (see earlier example)
3) the selected agents causes its direct successor in the agent_order to "done".

I'm not sure if this covers all anomalies but I have a strong feeling it does, after a lot of experimentation.

FYI: I have came up with (in the words of the KAZ author) a ‘wacky’ fix which at least handles the three aforementioned anomalies. (I must admit that a little trial and error was involved, and the fix lacks any elegance...). But at least it works as far as I know.

    def step(self, action):
        if self.dones[self.agent_selection]:
            # to fix a runtime error when self._skip_agent_selection is already being "done"
            # and deleted by the selected agent: see 3)
            if self.agent_selection == self._skip_agent_selection:
                self._skip_agent_selection = self._agent_selector.next()
                return
            self._was_done_step(action)
            # to fix changing array indexes when the selected agent
            # deletes an earlier agent in the agent order (agents): see 1)
            # and to fix when the selected agent deltes itself: see 2)
            if self.agents[self._agent_selector._current_agent - 1] != self.agent_selection:
                self._agent_selector._current_agent -= 1
            return
        # main contents of step

NB: the suggested adaptation of @benblack769 did bring the missing agent_4 back in my example, but unfortunately skipped the next agent_3 in the cycle after agent_2 was being done and deleted:

....
iteration 7: agent: agent_1
iteration 8: agent: agent_2
agent_2 HAS NO ENERGY LEFT AND IS DONE
iteration 9: agent: agent_2
# unexpectedly skips agent 3 in the current cycle
iteration 10: agent: agent_4
....

benblack769 · 2022-06-05T22:17:09Z

Sorry that I missed that it skipped agent 3 in my solution.

As for your solution, I would watch out whenever modifying self._agent_selector._current_agent and I would make sure to wrap it with (self._agent_selector._current_agent-1)%len(self._agent_selector.agent_order)

Here is a fixed solution that does not seem to skip agent 3.

        def reset(self, seed=None):
            for _ in range(6):
                self.create_agent()
            self._agent_selector = agent_selector(list(self.agents))
            self.agent_selection = self._agent_selector.next()

        def step(self, action):
            if self.dones[self.agent_selection]:
                self._was_done_step(action)
                return
            agent_name = self.agent_selection
            agent_instance = self.instance(agent_name)
            agent_instance.energy_level -= 1
            if agent_instance.energy_level <= 0:
                self.dones[agent_name] = True
                print(agent_name + " HAS NO ENERGY LEFT AND IS DONE")
                self._agent_selector.agent_order.remove(agent_name)
                self._agent_selector._current_agent -= 1
                self._agent_selector._current_agent %= len(self._agent_selector.agent_order)

            self.agent_selection = self._agent_selector.next()
            self._dones_step_first()

Note that I think that it could get a lot simpler.

doesburg11 · 2022-06-06T21:05:20Z

Thanks again @benblack769. Your solution works and iterates the cycle in the right order! Thanks for your warning, it is most likely not wise to change self._agent_selector_current_agent outside of the agent_selector class as a protected member at all. But I will keep on searching for a more elegant solution.

doesburg11 changed the title ~~[Bug] AEC sequence generates unexpected behavior~~ [Bug?] AEC sequence generates unexpected behavior May 31, 2022

doesburg11 changed the title ~~[Bug?] AEC sequence generates unexpected behavior~~ [Bug]? AEC sequence generates unexpected behavior May 31, 2022

doesburg11 mentioned this issue May 31, 2022

[Bug]? AEC sequence generates unexpected behavior #713 Open doesburg11/pz-pred-prey#1

Open

jjshoots closed this as completed Jun 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]? AEC sequence generates unexpected behavior #713

[Bug]? AEC sequence generates unexpected behavior #713

doesburg11 commented May 31, 2022

jjshoots commented Jun 1, 2022

doesburg11 commented Jun 2, 2022

jjshoots commented Jun 2, 2022

benblack769 commented Jun 4, 2022

doesburg11 commented Jun 5, 2022

benblack769 commented Jun 5, 2022

doesburg11 commented Jun 6, 2022

[Bug]? AEC sequence generates unexpected behavior #713

[Bug]? AEC sequence generates unexpected behavior #713

Comments

doesburg11 commented May 31, 2022

jjshoots commented Jun 1, 2022

doesburg11 commented Jun 2, 2022

jjshoots commented Jun 2, 2022

benblack769 commented Jun 4, 2022

doesburg11 commented Jun 5, 2022

benblack769 commented Jun 5, 2022

doesburg11 commented Jun 6, 2022