AiGameTest > testAiGame sometimes hits "Not in a step, but trying to add event" #10673
Comments
Note: This doesn't actually fail the test. (Maybe it should, but we should probably fix the issue first, or else it will make it more flaky.) |
The "Filler event for child:" thing comes from this logic:
|
In that instance, the above log.info wrote:
|
So how can
|
Running the test 1000 times in a loop reproduces this pretty consistently. I added some extra tracking, and it seems like we're hitting case 1 for some reason. That is, there haven't been any calls to addToCurrent() before, so current still has its default value of null. This is surprising to me, since the events we get are ones from somewhere in the middle of the game. So did we get a bunch of other ones that got lost somehow? |
So it's not the main history writer where that happens, but one of the ones on the cloned game data used for concurrent battle calcs by AIs. That one is cloned without history, so it starts from a "blank" state. So we probably need to adjust how that works. |
Even with the above fix, we still see this sometimes (if we modify the test to run in a loop 100 times). Some stacks:
|
With a loop of 100 executions, I got 3 such failures, here's the third trace:
|
I just ran the test 300 times and didn't see any occurrences of the above. Could this be fixed now? |
I just tried again with |
Re-opening since #11393 changed the map for this test quite a bit. But probably the issue is still there if we restore the original map. |
OK, so what happens is that while we're doing ServerGame$1.stepChanged() which does a startNextStep() via the history writer, there's a battle calculator running on another thread which clones the game:
The cloning operation takes a write lock, clears the history, and then restores it. The problem is that stepChanged() doesn't take a lock, so this is a race condition. We can either add locking to the stepChanged() code, or simply hold a reference to the history object, since the saving operation actually creates a different history object on the game. |
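To illustrate the second option, here is a minimal sketch in Java. It is not TripleA's actual code: `History`, `GameData`, `racyStepChanged`, and `safeStepChanged` are hypothetical stand-ins, and the interleaving from the other thread is simulated deterministically with a `Runnable` rather than with real threads. It assumes, as described above, that the concurrent operation ends up putting a different history object on the game.

```java
import java.util.ArrayList;
import java.util.List;

public class HistoryRaceSketch {
  /** Hypothetical stand-in for the history; not TripleA's real class. */
  static class History {
    private String currentStep; // null until a step has been started
    private final List<String> events = new ArrayList<>();

    void startNextStep(String name) {
      currentStep = name;
    }

    void addEvent(String text) {
      if (currentStep == null) {
        throw new IllegalStateException("Not in a step, but trying to add event: " + text);
      }
      events.add(currentStep + ": " + text);
    }
  }

  /** Hypothetical stand-in for game data whose history object can be replaced. */
  static class GameData {
    private volatile History history = new History();

    History getHistory() {
      return history;
    }

    void setHistory(History newHistory) {
      history = newHistory;
    }
  }

  /** Racy pattern: looks up the history again after the other thread may have replaced it. */
  static boolean racyStepChanged(GameData data, Runnable interleaved) {
    data.getHistory().startNextStep("combat");
    interleaved.run(); // simulates the cloning thread running in between
    try {
      data.getHistory().addEvent("battle"); // may hit a fresh history with no step
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  /** Safer pattern: holds one reference, so both calls see the same history object. */
  static boolean safeStepChanged(GameData data, Runnable interleaved) {
    History history = data.getHistory();
    history.startNextStep("combat");
    interleaved.run();
    try {
      history.addEvent("battle");
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    GameData racy = new GameData();
    System.out.println(racyStepChanged(racy, () -> racy.setHistory(new History())));
    GameData safe = new GameData();
    System.out.println(safeStepChanged(safe, () -> safe.setHistory(new History())));
  }
}
```

In this sketch the racy variant fails whenever the history is swapped between the two calls, while the variant that holds the reference does not. Holding the reference avoids adding a lock to the step-change path, at the cost of writing to a history object that may no longer be the one attached to the game.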
When running on the bots, AiGameTest > testAiGame sometimes hits "Not in a step, but trying to add event".
Example:
https://github.com/triplea-game/triplea/runs/6910149568?check_suite_focus=true