Fleet status completely out-of-sync with GameServerSet status #570

Closed
jkowalski opened this issue Feb 7, 2019 · 9 comments
Labels: good first issue, help wanted, kind/bug

@jkowalski
Contributor

jkowalski commented Feb 7, 2019

This happens when running a test that scales fleets up and down. The numbers reported by each Fleet should be roughly in sync with those of the corresponding GameServerSet (GSS), but they are not:

$ kubectl get fleet,gss

NAME                                          SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
fleet.stable.agones.dev/scale-fleet-0-pw776   Packed       0         54        0           54      3m
fleet.stable.agones.dev/scale-fleet-1-dsdm9   Packed       0         200       0           200     3m
fleet.stable.agones.dev/scale-fleet-2-skhfw   Packed       0         0         0           0       3m
fleet.stable.agones.dev/scale-fleet-3-ggb2w   Packed       0         200       0           200     3m
fleet.stable.agones.dev/scale-fleet-4-v2v86   Packed       0         47        0           47      3m
fleet.stable.agones.dev/scale-fleet-5-hfshp   Packed       0         200       0           200     3m
fleet.stable.agones.dev/scale-fleet-6-vwqzp   Packed       200       0         0           0       3m
fleet.stable.agones.dev/scale-fleet-7-2vtzh   Packed       0         200       0           200     3m
fleet.stable.agones.dev/scale-fleet-8-b8kgl   Packed       0         99        0           99      2m
fleet.stable.agones.dev/scale-fleet-9-tlnss   Packed       0         200       0           200     2m

NAME                                                        SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
gameserverset.stable.agones.dev/scale-fleet-0-pw776-v5h2z   Packed       0         0         0           0       3m
gameserverset.stable.agones.dev/scale-fleet-1-dsdm9-d7vdn   Packed       200       200       0           200     3m
gameserverset.stable.agones.dev/scale-fleet-2-skhfw-krq98   Packed       0         0         0           0       3m
gameserverset.stable.agones.dev/scale-fleet-3-ggb2w-m8s8w   Packed       0         189       0           189     3m
gameserverset.stable.agones.dev/scale-fleet-4-v2v86-2c84s   Packed       0         0         0           0       3m
gameserverset.stable.agones.dev/scale-fleet-5-hfshp-4ztcw   Packed       200       200       0           200     3m
gameserverset.stable.agones.dev/scale-fleet-6-vwqzp-8hb6s   Packed       0         0         0           0       3m
gameserverset.stable.agones.dev/scale-fleet-7-2vtzh-j9tl5   Packed       200       200       0           200     2m
gameserverset.stable.agones.dev/scale-fleet-8-b8kgl-9vcfz   Packed       0         0         0           0       2m
gameserverset.stable.agones.dev/scale-fleet-9-tlnss-fkrqm   Packed       0         35        0           35      2m

Simple repro is in PR #571
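
To make the mismatch above easier to spot, here is a minimal sketch of a drift check in Go. It is not Agones code: the `status` type and the `outOfSync` helper are hypothetical, and it assumes that a GameServerSet belongs to a Fleet when its name starts with the Fleet name plus a dash, and that a Fleet's counters should equal the sum of its sets' counters. The sample values are three rows copied from the output above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// status holds the four counters shown by `kubectl get fleet,gss`.
// Hypothetical type for illustration only.
type status struct {
	Desired, Current, Allocated, Ready int
}

// outOfSync returns the sorted names of fleets whose counters differ from the
// sum of the counters of every set whose name starts with "<fleet>-".
func outOfSync(fleets, sets map[string]status) []string {
	var drifted []string
	for name, fleetStatus := range fleets {
		var sum status
		for setName, s := range sets {
			if strings.HasPrefix(setName, name+"-") {
				sum.Desired += s.Desired
				sum.Current += s.Current
				sum.Allocated += s.Allocated
				sum.Ready += s.Ready
			}
		}
		if sum != fleetStatus {
			drifted = append(drifted, name)
		}
	}
	sort.Strings(drifted)
	return drifted
}

func main() {
	// Three rows copied from the kubectl output in this issue.
	fleets := map[string]status{
		"scale-fleet-0-pw776": {0, 54, 0, 54},
		"scale-fleet-2-skhfw": {0, 0, 0, 0},
		"scale-fleet-3-ggb2w": {0, 200, 0, 200},
	}
	sets := map[string]status{
		"scale-fleet-0-pw776-v5h2z": {0, 0, 0, 0},
		"scale-fleet-2-skhfw-krq98": {0, 0, 0, 0},
		"scale-fleet-3-ggb2w-m8s8w": {0, 189, 0, 189},
	}
	// Reports fleets 0 and 3; fleet 2 matches its set.
	fmt.Println(outOfSync(fleets, sets))
}
```

A check like this only flags a snapshot; brief differences are expected while the controllers catch up, so it is most useful when repeated over time.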

@jkowalski jkowalski added kind/bug These are bugs. help wanted We would love help on these issues. Please come help us! good first issue These are great first issues. If you are looking for a place to start, start here! labels Feb 7, 2019
@aLekSer
Collaborator

aLekSer commented Feb 18, 2019

Here is what I noticed after running stress-test-e2e with a stress factor of 2:

fleet.stable.agones.dev/scale-fleet-1-mfd8s   Packed       20        18        0           0         3m   
gameserverset.stable.agones.dev/scale-fleet-1-mfd8s-9gl9r   Packed       0         16        0           0         3m  

After several seconds:

fleet.stable.agones.dev/scale-fleet-1-mfd8s   Packed       20        16        0           0         3m
gameserverset.stable.agones.dev/scale-fleet-1-mfd8s-9gl9r   Packed       20        36        0           0         3m 

It seems that when the Desired count is updated while the number of GameServers exceeds Desired, some servers end up in the Shutdown state while others are in the Creating or Scheduled state:

"scale-fleet-8-bx7n4-ntxbw-x2k2t in Creating" 
"scale-fleet-8-bx7n4-ntxbw-chccr in Creating" 
"scale-fleet-8-bx7n4-ntxbw-z44zm in Shutdown" 
"scale-fleet-8-bx7n4-ntxbw-tqh4b in Scheduled" 
"scale-fleet-8-bx7n4-ntxbw-qflnc in Creating" 

It would be great if we could add a test for that particular case.
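
As a starting point for such a test, a small Go sketch that tallies game servers by state could look like the following. The `gameServer` type, `countByState`, and `liveCount` are hypothetical helpers, not part of Agones, and treating every state except Shutdown as "live" is an assumption made for illustration. The sample data is the five servers listed above.

```go
package main

import "fmt"

// gameServer is a hypothetical, pared-down stand-in for a GameServer object.
type gameServer struct {
	Name  string
	State string
}

// countByState tallies game servers per state name.
func countByState(servers []gameServer) map[string]int {
	counts := make(map[string]int)
	for _, gs := range servers {
		counts[gs.State]++
	}
	return counts
}

// liveCount returns how many servers are not in the Shutdown state.
// Assumption: servers being shut down should not count toward Current.
func liveCount(servers []gameServer) int {
	live := 0
	for _, gs := range servers {
		if gs.State != "Shutdown" {
			live++
		}
	}
	return live
}

func main() {
	// The five servers listed in the comment above.
	servers := []gameServer{
		{"scale-fleet-8-bx7n4-ntxbw-x2k2t", "Creating"},
		{"scale-fleet-8-bx7n4-ntxbw-chccr", "Creating"},
		{"scale-fleet-8-bx7n4-ntxbw-z44zm", "Shutdown"},
		{"scale-fleet-8-bx7n4-ntxbw-tqh4b", "Scheduled"},
		{"scale-fleet-8-bx7n4-ntxbw-qflnc", "Creating"},
	}
	counts := countByState(servers)
	fmt.Println("Creating:", counts["Creating"])
	fmt.Println("Scheduled:", counts["Scheduled"])
	fmt.Println("Shutdown:", counts["Shutdown"])
	fmt.Println("live:", liveCount(servers))
}
```

A test could then assert that the live count stays at or below the Desired value while the fleet is scaling.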

@Yingxin-Jiang
Contributor

I will work on this issue.

@aLekSer
Collaborator

aLekSer commented Feb 22, 2019

Hello @Yingxin-Jiang,
I was thinking of making Fleet status updates a bit faster by computing them from GameServer status rather than GameServerSet status: in updateFleetStatus(), change ListGameServerSetsByFleetOwner() to ListGameServersByFleetOwner().
Currently status propagates GS -> GSS -> Fleet.

aLekSer@1c3f992
But they are not 100% in sync either. I think the cause is the worker queues, which start at different points in time.
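
The difference between the two aggregation paths described in this comment can be sketched in Go as follows. This is an illustration under stated assumptions, not the Agones implementation: the `counts`, `gameServer`, and `gameServerSet` types and both functions are hypothetical, and counting every listed game server as a replica is an assumption.

```go
package main

import "fmt"

// counts is a hypothetical status summary.
type counts struct {
	Replicas, Ready, Allocated int
}

// gameServer is a hypothetical, pared-down GameServer.
type gameServer struct {
	State string
}

// gameServerSet is a hypothetical, pared-down GameServerSet.
type gameServerSet struct {
	Status  counts       // last status written for the set; may lag behind
	Servers []gameServer // game servers currently owned by the set
}

// fromSetStatus sums the recorded set statuses (the GS -> GSS -> Fleet path).
func fromSetStatus(sets []gameServerSet) counts {
	var total counts
	for _, s := range sets {
		total.Replicas += s.Status.Replicas
		total.Ready += s.Status.Ready
		total.Allocated += s.Status.Allocated
	}
	return total
}

// fromGameServers counts game servers directly (the GS -> Fleet path).
func fromGameServers(sets []gameServerSet) counts {
	var total counts
	for _, s := range sets {
		for _, gs := range s.Servers {
			total.Replicas++
			switch gs.State {
			case "Ready":
				total.Ready++
			case "Allocated":
				total.Allocated++
			}
		}
	}
	return total
}

func main() {
	// A set whose recorded status has not caught up with its game servers.
	sets := []gameServerSet{{
		Status:  counts{Replicas: 1, Ready: 1},
		Servers: []gameServer{{"Ready"}, {"Ready"}, {"Creating"}},
	}}
	fmt.Printf("via set status:   %+v\n", fromSetStatus(sets))
	fmt.Printf("via game servers: %+v\n", fromGameServers(sets))
}
```

Counting game servers directly removes one hop of staleness, but as noted above, both views can still lag if the underlying lists are refreshed at different times.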

@Yingxin-Jiang
Contributor

@aLekSer Thanks for the info. Looks like you are working on the issue, so I'd better leave it to you to avoid duplicate work :)

@aLekSer
Collaborator

aLekSer commented Feb 26, 2019

@Yingxin-Jiang, I stopped working on it for a bit because I was not able to find the root cause or a way to keep them in sync. So please, if you have any ideas, submit your own fix for this ticket.

@Yingxin-Jiang
Contributor

@aLekSer Thanks again for the info. Then I will give it a try.

@Yingxin-Jiang
Contributor

I can't reproduce the issue. Probably it's fixed by commit f6daaf1

@aLekSer
Collaborator

aLekSer commented Mar 1, 2019

Running stress-test-e2e with STRESS_TEST_LEVEL ?= 2, the GSS and Fleet statuses differ only for a short time and then sync within seconds. The number of Current GameServers never rises above the desired 20 GameServers.
[Screenshot: get fleets gss]
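
Since the statuses are expected to converge within seconds rather than match at every instant, a test for this would probably poll instead of comparing a single snapshot. Below is a minimal Go sketch of that idea. The `counts` type, the `waitForSync` helper, and the simulated getter are hypothetical; a real test would read the Fleet and GameServerSet from the cluster instead.

```go
package main

import (
	"fmt"
	"time"
)

// counts is a hypothetical status summary.
type counts struct {
	Desired, Current, Ready int
}

// waitForSync polls get until the fleet and set counters are equal, or
// returns an error once the timeout has elapsed.
func waitForSync(get func() (fleet, set counts), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		fleet, set := get()
		if fleet == set {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("fleet %+v and set %+v still differ after %s", fleet, set, timeout)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated getter: the two views disagree for the first two polls,
	// then agree. The numbers are illustrative only.
	polls := 0
	get := func() (counts, counts) {
		polls++
		if polls < 3 {
			return counts{20, 18, 0}, counts{20, 16, 0}
		}
		return counts{20, 20, 20}, counts{20, 20, 20}
	}
	err := waitForSync(get, 10*time.Millisecond, time.Second)
	fmt.Println("error:", err, "polls:", polls)
}
```

The timeout bounds how long a temporary difference is tolerated, so a genuinely stuck status still fails the test.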

@markmandel
Collaborator

Sounds like we can close this issue - and if it pops up again, reopen it?

@markmandel markmandel added this to the 0.9.0 milestone Mar 1, 2019