Health check error message #2649

Merged
merged 11 commits into from
Jul 6, 2022

Contributor

@XiaNi XiaNi commented Jun 28, 2022

What type of PR is this?
/kind feature

What this PR does / Why we need it:
It turns a faceless error like:

Error: write after end
    at new NodeError (node:internal/errors:371:5)
    at _write (node:internal/streams/writable:319:11)
    at ClientHttp2Stream.Writable.write (node:internal/streams/writable:334:10)
    at Http2CallStream.writeMessageToStream (/home/xiani/work/msa-xrengine/node_modules/@grpc/grpc-js/src/call-stream.ts:499:23)
    at /home/xiani/work/msa-xrengine/node_modules/@grpc/grpc-js/src/call-stream.ts:851:14

to

Error: health ping connection failure: write after end
    at /home/xiani/work/msa-xrengine/packages/instanceserver/src/start.ts:85:15
    at onwriteError (node:internal/streams/writable:415:3)
    at onwrite (node:internal/streams/writable:457:7)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)

where the developer can see who was trying to write and why.
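
As a rough illustration of the idea (not the code from this PR), a caller could prefix the low-level stream error with a short description of the operation that failed. `withContext` is a hypothetical helper introduced only for this sketch; the prefix text mirrors the message shown above.

```javascript
// Hypothetical helper: prefix a low-level error with what the caller was doing.
function withContext(context, err) {
  const wrapped = new Error(`${context}: ${err.message}`);
  // Keep the original error reachable for debugging.
  wrapped.cause = err;
  return wrapped;
}

const original = new Error('write after end');
const wrapped = withContext('health ping connection failure', original);
console.log(wrapped.message); // health ping connection failure: write after end
```

Because the wrapper is created in the caller's code, its stack trace points at the caller rather than at stream internals.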

Which issue(s) this PR fixes:

Special notes for your reviewer:

If for some reason the Agones server connection fails, `healthStream.write` throws an error that I have not been able to catch, but at least we can show the source of the error.
@google-cla

google-cla bot commented Jun 28, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 61df732c-6124-4d68-bf6b-1881dce156f3

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-4030481-amd64

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 1cd024d1-4c22-4dce-8f4f-6ec5010cd787

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-1e701f9-amd64

@steven-supersolid
Collaborator

Nice addition! Could you check coverage with `npm run cover` and see whether it has dropped below 100%? If so, please add a test. I think we have some existing tests for errors that can be used as a reference.

@roberthbailey roberthbailey added the kind/cleanup Refactoring code, fixing up documentation, etc label Jun 28, 2022
@google-oss-prow google-oss-prow bot added size/M and removed size/XS labels Jun 29, 2022
@XiaNi
Contributor Author

XiaNi commented Jun 29, 2022

@steven-supersolid please recheck. Is it OK for the health() function to take an optional errorCallback?
I did this so that users can catch the stream write error when they need to: if errorCallback is provided, it is called on error; otherwise the error is thrown as usual.
Three test cases were added to keep coverage at 100%.
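
The behaviour described here can be sketched roughly as follows. This is a minimal, self-contained approximation and not the SDK's actual source: `FakeHealthStream`, `HealthPinger`, and `openStream` are hypothetical names, and an `EventEmitter` stands in for the gRPC health stream.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the gRPC health stream: records writes, can emit 'error'.
class FakeHealthStream extends EventEmitter {
  constructor() {
    super();
    this.written = [];
  }

  write(message) {
    this.written.push(message);
    return true;
  }
}

// Hypothetical SDK wrapper; `openStream` stands in for the gRPC client's health() call.
class HealthPinger {
  constructor(openStream) {
    this.openStream = openStream;
    this.healthStream = undefined;
    this.errorCallback = undefined;
  }

  health(errorCallback) {
    // Remember the most recent callback (a choice made for this sketch).
    this.errorCallback = errorCallback;
    if (this.healthStream === undefined) {
      this.healthStream = this.openStream();
      this.healthStream.on('error', (err) => {
        if (typeof this.errorCallback === 'function') {
          this.errorCallback(err);
        } else {
          throw err;
        }
      });
    }
    this.healthStream.write({});
  }
}

// Usage: with a callback, the stream error is delivered to it instead of being thrown.
const stream = new FakeHealthStream();
const pinger = new HealthPinger(() => stream);
pinger.health((err) => console.log(`health ping failed: ${err.message}`));
stream.emit('error', new Error('write after end'));
```

Calling `health()` without a callback in this sketch re-throws the error from inside the listener, which is the "thrown as usual" path.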

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 79ea6279-9377-458f-995f-74c150a9908c

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-1e9bb92-amd64

@agones-bot
Collaborator

Build Failed 😱

Build Id: 4dc8857c-51f1-48b6-815d-0314335b66ec

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Member

--- FAIL: TestGameServerAllocationDuringGameServerDeletion (0.00s)
    --- FAIL: TestGameServerAllocationDuringGameServerDeletion/scale_down (16.42s)
        fleet_test.go:947: 
            	Error Trace:	fleet_test.go:947
            	            				fleet_test.go:959
            	Error:      	Should NOT be empty, but was []
            	Test:       	TestGameServerAllocationDuringGameServerDeletion/scale_down

e2e feature gates - that's a new one, but looks unrelated to the change at hand.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 53b02beb-0ffb-4bfc-9a89-2859f87705b2

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-c442152-amd64

return stream;
});
try {
agonesSDK.health(() => {});
Collaborator

It may be worth adding an expect in here to test/demonstrate that the error is passed to the callback intact.
We could spy on the callback, I think, and check what it was called with.

Contributor Author

done

fail();
}
});
it('calls the server and re throws stream write error if no callback', async () => {
Collaborator

nit: For consistency with other tests, perhaps we should add a newline here to separate tests

Contributor Author

done

expect(error).toEqual('error');
}
});
it('do not call error callback if there was no stream error', async () => {
Collaborator

nit

Suggested change
it('do not call error callback if there was no stream error', async () => {
it('does not call error callback if there was no stream error', async () => {

Contributor Author

done

if (this.healthStream === undefined) {
this.healthStream = this.client.health(() => {
// Ignore error as this can't be caught
});
if (typeof this.healthStream.on === 'function') {
Collaborator

@XiaNi I'm just wondering a couple of things about this change:

  1. Do we need to listen to the event if we are already passing a callback above? Or is this just to make testing easier?
  2. If so, do we need the typeof check? Generally I would trust gRPC client types not to change unless we intentionally update the client and then would change this code too

Contributor Author

  1. If we don't listen to the event, an uncatchable error is thrown even when a callback is passed. By listening, we can "catch" the error and either pass it to the callback or re-throw it if there is no callback.
    In a real-life scenario, 99% of the time health() will be called either always with a callback or always without one, so I could add the event handler only if a callback is present on the first call. But I think that would add inconsistency, and the docs would then have to mention that the first health() call determines the behaviour of subsequent calls.

  2. I added this check so as not to break tests that implement only the "write" method. In tests ".on" is usually undefined, while in a real environment it is always a function. I agree that this type check looks strange. Should I add "on" to the stream spy object in all tests that call health()? (I like this option.)
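
The first point rests on how Node.js event emitters treat the 'error' event, which can be demonstrated with a plain `EventEmitter` standing in for the gRPC stream (an assumption made to keep the sketch self-contained). In a real stream the emit happens inside library code, typically on a later tick, so a try/catch around the caller's health() call is not on the stack when it throws.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the health stream with no 'error' listener attached.
const unguarded = new EventEmitter();
let thrown = null;
try {
  // With no 'error' listener, emit() throws the error itself.
  unguarded.emit('error', new Error('write after end'));
} catch (err) {
  thrown = err;
}
console.log(thrown.message); // write after end

// Same event, but with a listener attached.
const guarded = new EventEmitter();
const received = [];
guarded.on('error', (err) => received.push(err.message));
// With a listener, emit() returns true and nothing is thrown.
const handled = guarded.emit('error', new Error('write after end'));
console.log(handled, received.length); // true 1
```

Once a listener exists, the code inside it decides what happens next: hand the error to a callback, or throw it again.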

Collaborator

Thanks for the detailed explanation. On point 2 I'd add an 'on' to the stream spy object so we can remove this check

Contributor Author

I added 'on' to tests and removed that type check

@steven-supersolid
Collaborator

The callback is a great idea - we may have to update some documentation too though, e.g. in https://github.com/googleforgames/agones/blob/main/site/content/en/docs/Guides/Client%20SDKs/nodejs.md

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 7a1ea6cb-79eb-4e69-a8f6-e46f7779d5d7

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-439a4ee-amd64

@XiaNi
Contributor Author

XiaNi commented Jun 30, 2022

> The callback is a great idea - we may have to update some documentation too though, e.g. in https://github.com/googleforgames/agones/blob/main/site/content/en/docs/Guides/Client%20SDKs/nodejs.md

The documentation is updated now too.

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 2a459c70-65d6-431c-8642-ea872efc47c7

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-f2a032e-amd64

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 740969af-0483-42c0-ae77-dc24edc06d55

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-497a039-amd64

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 4dda535d-8f76-4d34-a9bc-67216b3639c9

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2649/head:pr_2649 && git checkout pr_2649
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.25.0-4e8c00b-amd64

@markmandel markmandel merged commit 45e39b0 into googleforgames:main Jul 6, 2022
@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: markmandel, steven-supersolid, XiaNi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@SaitejaTamma SaitejaTamma added this to the 1.25.0 milestone Jul 26, 2022
Labels
approved, kind/cleanup (Refactoring code, fixing up documentation, etc), lgtm, size/M
6 participants