Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Check if missed heartbeat synchronously before sending message #5134

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

tjenkinson
Copy link

@tjenkinson tjenkinson commented Jul 15, 2024

Chrome (and maybe other browsers) will throttle setTimeout under certain conditions, causing them to slow down, and this breaks the heartbeat detection logic.

This fix means if a message is emitted when the connection should be considered expired (but the setTimeout hasn't fired yet), we close that socket and then buffer the message for the new connection. So now a message should never be sent over an expired connection.

This doesn't solve the problem of connections being detected as expired too late. I think that could be done separately.

The kind of change this PR does introduce

  • a bug fix
  • a new feature
  • an update to the documentation
  • a code change that improves performance
  • other

Current behavior

Check the repro on #5135

New behavior

  • When the next message is sent, the system will realise it's missed the heartbeat, create a new connection, and then the message will be sent over the new connection
  • So now a message will never be sent over a stale connection, and we're not depending on the heartbeat timeout to fire correctly when a page comes back to life (which doesn't happen when it's throttled)

Other information (e.g. related issues)

The reason that the heartbeat logic is not immediately closing the old connection when the page is woken up is due to chrome pausing the setTimeout that's used in the heartbeat logic. This PR doesn't fix that. I Think that could be handled separately.

#3507 (comment) is another fix related to throttled timers

Let me know if you think this makes sense and I'm happy to look into adding a test for it

Chrome (and maybe other browsers) will throttle `setTimeout` under certain conditions, causing them to slow down, and this breaks the heartbeat detection logic.

This fix means if a message is emitted when the connection should be considered expired (but the `setTimeout` hasn't fired yet), we close that socket and then buffer the message for the new connection. So now a message should never be sent over an expired connection.

This doesn't solve the problem of connections being detected as expired too late. I think that could be done separately.
@@ -440,6 +440,9 @@ export class Socket<
packet.id = id;
}

// check synchronously if we've missed a heartbeat, potentially causing a disconnection
this.io.checkHeartbeat()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about merging the checkHeartbeat() method with the isTransportWritable check below? Something like:

packages/engine.io-client/lib/socket.ts

get writable() {
  const expired = Date.now() >= this._pingTimeoutTime
  if (expired) {
    debug("`checkHeartbeat` detected missed ping so closing connection");
    this._onClose("ping timeout");
    return false;
  }

  return this.io.engine.transport?.writable;
}

packages/socket.io-client/lib/socket.ts

const isTransportWritable = this.io.engine.writable;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit strange that reading writable can have a side effect? Especially when it can cause events to be emitted as part of that

Maybe writable should return false when a heartbeat has been missed, but we should still have a function like checkHeartbeat (or maybe disconnectIfHeartbeatMissed()) that does the disconnection.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the checkHeartbeat() method, it feels like we are mixing two different layers. In my opinion, the Socket.IO client should not have to be aware of the internals of the Engine.IO client (up to a certain point, obviously).

That's why I suggested to merge the heartbeat check with the transport check below. Maybe not as a getter with side effects, but how about a method like socket.isReady()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I liked the writable idea but I think it was the synchronous disconnection as part of reading that that felt a bit strange.

How about:

  • adding isAlive/isResponsive getter which will check the last heartbeat timestamp, and if it’s false also schedule a disconnect on the next tick
  • Maybe update writable too to also check the last heartbeat time and schedule a disconnection
  • Instead of this.connected check the isAlive instead. (this.connected doesn’t work if the disconnect gets triggered on the next tick)

I think it feels better doing the disconnection on the next tick instead of sync as part of the read call, as then it’s the same pattern as if the heartbeat timer task fired anyway and there can’t be event listeners being called as part of the read

WDYT?


Or

We don't add isResponsive, and make writable do the check and queue the disconnect, but update the .connected check below to instead check .writable, so we always buffer if it's not writable. Not sure what impact this might have?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

adding isAlive/isResponsive getter which will check the last heartbeat timestamp, and if it’s false also schedule a disconnect on the next tick

That sounds reasonable 👍

Note: we already have a nextTick() method here.

We don't add isResponsive, and make writable do the check and queue the disconnect, but update the .connected check below to instead check .writable, so we always buffer if it's not writable. Not sure what impact this might have?

Not sure about this one, since writable is an attribute of the Transport class, while checking the heartbeat expiry sounds like the duty of the Socket class.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok great thanks will update to that first approach then

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@darrachequesne I've pushed the changes for this

@tjenkinson tjenkinson changed the title [WIP] fix: Check if missed heartbeat synchronously before sending message fix: Check if missed heartbeat synchronously before sending message Aug 9, 2024
@tjenkinson
Copy link
Author

Looking at CI maybe this needs to be 2 separate PR's?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants