Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to pass ssh socket heartbeat interval #193

Merged
merged 3 commits into from
Jul 20, 2019
Merged

Add ability to pass ssh socket heartbeat interval #193

merged 3 commits into from
Jul 20, 2019

Conversation

dstokes
Copy link
Contributor

@dstokes dstokes commented Jul 18, 2019

When running multiple concurrent deployments over SSH with ruby threads, a race
condition can occur where a docker operation reads all data from the
forwarded socket before net/ssh can check for readability on the socket
with IO.select. Setting an ssh.loop wait ensures that the Docker
operation block passed to ssh.loop is evaluated at least once per
wait.

@dstokes
Copy link
Contributor Author

dstokes commented Jul 19, 2019

The build failure for this PR is addressed in #194

When running multiple concurrent deployments over SSH with ruby threads, a race
condition can occur where a docker operation reads all data from the
forwarded socket before net/ssh can check for readability on the socket
with IO.select. Setting an ssh.loop `wait` ensures that the Docker
operation block passed to ssh.loop is evaluated at least once per
`wait`.

Co-authored-by: Christina Guida <cguida@newrelic.com>
Co-authored-by: Rob Day-Reynolds <rdayreynolds@newrelic.com>
Co-authored-by: Stephen Weber <sweber@newrelic.com>
@intjonathan
Copy link
Contributor

intjonathan commented Jul 19, 2019

a race condition can occur

If this is the case, why default to nil for the ssh_socket_heartbeat value?

What heartbeat value would set after this is merged for your deploy scripts? Can we give guidance as to when to set this parameter and to what value?

@dstokes
Copy link
Contributor Author

dstokes commented Jul 19, 2019

@intjonathan Great point. The condition we're seeing may be specific to the way Centurion is being used within concurrent-ruby thread pools. We're defaulting to nil for backwards compatibility, but we could certainly set a sane default like 30 seconds. We'll be using something closer to 5 seconds for our implementation.

@intjonathan
Copy link
Contributor

Much better :)

@intjonathan intjonathan merged commit f5f9812 into newrelic:master Jul 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants