Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set a connect_timeout for the creation of a socket #243

Merged
merged 2 commits into from
Jan 5, 2016

Conversation

astratto
Copy link
Contributor

Currently when an LDAP server is not reachable (for instance, packets are dropped), users of this library end up waiting for a quite long period of time. This is very rarely acceptable and can easily lead to a chain of failures with several services timing out before the socket creation actually fails.

This patch uses Socket.tcp instead of TCPSocket.new and provides a default connect_timeout of 1 second.

Notes:

  • I couldn't find a way in the integration tests to write a proper test for this. As far as I can tell, there are no tests at that level to validate connection failures, etc... Am I right?
  • I set the timeout to 1 sec, but probably a better way could be setting it to a higher value and provide an easier way to customise it?

@jch
Copy link
Member

jch commented Dec 17, 2015

@astratto interesting. It looks like Socket.tcp gained some options in 2.x that weren't available in 1.9.x. Since we've dropped support for 1.9, this is now possible.

As far as I can tell, there are no tests at that level to validate connection failures, etc... Am I right?

Yes. I don't know how to simulate a hanging connection. If you have ideas, I'd love to hear them.

I set the timeout to 1 sec, but probably a better way could be setting it to a higher value and provide an easier way to customise it?

Agree this should be parameterized. :connect_timeout seems like a good option name. A unit test we could write would be to verify that Socket.tcp is called with the parameter.

Thanks for opening this pull!

@astratto
Copy link
Contributor Author

As far as I can tell, there are no tests at that level to validate connection failures, etc... Am I right?
Yes. I don't know how to simulate a hanging connection. If you have ideas, I'd love to hear them.

I actually already have tests for our projects to trigger that situation (hence the PR 😄), I'm just not sure how to do that with the integration tests here.

Basic idea is to use a firewall to drop packets.
A simple iptables -A OUTPUT -j DROP -p tcp --dport $LDAP_PORT would do the trick in this case.

I set the timeout to 1 sec, but probably a better way could be setting it to a higher value and provide an easier way to customise it?

Agree this should be parameterized. :connect_timeout seems like a good option name. A unit test we could write would be to verify that Socket.tcp is called with the parameter.

Yes, I just wanted to understand your point of view on the general PR.
I'll try to work on that tomorrow, but feel free to jump on it if you have time.

Thanks for opening this pull!

Thanks for this gem 😉

@jch
Copy link
Member

jch commented Dec 17, 2015

The entry point for integration tests on travis is in https://github.com/ruby-ldap/ruby-net-ldap/blob/master/.travis.yml#L15-L17. This then runs script/install-openldap. I'm not sure whether travis supports iptables, but we could run your rule for the duration of one test and then clean up after that.

@astratto astratto force-pushed the connect_timeout branch 2 times, most recently from 01b8309 to a447c11 Compare December 18, 2015 16:20
@astratto
Copy link
Contributor Author

Hey @jch, I've just pushed a patch to expose :connect_timeout and :timeout.

I think it's worth trying to add some integration tests for connection errors in general, should we address that in another patchset?

@@ -31,10 +36,15 @@ def open_connection(server)
hosts = server[:hosts]
encryption = server[:encryption]

socket_opts = {
connect_timeout: server[:connect_timeout] || DefaultConnectTimeout,
timeout: server[:timeout] || DefaultTimeout
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the doc, Socket.tcp deals with only connect_timeout. Does timeout make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, it was there in 2.0.0... http://docs.ruby-lang.org/en/2.0.0/Socket.html#method-c-tcp
I'm gonna remove it 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone 😉

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well-done! 👍

@jch
Copy link
Member

jch commented Dec 18, 2015

I think it's worth trying to add some integration tests for connection errors in general, should we address that in another patchset?

👍 that's be amazing!

@@ -3,6 +3,9 @@
class Net::LDAP::Connection #:nodoc:
include Net::LDAP::Instrumentation

# Seconds before failing for socket connect timeout
DefaultConnectTimeout = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this value a bit more generous so it doesn't surprise any users with slow connections?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's 5 seconds now.
I doubt more than that would be of any use, but let me know if you want to bump it further.

@astratto
Copy link
Contributor Author

@jch I'd like to decouple this PR with the integration tests.
I've just created #244 to track that.

@astratto
Copy link
Contributor Author

Okay, it was actually straightforward given that the box is used by Travis has iptables available.
There's one integration test for this now.

@jch
Copy link
Member

jch commented Jan 4, 2016

@astratto looks like there's a travis error. Could we re-run it somehow?

@satoryu
Copy link
Collaborator

satoryu commented Jan 4, 2016

@astratto this failure seems like spotify/rspec-dns#26 .

so will be fixed by adding the following lines to .travis.yml.

before_install:
  - gem update bundler

@satoryu
Copy link
Collaborator

satoryu commented Jan 5, 2016

@jch @astratto I sent another PR #245 to fix this failure. please have a look :)

This patch prevents LDAP connections to hang up for an eccessive amount of time
and instead returns earlier in case of failures (e.g., packets dropped).

A new option is now exposed through Net::LDAP:
- connect_timeout: sets a timeout for socket#connect (defaults to 1s)

It also provides an integration test to validate the new behaviour (ruby-ldap#244)
@astratto
Copy link
Contributor Author

astratto commented Jan 5, 2016

@jch I've just rebased and now it's all green.

Thanks @satoryu

@@ -109,4 +109,7 @@ chgrp ssl-cert /etc/ssl/private/ldap01_slapd_key.pem
chmod g+r /etc/ssl/private/ldap01_slapd_key.pem
chmod o-r /etc/ssl/private/ldap01_slapd_key.pem

# Drop packets on a secondary port used to specific timeout tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 helpful comment

jch added a commit that referenced this pull request Jan 5, 2016
Set a connect_timeout for the creation of a socket
@jch jch merged commit 7b692fc into ruby-ldap:master Jan 5, 2016
@jch
Copy link
Member

jch commented Jan 5, 2016

Looks great. I especially appreciate you taking the time to add an integration test. 🎉

@astratto astratto deleted the connect_timeout branch January 5, 2016 18:26
This was referenced Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants