Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connect_timeout and inactivity_timeout not working with eventmachine 1.0.1 #222

Closed
kenn opened this issue Mar 8, 2013 · 11 comments
Closed

Comments

@kenn
Copy link

kenn commented Mar 8, 2013

Here's the source to reproduce the problem:

# gem 'eventmachine', '1.0.0'
require 'em-http-request'

puts EM::VERSION

EM.run do
  start_at = Time.now
  url = 'http://bot.lingr.com/delay/6'
  http = EventMachine::HttpRequest.new(url, :connect_timeout => 2, :inactivity_timeout => 2).head
  http.callback do
    puts "success! elapsed #{Time.now - start_at}"
    EM.stop
  end
  http.errback do
    puts "error! elapsed #{Time.now - start_at}"
    EM.stop
  end
end

The server returns a response in 6 seconds, while the timeout values are set an earlier point.

When I run this, I get

1.0.1
success! elapsed 6.280314

As you can see, it's not timing out and wait (indefinitely) until we get the response.

However, when I activate the gem pinning to 1.0.0, then run

1.0.0
error! elapsed 4.003471

So it times out after 4 seconds as expected.

Probably it's a regression in EventMachine 1.0.1 but I wasn't able to find out what change is causing this. Any advice on fix / workaround?

@igrigorik
Copy link
Owner

Nope, needs to fixed in EM. /cc @tmm1 ... Short of adding extra timers to handle this yourself, this is not something em-http should address.

@tmm1
Copy link
Contributor

tmm1 commented Mar 8, 2013

The only related commit in 1.0.1 is eventmachine/eventmachine@3c39365 (eventmachine/eventmachine#374). Can you try with and without that commit?

@kenn
Copy link
Author

kenn commented Mar 8, 2013

ok, I tried and confirmed that reverting eventmachine/eventmachine@3c39365 fixed the problem. So, it seems like a regression indeed.

@tmm1
Copy link
Contributor

tmm1 commented Mar 8, 2013

Interesting. If you use :connect_timeout => 2, :inactivity_timeout => 1, does it time out in 3 seconds?

@kenn
Copy link
Author

kenn commented Mar 8, 2013

Here's the result matrix - I find it's quite interesting indeed.

Note that I built "fixed" EM as 1.0.1.1, and used http://bot.lingr.com/delay/8 as the endpoint so that the server returns a response in 8 seconds.

connect_timeout inactivity_timeout    1.0.0     1.0.1    1.0.1.1
              1                  1    2.019553  success  2.040749
              1                  2    3.018771  success  3.003296
              1                  3    4.002432  success  4.003432
              1                  4    5.019328  success  5.017295
              2                  1    2.021929  success  2.003186
              2                  2    4.020711  success  4.003911
              2                  3    5.023549  success  5.020934
              2                  4    6.018697  success  6.271681
              3                  1    2.003149  success  2.022989
              3                  2    4.020217  success  4.003389
              3                  3    6.003739  success  6.019589
              3                  4    7.002843  success  7.003071

@tmm1
Copy link
Contributor

tmm1 commented Mar 8, 2013

Ok, this is clearly broken in 1.0.1. There's even a failing test on master. I'll try to come up with a solution that fixes the test and the original issue in eventmachine/eventmachine#374

See also #149

@tmm1
Copy link
Contributor

tmm1 commented Mar 8, 2013

I believe this patch will fix all the timeout weirdness. Can you try your combination test again?

diff --git a/ext/ed.cpp b/ext/ed.cpp
index 9ebd4b1..48f3c5c 100644
--- a/ext/ed.cpp
+++ b/ext/ed.cpp
@@ -460,8 +460,7 @@ ConnectionDescriptor::SetConnectPending
 void ConnectionDescriptor::SetConnectPending(bool f)
 {
        bConnectPending = f;
-       if (f == false && NextHeartbeat)
-               MyEventMachine->ClearHeartbeat(NextHeartbeat, this);
+       MyEventMachine->QueueHeartbeat(this);
        _UpdateEvents();
 }

tmm1 added a commit to eventmachine/eventmachine that referenced this issue Mar 8, 2013
this reverts a change made for #374, which inadvertently disabled inactivity
timeouts altogether.

instead we always QueueHeartbeat when bConnectPending changes, so
GetNextHeartbeat can calculate a timeout based on the new connection
state. i confirmed this fixes the broken test on master, and the original
"20s delay" bug (using the repro in #393).

this should also address the timeout stacking issue reported in
igrigorik/em-http-request#149, as well as the original report in
igrigorik/em-http-request#222
@kenn
Copy link
Author

kenn commented Mar 8, 2013

@tmm1 Good news - it worked! Also, the most accurate timeout ever - results are almost exactly the same as inactivity_timeout. 👍

connect_timeout inactivity_timeout    1.0.0     1.0.1    1.0.1.1   1.0.1.2
              1                  1    2.019553  success  2.040749  1.134821
              1                  2    3.018771  success  3.003296  2.100689
              1                  3    4.002432  success  4.003432  3.098528
              1                  4    5.019328  success  5.017295  4.095405
              2                  1    2.021929  success  2.003186  1.106963
              2                  2    4.020711  success  4.003911  2.130382
              2                  3    5.023549  success  5.020934  3.09778
              2                  4    6.018697  success  6.271681  4.099069
              3                  1    2.003149  success  2.022989  1.122953
              3                  2    4.020217  success  4.003389  2.097198
              3                  3    6.003739  success  6.019589  3.096631
              3                  4    7.002843  success  7.003071  4.098878

@tmm1
Copy link
Contributor

tmm1 commented Mar 8, 2013

Excellent. I've released EM v1.0.2.

@igrigorik
Copy link
Owner

Awesome - great job guys! I take it, we can close this bug then? :)

@kenn
Copy link
Author

kenn commented Mar 8, 2013

@tmm1 @igrigorik I've just deployed EM v1.0.3 to production and confirmed the problems is fixed for real. Thank you!

@kenn kenn closed this as completed Mar 8, 2013
jperkin pushed a commit to TritonDataCenter/pkgsrc-legacy that referenced this issue Dec 9, 2013
# Changelog

## 1.0.x
* EM.system was broken in 1.0.2 release [#413]

## 1.0.2 (March 8, 2013)
* binary win32 gems now include fastfilereader shim [#222]
* fix long-standing connection timeout issues [27fdd5b,
  igrigorik/em-http-request#222]
* http and line protocol cleanups [#193, #151]
* reactor return value cleanup [#225]
* fix double require from gemspec [#284]
* fix smtp server reset behavior [#351]
* fix EM.system argument handling [#322]
* ruby 1.9 compat in smtp server and stomp protocols [#349, #315]
* fix pause from post_init [#380]

## 1.0.1 (February 27, 2013)
* use rb_wait_for_single_fd() on ruby 2.0 to fix rb_thread_select() deprecation
* fix epoll/kqueue mode in ruby 2.0 by removing calls to rb_enable_interrupt()
  [#248, #389]
* fix memory leak when verifying ssl cerificates [#403]
* fix initial connection delay [#393, #374]
jperkin pushed a commit to TritonDataCenter/pkgsrc-legacy that referenced this issue Mar 14, 2014
# Changelog

## 1.0.x
* EM.system was broken in 1.0.2 release [#413]

## 1.0.2 (March 8, 2013)
* binary win32 gems now include fastfilereader shim [#222]
* fix long-standing connection timeout issues [27fdd5b,
  igrigorik/em-http-request#222]
* http and line protocol cleanups [#193, #151]
* reactor return value cleanup [#225]
* fix double require from gemspec [#284]
* fix smtp server reset behavior [#351]
* fix EM.system argument handling [#322]
* ruby 1.9 compat in smtp server and stomp protocols [#349, #315]
* fix pause from post_init [#380]

## 1.0.1 (February 27, 2013)
* use rb_wait_for_single_fd() on ruby 2.0 to fix rb_thread_select() deprecation
* fix epoll/kqueue mode in ruby 2.0 by removing calls to rb_enable_interrupt()
  [#248, #389]
* fix memory leak when verifying ssl cerificates [#403]
* fix initial connection delay [#393, #374]
jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Oct 11, 2014
# Changelog

## 1.0.x
* EM.system was broken in 1.0.2 release [#413]

## 1.0.2 (March 8, 2013)
* binary win32 gems now include fastfilereader shim [#222]
* fix long-standing connection timeout issues [27fdd5b,
  igrigorik/em-http-request#222]
* http and line protocol cleanups [#193, #151]
* reactor return value cleanup [#225]
* fix double require from gemspec [#284]
* fix smtp server reset behavior [#351]
* fix EM.system argument handling [#322]
* ruby 1.9 compat in smtp server and stomp protocols [#349, #315]
* fix pause from post_init [#380]

## 1.0.1 (February 27, 2013)
* use rb_wait_for_single_fd() on ruby 2.0 to fix rb_thread_select() deprecation
* fix epoll/kqueue mode in ruby 2.0 by removing calls to rb_enable_interrupt()
  [#248, #389]
* fix memory leak when verifying ssl cerificates [#403]
* fix initial connection delay [#393, #374]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants