Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider making Timeout option required or have a default #3070

Closed
mlissner opened this issue Mar 29, 2016 · 62 comments · May be fixed by #6709
Closed

Consider making Timeout option required or have a default #3070

mlissner opened this issue Mar 29, 2016 · 62 comments · May be fixed by #6709

Comments

@mlissner
Copy link
Contributor

mlissner commented Mar 29, 2016

I have a feeling I'm about to get a swift education on this topic, but I've been thinking about the pros/cons of changing requests so that somehow there is a timeout value configured for every request.

I think there are two ways to do this:

  1. Provide a default value. I know browsers have a default, so that may be a simple place to begin.
  2. Make every user configure this in every request -- bigger API breakage. Probably not the way to go.

The reason I'm thinking about this is because I've used requests for a few years now and until now I didn't realize the importance of providing a timeout. It took one of my programs hanging forever for me to realize that the default here isn't really very good for my purposes. (I'm in the process of updating all my code...)

I also see that a lot of people want Session objects to have a timeout parameter, and this might be a way to do that as well.

If a large default were provided to all requests and all sessions, what negative impact would that have? The only thing I can think of is that some programs will get timeout exceptions where they previously hung, which seems like an improvement to me.

Caveat, added May 13, 2016:

Please don't use this issue to discuss adding a timeout attribute to requests. There are a number of discussions about this elsewhere (search closed issues), and we don't want that conversation to muddy this issue too. Thanks.

@Lukasa
Copy link
Member

Lukasa commented Mar 29, 2016

This is an entirely reasonable suggestion.

Honestly, I'm not averse to doing it: there are definitely worse things to do than this. I'd be open to providing a default timeout. However, I don't think we can do it until 3.0.0.

Of course @kennethreitz as keeper of the spirit of requests has got the final say on this, and should definitely weigh in.

@sigmavirus24
Copy link
Contributor

Changing the default would definitely need to wait until 3.0.0.

I'm not against having a well-reasoned (and therefore, also well-documented) default timeout parameter.

I'm still pretty strongly against sessions having a timeout attribute though and I think that shouldn't muddy the waters of this discussion because it will sidetrack things and prevent an otherwise productive discussion.

@mlissner
Copy link
Contributor Author

@sigmavirus24 I don't know the argument pro/con sessions having a timeout, but I'm happy to defer to other's judgement and experience on that, keeping these waters clean, like you say.

I'm just happy that I haven't gotten a swift education about timeouts.

@justanr
Copy link

justanr commented Apr 3, 2016

I'm definitely in favor of this in light of dealing with a package that doesn't provide a way to set a timeout, but does accept a fully configured session. My current fix is a specialized HTTPAdapter that sets a default timeout if none is provided:

class TimeoutHTTPAdapter(HTTPAdapter):
    def __init__(self, timeout, *args, **kwargs):
        self._timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, timeout=False, ...):
        if timeout is None:
            timeout = self._timeout
        return super().send(request, timeout=timeout, ...)

I'd much prefer something like:

session = Session(timeout=...)

# or even

session.default_timeout = ...

However, this current fix isn't too cumbersome either.

@kuraga
Copy link

kuraga commented May 13, 2016

👍 for session = Session(timeout=...). Would you merge a patch?

@Lukasa
Copy link
Member

Lukasa commented May 13, 2016

@kuraga No. Per @sigmavirus24, and in many many previous discussions:

I think I could actually still implement it using hooks - create a method which uses the prepared request workflow, and in the hook, just call it again. What would be your suggested solution?

@kuraga
Copy link

kuraga commented May 13, 2016

@Lukasa ok, but which discussion did you cite? Was it private? Which previous discussions?

@Lukasa
Copy link
Member

Lukasa commented May 13, 2016

Argh, sorry, copy-paste fail:

I'm still pretty strongly against sessions having a timeout attribute though and I think that shouldn't muddy the waters of this discussion because it will sidetrack things and prevent an otherwise productive discussion.

@sigmavirus24
Copy link
Contributor

@kuraga please search closed issues. There are several discussions of this. I really don't want this diversion to distract from the topic of a default timeout though. So can we please stop discussing this now as I asked nicely before. You and @justanr are distracting from the important and attainable portion of this issue.

@mlissner
Copy link
Contributor Author

@sigmavirus24 I added a caveat to my initial comment, above. Hopefully that should get this discussion focused.

@pyhedgehog
Copy link

I'm (as a user) slightly against this proposal.
There are following caveats:

  1. There should be several different time-outs: resolve, connect, request, read. What should be defaults for each of them?
  2. If there will be default read time-out, then (for streaming interfaces) there should also be keep-alive packets. Is this implemented?
  3. There must be way to override default and disable time-outs completely.

What I've missed?

@chris-martin
Copy link
Contributor

chris-martin commented Nov 11, 2016

At the very least, how about we include the timeout arg in the first introductory examples in the documentation? It's bad when intro examples are subtly wrong for the sake of brevity.

@Lukasa
Copy link
Member

Lukasa commented Nov 12, 2016

It's extremely unclear to me what "wrong" means here. They aren't wrong: they work, as designed. The introductory examples are all clearly typed into the Python interactive interpreter, so they have a timeout: the user sitting there can hit ^C anytime they like.

I'd welcome enhancements to the "timeouts" section of the documentation to be more emphatic, that certainly does seem reasonable. But I don't think we need to go through and add the timeout arg.

@mlissner
Copy link
Contributor Author

It's extremely unclear to me what "wrong" means here. They aren't wrong: they work, as designed.

I think the point that's generally acknowledged by this bug is that the design was wrong. There's a general consensus here that adding a default timeout or requiring it as an argument is a good idea. The docs could address that, and that seems like a simple step to take until this issue is resolved in the next major version.

What happens in practice is that people grab the examples from the docs, don't read the timeout section carefully (or don't understand its implications -- I didn't for years until I got bit), and then wind up with programs that can hang forever. I completely agree with @chris-martin that until this issue is fixed, all examples in the docs should provide the timeout argument. Otherwise, we're providing examples that can (and probably will) break your programs.

@Lukasa
Copy link
Member

Lukasa commented Nov 13, 2016

There is a substantial difference between "is a good idea in production code" and "should be used in every example". For example, all production code should use Sessions for everything, but the documentation doesn't do that because it doesn't help teach the specific lessons that the documentation is intended to teach.

While I appreciate the concern that users will blindly copy code out of the documentation without further consideration, anyone doing that in their product is necessarily going to have sub-par results. This is true in all programming tools.

@mlissner
Copy link
Contributor Author

That's fair enough, @Lukasa. What about making the Timeout section more explicit then? Right now it says:

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter.

And then goes on with a long, somewhat complicated warning. Could the first part say:

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely:

Something like that?

@Lukasa
Copy link
Member

Lukasa commented Nov 14, 2016

Yup, I'd be happy to merge a PR with that change in it. 😄

@sigmavirus24
Copy link
Contributor

I think that we've addressed this through documentation. I don't think any of the core team is going to change the API to require a timeout and I think we've done our diligence here.

@mlissner
Copy link
Contributor Author

mlissner commented Jul 30, 2017

I doubt this will change anybody's mind, but I vehemently disagree with closing this bug without a fix. We improved the documentation via the PR that I filed, but in practice I regularly run into programs that hang because they do not have a timeout. It's gotten bad enough that it's one of the first things I think of when a program hangs.

Requests is a wonderful library but you can't document your way out of a problem like this. Look at the list of issues referencing this one. Documentation is just not enough, or else issues would stop referencing this one.

I respect the requests maintainers greatly, but I hope you'll reconsider closing this. If it causes API changes, that's what version bumps are for. But I'm not even sure API changes would be needed.

Please reconsider.

@kennethreitz
Copy link
Contributor

We could consider making a default in 3.0, but I don't know what it would be... 120s?

Idk, I like our current design. You really shouldn't be hitting the internet without proper timeouts in place ever in production. Any sane engineer knows that — it's not Requests' job to do your job for you.

@chris-martin
Copy link
Contributor

chris-martin commented Jul 31, 2017

Here are a few example idle timeout defaults from other libraries, in case that helps:

@tobixen
Copy link

tobixen commented Aug 5, 2022

What if there was a way to set a default using env variables. Then people could opt in to the default, and it would be easier to also fix programs in production without having to dig through all code looking for libraries that might be using requests without the default. I was thinking something similar to how the http proxy has env variables.

This one got quite some negative thumbs, but I think it's a good idea. The requests library already relies on the environment, i.e. it honors a .netrc and it honors proxy settings - and it has a trust_env parameter that can be set to False to disregard environment settings.

Such a solution may empower quite some end-users. Some of the problem with this issue is that there is no strong consensus on what the default timeout should be - but this problem persists even if it's pushed to the users of the library. Like, I have my caldav library, and I could consider setting some default timeout there - but then I have the same issue, what would the sane timeout be? This library is again used by other libraries and utilities. Sometimes the end user is the one to learn if the timeout setting is too low or too high. The timeout may be passed through the full chain and be specified by a configuration file for each tool out there - but it adds a lot of complexity. Putting an environment variable there is a simple solution that works for (nearly) all projects.

A default timeout given through environment variables may solve quite some problems. It is not a breaking change, users relying on the default timeout to be infinite won't be affected - hence it can be rolled out whenever and without DeprecationWarnings. However, it does not solve this issue. I strongly believe that the default should be set to something else than "infinite" in the long-term future. I've used the requests library a lot, and I always thought that such a high-level library would come with some kind of default timeout ... first time I encountered this issue my script had been hanging for days. I think even a very high default timeout, like 24h, is better than infinite.

ccwienk added a commit to gardener/cc-utils that referenced this issue Jan 5, 2024
As suggested by `requests`'s documentation [0], define time-outs for
http-requests. This change was inspired/suggested by bandit [1][2]. This
change addresses all findings of `medium` severity. In addition, switch
to using `requests.sessions.Session` where reasonable (this is the case
if there is a reasonable likelihood for a session to actually be used
more than once). Also, refactor some quirky code.

Considerations w.r.t. redundantly defining timeout parameter
------------------------------------------------------------

As discussed on `requests`'s GitHub-Repository (e.g. at [3]), there is
no means (except, of course, through "monkey-patching") to define a
default timeout, neither globally, nor e.g. for a session. While it is
an option to implement a custom transport-adaptor (and use it to inject
timeout parameter into issued requests), this approach seems rather
heavyweight, and will add a lot of boilerplate code.

Therefore, it seems like the best available option to pass `timeout`
parameter to each individual invocation of `requests`'s
"request-issueing-functions" (request, get, put, ...).

One might feel the urge to deduplicate the specified timeout values.
However, there are reasons against this:
- will also add boilerplate (one extra import stmt)
- it is unlikely that timeouts will ever (have to) be changed (more
  likely: specific individual timeouts might require "tuning" - which is
  very easy with the chosen approach)

Therefore - assuming the chosen de-facto default timeout value proves to
be "good enough" - it seems like an adequate choice to redundantly
define timeout values.

[0]
https://requests.readthedocs.io/en/latest/user/advanced/#timeouts
[1]
https://bandit.readthedocs.io/en/1.7.6/plugins/b113_request_without_timeout.html
[2]
https://bandit.readthedocs.io/
[3]
psf/requests#3070
ccwienk added a commit to gardener/cc-utils that referenced this issue Jan 8, 2024
As suggested by `requests`'s documentation [0], define time-outs for
http-requests. This change was inspired/suggested by bandit [1][2]. This
change addresses all findings of `medium` severity. In addition, switch
to using `requests.sessions.Session` where reasonable (this is the case
if there is a reasonable likelihood for a session to actually be used
more than once). Also, refactor some quirky code.

Considerations w.r.t. redundantly defining timeout parameter
------------------------------------------------------------

As discussed on `requests`'s GitHub-Repository (e.g. at [3]), there is
no means (except, of course, through "monkey-patching") to define a
default timeout, neither globally, nor e.g. for a session. While it is
an option to implement a custom transport-adaptor (and use it to inject
timeout parameter into issued requests), this approach seems rather
heavyweight, and will add a lot of boilerplate code.

Therefore, it seems like the best available option to pass `timeout`
parameter to each individual invocation of `requests`'s
"request-issueing-functions" (request, get, put, ...).

One might feel the urge to deduplicate the specified timeout values.
However, there are reasons against this:
- will also add boilerplate (one extra import stmt)
- it is unlikely that timeouts will ever (have to) be changed (more
  likely: specific individual timeouts might require "tuning" - which is
  very easy with the chosen approach)

Therefore - assuming the chosen de-facto default timeout value proves to
be "good enough" - it seems like an adequate choice to redundantly
define timeout values.

[0]
https://requests.readthedocs.io/en/latest/user/advanced/#timeouts
[1]
https://bandit.readthedocs.io/en/1.7.6/plugins/b113_request_without_timeout.html
[2]
https://bandit.readthedocs.io/
[3]
psf/requests#3070
ccwienk added a commit to gardener/cc-utils that referenced this issue Jan 8, 2024
As suggested by `requests`'s documentation [0], define time-outs for
http-requests. This change was inspired/suggested by bandit [1][2]. This
change addresses all findings of `medium` severity. In addition, switch
to using `requests.sessions.Session` where reasonable (this is the case
if there is a reasonable likelihood for a session to actually be used
more than once). Also, refactor some quirky code.

Considerations w.r.t. redundantly defining timeout parameter
------------------------------------------------------------

As discussed on `requests`'s GitHub-Repository (e.g. at [3]), there is
no means (except, of course, through "monkey-patching") to define a
default timeout, neither globally, nor e.g. for a session. While it is
an option to implement a custom transport-adaptor (and use it to inject
timeout parameter into issued requests), this approach seems rather
heavyweight, and will add a lot of boilerplate code.

Therefore, it seems like the best available option to pass `timeout`
parameter to each individual invocation of `requests`'s
"request-issueing-functions" (request, get, put, ...).

One might feel the urge to deduplicate the specified timeout values.
However, there are reasons against this:
- will also add boilerplate (one extra import stmt)
- it is unlikely that timeouts will ever (have to) be changed (more
  likely: specific individual timeouts might require "tuning" - which is
  very easy with the chosen approach)

Therefore - assuming the chosen de-facto default timeout value proves to
be "good enough" - it seems like an adequate choice to redundantly
define timeout values.

[0]
https://requests.readthedocs.io/en/latest/user/advanced/#timeouts
[1]
https://bandit.readthedocs.io/en/1.7.6/plugins/b113_request_without_timeout.html
[2]
https://bandit.readthedocs.io/
[3]
psf/requests#3070
@sethmlarson sethmlarson modified the milestones: 3.0.0, Bankruptcy May 19, 2024
@sethmlarson
Copy link
Member

In an effort to clean up the issue tracker to only have issues that are still relevant to the project we've done a quick pass and decided this issue may no longer be relevant for a variety of potential reasons, including:

  • Applies to a much older version, unclear whether the issue still applies.
  • Change requires a backwards incompatible release and it's unclear if the benefits are worth the migration effort from the community.
  • There isn't a clear demand from the community on the change landing in Requests.

If you think the issue should remain open, please comment so below or open a new issue and link back to the original issue. Again, thank you for opening the issue and for the discussion, it's much appreciated.

@mlissner
Copy link
Contributor Author

mlissner commented May 20, 2024

I still feel strongly that requests needs a default timeout. I don't think that has changed or been fixed, but I recognize that nobody wants to pick it up and run with it.

It could cause some minor backwards incompatibility, but fixing this would take requests from freezing your program when used incorrectly to crashing it when used incorrectly, which I consider a fix, not breaking compatibility.

I can't get too revved up about this because I'm not the one fixing it, but I do find the absence of a default timeout to be a source of bugs about 20% of the time I see requests in the wild.

@sigmavirus24
Copy link
Contributor

I find made up percentages pulled from no real data to be a source of wasted time 90% of the time

@mlissner
Copy link
Contributor Author

Well, I think you know what I mean: I see people forgetting to use the timeout parameter pretty regularly, and when they do it's a bug. No, I don't have metrics on this, but would you agree it's an issue in the wild?

@mschfh
Copy link

mschfh commented May 20, 2024

I have seen the lack of a default timeout in requests cause multiple production outages, including due to third-party libraries from major Cloud Providers not using it correctly.

This caused more incidents than there are use cases for not having a timeout (and in this case it can be disabled explicitly).

As noted above (almost 7 years ago), this completely disregards the practical needs of users (especially for a library that claims to abstract complexity):

I think that the current situation kind-of contradicts the stated project goal "HTTP Requests for Humans".

Humans need sensible defaults. [..] And "no timeout" is almost never a sensible default.

@earonesty
Copy link

maybe another solution would be to warn if there's no timeout, so people can be harassed into providing one, but the default is still the backward compatible "randomly hang forever if one is not provided"

@mlissner
Copy link
Contributor Author

Earlier in the thread we discussed adding a warning (it's welcome), and after seven years, I don't think anybody in this issue has identified why they would want no timeout at all. The closest we've come is a concern about:

the impact it would have on [...] existing infrastructure.

That's it, I think. I don't think this need to be a controversial fix.

My personal suggestion would be to implement a fix and add a release note (example below). To me, this is broken functionality, and a default timeout fixes it for users. Maybe I'm missing something, but I don't think I am.

If we want to be more conservative, we can take additional steps which add more work, but give people more time and opportunity to adapt:

  • Add a deprecation warning for some period of time first.
  • Phase in the timeout starting at 3600s, then 1800s, then 900s, then 450s...until we get to 60s.

But I think all of that is unnecessary and that we can just take a middle path:

  1. In the next version: Add a warning if no timeout is provided.

    The one mentioned above has a blessing of a maintainer:

    DeprecationWarning: No timeout provided for request. Starting in version XYZ, the requests library will provide a default timeout for such requests. For more information, see: https://github.com/psf/requests/issues/3070

  2. Release the next version with that warning.

  3. Add a long default timeout of 60s to match Ruby and Scala.

  4. Add a release note that says:

    Fix: Default timeouts of 60 seconds have been provided to all requests. Programs that previously hung indefinitely or connected after extremely long periods of time will now crash with requests.exceptions.TimeoutException instead. To upgrade these programs, provide a longer timeout via the timeout parameter to the get, post, head, and options methods.

  5. Release another version.

I really think that would help the vast majority of users and that whatever breakage it caused would be minimal. (Happy to hear contrarian views though!)

@sigmavirus24
Copy link
Contributor

Bandit already warns on lack of a timeout. Anyone with any real care about security of their code should be using it as a static analysis tool.

As for a warning Requests emits, people already silence all warnings from the library so they likely wouldn't see a warning about not specifying a timeout

@mlissner
Copy link
Contributor Author

mlissner commented May 21, 2024

That's good news, right? That means that the most serious users aren't a concern, and we only have to worry about folks that are less serious, where any breaking impact — if any — would be less of a big deal.

I agree most people don't see warnings, so maybe there's not much point in doing them (I know I don't). If that's the case, I still think the upside of just adding a default timeout in the next release is an OK way to go.

I appreciate you engaging in this discussion again. If a default timeout is added in the next release, who do you think would be worse off? After all these years and so much engagement, I still can't think of anybody — is this unnecessary conservatism?

@mschfh
Copy link

mschfh commented May 21, 2024

There was a PR to add warnings (unfortunately it was rejected for various reasons): #5434 (comment)
An env var to suppress the warning seems like a reasonable solution to address the log pollution concern.

The rationale for rejecting the PR actually underscores the need for the warning and a default timeout, as it acknowledges how widespread the issue is (if those projects would be setting proper timeouts, there wouldn't be any warnings):

It produces a considerable amount of noise in people's systems and we've generally stopped using warnings due to how often Requests is bundled in other projects

This is precisely what needs to be addressed, tracking this type of issue down in third-party libraries is difficult, it won't emit any logs, reproducability is limited and it often ends up requiring attaching gdb for a backtrace (which is arguably not what a "simple" http library should expect from the end user).

@mlissner
Copy link
Contributor Author

mlissner commented May 21, 2024

I don't want to pile on, and I know I'm chiming in a lot, but my passion for this issue stems from when I wrote celery tasks that used requests without a timeout. Over time, my celery workers would all slowly get mysteriously tied up, and I'd have to restart Celery. It took me forever to figure out why because I thought it was a celery bug. Nope! I'd have 1000× preferred if my code crashed or timed out.

@tobixen

This comment has been minimized.

@psf psf locked and limited conversation to collaborators May 21, 2024
sigmavirus24 added a commit to sigmavirus24/requests that referenced this issue May 21, 2024
This adds a default connect and read timeout value for all usage of
Requests. This is to solve a long-standing issue where some systems do
not have a sufficiently low default value.

Personally, I'd want these values to be much lower, but a 10 second
connection timeout and a 30 second read timeout seem like they should be
enough to avoid problems for the edge cases of users while also not
being so large that they're basically ineffective.

Closes psf#3070
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.