This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

Made command timeouts configurable #1028

Closed
wants to merge 2 commits into from

Conversation


@ereOn ereOn commented Aug 17, 2017

What does this do / why do we need it?

This PR allows users to override the various timeouts currently hard-coded within dep.

We noticed that some of our developers constantly hit the 10s default command timeout when running dep ensure -update in our current development environment. Other developers with the same environment manage to get dep ensure to work, but we suspect they are very close to the 10s mark as well, since overall command execution is slow. It is unclear exactly which commands take so long.

This happens mostly on Windows.

One solution would be to raise the default timeout, but what "magic" value should we use? I suspect that even if we raised it to 30s, other people on even slower systems would still hit the issue. Hence the rationale for making it overridable with environment variables.

What should your reviewer look out for in this PR?

It should not break the current behavior.

If a user sets DEP_EXPENSIVE_CMD_TIMEOUT or DEP_DEFAULT_CMD_TIMEOUT, dep honors these timeouts accordingly.

Do you need help or clarification on anything?

Nope.

Which issue(s) does this PR fix?

None.

@ereOn ereOn requested a review from sdboyer as a code owner August 17, 2017 17:58
@googlebot
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@ereOn
Author

ereOn commented Aug 17, 2017

I signed it!

@googlebot
Collaborator

CLAs look good, thanks!

Collaborator

@carolynvs carolynvs left a comment


Yay! This is sorely needed, thank you for submitting this! 💖

README.md Outdated
@@ -234,6 +234,15 @@ You might, for example, include a constraint in your manifest that specifies `ve
* `0.2.3` becomes the range `>=0.2.3, <0.3.0`
* `0.0.3` becomes the range `>=0.0.3, <0.1.0`

## Timeouts

`dep ensure` can sometimes fail on very slow systems due to underlying operations taking too long to perform. This is especially true on Windows due to filesystem issues.
Collaborator

It's not just ensure that uses these timeouts. For example, init triggers a ton of clones into the source cache, which can time out.

Not sure we need to call out Windows specifically, especially since we aren't linking to a known issue or anything. I've seen timeouts on all OSes when using dep. 😀

Author

You are right that the bit about Windows isn't necessary. I removed it.

README.md Outdated

If the built-in timeouts are not suitable to your environment, you can override those by setting specific environment variables:

* `DEP_EXPENSIVE_CMD_TIMEOUT`: Set the maximum number of seconds expensive operations can take. The default is 2 minutes (120 seconds).
Collaborator

It would help to give at least one example of what an expensive or common operation may be. Otherwise it is hard to tell which one needs to be bumped.

I know that essentially anything that hits the network, such as cloning a repository, is expensive; everything else, like getting revision information, is considered default.

Author

Truth be told, I have no idea how to know for sure which operations are the ones that time out.

dep's verbose mode doesn't tell us exactly what it does or for how long it tries. Perhaps it'd be a good idea to have a debug flag of some sort that gives that kind of information.

What do you think?

Collaborator

Can we include more information in the error message when the command times out, telling them which flag they can set to override the timeout? That way it's not hidden in verbose and shows up only when it's needed (when the timeout is hit).

seconds, err := strconv.Atoi(os.Getenv(key))

if err != nil {
	return def
Collaborator

This gives no indication to the user when their environment variable is set but isn't convertible.

Author

Correct. It might be confusing.

I don't mind making this more user-friendly. However, I will put that on hold until we have consensus that environment variables are indeed the way to go.

const (
	// envExpensiveCmdTimeout is the environment variable to read for the
	// expensiveCmdTimeout default, in seconds.
	envExpensiveCmdTimeout = "DEP_EXPENSIVE_CMD_TIMEOUT"
Collaborator

Currently dep isn't relying upon any environment variables (other than in tests). I'm not sure if we want to start doing that, instead of continuing to use flags? I'll let @sdboyer have the final say on that though.

If we do stick with environment variables, they should be documented in the help text too.

Author

@ereOn ereOn Aug 24, 2017

I actually felt uncomfortable adding those environment variables when no others were currently used.

I went with those anyway because:

  • Timeouts are not something I'd like to set on each invocation of dep. If an operation takes a long time on my machine, this is likely to happen often - or even every time.
  • It somehow felt like leaking an implementation detail: if we end up having 15 sorts of timeouts in dep and all of those have to be exposed as flags (part of the CLI API so to speak), things will get ugly fast.

Perhaps a configuration file of some sort (like git has) would serve us better in that regard.

That being said, when in Rome, I'll do as the Romans do, so feel free to confirm or reject my decisions and I'll happily update my PR!

dep is awesome :)

Collaborator

That's a good point, I hadn't thought of that! Personally I'm fine with the environment variable. Let's move forward with that for now. 👍

The next step is to get some tests that exercise changing the timeout. I don't know how easy/hard that will end up being so if you find yourself going down a rabbit hole, post back here and we can think about it more!

@sdboyer
Member

sdboyer commented Aug 25, 2017

(peeking out from vacation)

so, I agree that env vars are the generally correct way of doing this. but there are some caveats.

the first is one that @ereOn noted - this isn't a terribly scalable approach to handling this problem. ideally, I would prefer that we improve our activity detection mechanism so as to render custom configuration unnecessary, but that's a Hard Problem™, so it's best to assume there's a ceiling on what we can achieve there, and that it may not be adequate.

the other problem is more mundane software design: this makes gps directly aware of its environment. in general, we try very hard to encapsulate gps, requiring the implementing application (here, dep) to pass on all configuration options, like this one.

the problem with that is, there's no vector for this information to make it into gps right now. the only way we could do it, realistically, is to use context.Context values. I generally find those repugnant, but this situation might merit them. I need to cogitate on that a little more.

I suspect I'll talk myself into it, though, which means it's worth following @carolynvs' advice and thinking about tests for this 😁

@sdboyer
Member

sdboyer commented Sep 6, 2017

bit of an update - i think we're actually just going to get rid of these timeouts in a lot of cases. they're doing more harm than good. it's a minor refactor, but not awful.

@sdboyer
Member

sdboyer commented Sep 11, 2017

yeah, gonna close this in favor of #1110. thanks for the attempt at this, anyway!

@sdboyer sdboyer closed this Sep 11, 2017