Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubernetes_connect: add timeout settings #10

Merged
merged 1 commit into from
May 15, 2017

Conversation

cben
Copy link
Contributor

@cben cben commented May 1, 2017

Ability to constrol kubeclient timeouts from settings.yml

openshift half: ManageIQ/manageiq-providers-openshift#8 unnecessary on master given ManageIQ/manageiq-providers-openshift#7, but will add it in backports.

Tested:

  • Exercised in rails console and refresh, events, SSA in manageiq, confirmed low enough values cause timeouts.
  • core manageiq tests passed locally with this, but I don't think they cover anything relevant.

https://bugzilla.redhat.com/show_bug.cgi?id=1440950
@miq-bot add-label euwe/yes, fine/yes (actually we plan a 5.7.1 hotfix but I suppose we also want normal backports, so customer uprading will not lose hotfix functionality?)

@moolitayer @agrare Please review.

@miq-bot
Copy link
Member

miq-bot commented May 1, 2017

@cben Cannot apply the following labels because they are not recognized: fine/yes (actually we plan a 5.7.1 hotfix but i suppose we also want normal backports, so customer uprading will not lose hotfix functionality?)

@cben
Copy link
Contributor Author

cben commented May 1, 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1440950
@miq-bot add-label euwe/yes, fine/yes

@moolitayer
Copy link

@cben will our gem dependency in euwe bring in the new gem automatically or will we need to upgrade it?

Copy link
Member

@agrare agrare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cben
Copy link
Contributor Author

cben commented May 1, 2017

we have strict =2.3.0 on euwe, fine and master, will need a gems-pending PR and backport it.

@agrare
Copy link
Member

agrare commented May 8, 2017

@cben any luck getting a new gem released?

@simon3z
Copy link
Contributor

simon3z commented May 10, 2017

@cben any luck getting a new gem released?

@cben @agrare gem is released here: https://rubygems.org/gems/kubeclient/versions/2.4.0

@cben cben changed the title [WIP] kubernetes_connect: add timeout settings kubernetes_connect: add timeout settings May 10, 2017
@miq-bot miq-bot removed the wip label May 10, 2017
@cben cben closed this May 10, 2017
@cben cben reopened this May 10, 2017
@cben cben mentioned this pull request May 10, 2017
@@ -3,6 +3,8 @@
:ems_kubernetes:
:event_handling:
:event_groups:
:open_timeout: 60.seconds
:read_timeout: 60.seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cben @agrare wasn't the default 2 minutes? Are we shortening this timeout by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default read_timeout in ruby / restclient was always 60. (open_timeout was infinite in ruby 2.2 but afaict euwe appliance was already ruby 2.3)

I was never able to account for the "2 minutes" number.

  • "60 + 60 = 120" is not a plausible explanation, doesn't take a minute to establish tcp + tls.
  • Curl showed server takes >2min but that doesn't mean anything, client may timeout after 1min.
  • IIRC the source for us believing we timeout at 2 minutes is log lines ~2min apart. But we don't have per-request log lines, we have something like "start of refresh – error = 2min"... there are many requests before images.

Copy link
Contributor Author

@cben cben May 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Darn. Customer log strongly suggests timeout was 2min.
It contains >70 timeouts, and times are very stable: 9–12sec from first connect (/api) to second connect (/oapi) then 126–128sec to timeout.
This agrees perfrect with per-request timing in VCR: all /api requests total 11sec, /oapi requests without images total 7sec.

Gonna simulate/reproduce a slow server and measure actual timeout before & after patch...

Copy link
Contributor Author

@cben cben May 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=> It takes 2 minutes on euwe with old kubeclient too. Nothing changed here.
The reason turned out to be that ruby's Net::HTTP unconditionally retries requests that are supposed to be idempotent (e.g. GET, DELETE but not POST)
[ankane/the-ultimate-guide-to-ruby-timeouts#8, https://bugs.ruby-lang.org/issues/10674]

@simon3z ready for merge.

@simon3z
Copy link
Contributor

simon3z commented May 11, 2017

@cben let me know when this is ready.

@miq-bot
Copy link
Member

miq-bot commented May 11, 2017

This pull request is not mergeable. Please rebase and repush.

:http_proxy:
:kubernetes:
:host:
:password:
:port:
:user:
:container_scanning:
:scanning_job_timeout: 20.minutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cben really? 😮 I guess you needed a more careful rebase 😊

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoa thanks fixed.
need to revise my mergetool config, meld without --auto-merge is error prone...

@miq-bot
Copy link
Member

miq-bot commented May 15, 2017

Checked commit cben@e74b251 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
2 files checked, 0 offenses detected
Everything looks fine. ⭐

@simon3z simon3z merged commit 1ee90b5 into ManageIQ:master May 15, 2017
cben added a commit to cben/manageiq that referenced this pull request May 15, 2017
ManageIQ/manageiq-providers-kubernetes#10 from cben/kubeclient-timeout

kubernetes_connect: add timeout settings
(cherry picked from merge commit ManageIQ/manageiq-providers-kubernetes@1ee90b5)

openshift_connect: use kubernetes timeout settings
(cherry picked from unmerged
ManageIQ/manageiq-providers-openshift#8 - unnecessary on master but required in backports)

Requires kubeclient >= 2.4.0
cben added a commit to cben/manageiq that referenced this pull request May 15, 2017
  bump kubeclient ~> 2.4.0
  (ported from manageiq-gems-pending.gemspec to gems/pending/Gemfile)

- Merge ManageIQ/manageiq-providers-kubernetes#10
  kubernetes_connect: add timeout settings
  (cherry picked from merge commit ManageIQ/manageiq-providers-kubernetes@1ee90b5)

- openshift_connect: use kubernetes timeout settings
  (cherry picked from unmerged ManageIQ/manageiq-providers-openshift#8 -
  unnecessary on master but required in backports)
@simaishi
Copy link

@cben Marking as euwe/conflict for now. Please remove the conflict label when you have Euwe PR. Thanks!

@cben
Copy link
Contributor Author

cben commented May 22, 2017

@miq-bot remove-label euwe/conflict
ManageIQ/manageiq#15188

@simaishi
Copy link

Backported to Euwe via ManageIQ/manageiq#15188

@simaishi
Copy link

simaishi commented Jun 8, 2017

Backported to Fine via ManageIQ/manageiq#15090

@moolitayer moolitayer added this to the Sprint 61 Ending May 22, 2017 milestone Aug 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants