fix: default registry dial timeout may not be enough for some clusters #1270

denis-tingaikin

Signed-off-by: Denis Tingaikin <denis.tingajkin@xored.com>

Description

Reported by @damiankopyto in the NSM Slack.

We hit this on a real production cluster:

Apr 20 19:40:35.817 [TRAC] [type:registry] (5.1)       find={"network_service":{"name":"nse-composition","payload":"ETHERNET"}}
Apr 20 19:40:35.992 [ERRO] [type:registry] (5.2)       can not dial to registry:5002, err failed to dial registry:5002: context deadline exceeded. Deleting clientconn...
Apr 20 19:40:35.994 [ERRO] [type:registry] (5.3)       failed to dial registry:5002: context deadline exceeded
Apr 20 19:40:35.994 [ERRO] [type:registry] (4.2)      failed to dial registry:5002: context deadline exceeded
Apr 20 19:40:35.994 [ERRO] [type:registry] (3.2)     failed to dial registry:5002: context deadline exceeded
Apr 20 19:40:35.994 [ERRO] [type:registry] (2.3)    failed to dial registry:5002: context deadline exceeded
Apr 20 19:40:35.994 [ERRO] [type:registry] (1.2)   failed to dial registry:5002: context deadline exceeded

We also tried removing the CPU limits for nsmgr, but that did not help @damiankopyto.

Issue link

How Has This Been Tested?

  • Added unit testing to cover
  • Tested manually
  • Tested by integration testing
  • Have not tested

Types of changes

  • Bug fix
  • New functionality
  • Documentation
  • Refactoring
  • CI

Signed-off-by: Denis Tingaikin <denis.tingajkin@xored.com>
@edwarnicke edwarnicke merged commit 235f8a1 into networkservicemesh:main Apr 26, 2022
nsmbot pushed a commit to networkservicemesh/cmd-map-ip-k8s that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-nsmgr that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-registry-proxy-dns that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/sdk-kernel that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-ipam-vl3 that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/sdk-k8s that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-registry-memory that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-nse-vfio that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-nse-remote-vlan that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-admission-webhook-k8s that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-nsc-init that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit to networkservicemesh/cmd-nsmgr-proxy that referenced this pull request Apr 26, 2022
…k@main

PR link: networkservicemesh/sdk#1270

Commit: 235f8a1
Author: Denis Tingaikin
Date: 2022-04-27 02:53:54 +0300
Message:
  - fix: default registry dial timeout is not enough for production clusters (#1270)
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
denis-tingaikin added a commit that referenced this pull request Apr 29, 2022
…ers (#1270)

Signed-off-by: Denis Tingaikin <denis.tingajkin@xored.com>