Flux Source Controller Fails to List Remotes #1137

Open
1 task done
devopstagon opened this issue Jun 20, 2023 · 2 comments

Comments

@devopstagon

Describe the bug

Source controller intermittently fails to list revisions from the remote (GitLab in this case), leading to these errors:

{"level":"error","ts":"2023-06-20T12:09:39.735Z","msg":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer","controller":"gitrepository","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"GitRepository","GitRepository":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"e258ec4f-35e2-48e5-9af2-f7715f7c4cb4","error":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer"}
{"level":"error","ts":"2023-06-20T12:09:39.766Z","msg":"Reconciler error","controller":"gitrepository","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"GitRepository","GitRepository":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"e258ec4f-35e2-48e5-9af2-f7715f7c4cb4","error":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer"}

The endpoint it calls is up, and we can see no connection issues during this period. We suspect a bug in Go's net/http, based on this ticket: golang/go#51323

Steps to reproduce

  1. Add a source
  2. Check the logs and observe the intermittent failures

Expected behavior

Source controller should work around the bug by handling this error, for example with retries, instead of failing.

Screenshots and recordings

No response

OS / Distro

Kubernetes 1.24.x

Flux version

v0.38.3

Flux check

► checking prerequisites
✗ flux 0.38.3 <2.0.0-rc.5 (new version is available, please upgrade)
✔ Kubernetes 1.24.12-gke.500 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.34.1
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.34.1
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.28.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.0.0-rc.4
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.0.0-rc.4
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.0.0-rc.5
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

GitLab

Container Registry provider

Harbor

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@makkes
Member

makkes commented Jun 20, 2023

According to this comment, the internal error message you're seeing is coming from the server, so it is most likely to be an upstream issue.

@stefanprodan stefanprodan transferred this issue from fluxcd/flux2 Jun 29, 2023
@savisaar2

@devopstagon Did you manage to solve this issue? I have started seeing this error on my cluster, coming from source-controller. I'm unsure why it's having a problem.
