Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on multi-source apps v1.8.0 #297

Closed
audrey-mux opened this issue Oct 21, 2024 · 11 comments
Closed

Panic on multi-source apps v1.8.0 #297

audrey-mux opened this issue Oct 21, 2024 · 11 comments
Labels
bug Something isn't working

Comments

@audrey-mux
Copy link

Ran across a panic when trying to process a multi-source app. We just started using those and this is the second one so far, the first one just produced a warning w/o the panic

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x2565b90]

goroutine 850 [running]:
github.com/zapier/kubechecks/pkg/events.(*worker).processApp(_, {_, _}, {{{0x398691d, 0xb}, {0x39a788d, 0x14}}, {{0xc004596dc8, 0x18}, {0x0, ...}, ...}, ...})
	github.com/zapier/kubechecks//pkg/events/worker.go:59 +0x90
github.com/zapier/kubechecks/pkg/events.(*worker).run(0xc002943808, {0x43c93d8, 0xc004761830})
	github.com/zapier/kubechecks/pkg/events/worker.go:39 +0x170
created by github.com/zapier/kubechecks/pkg/events.(*CheckEvent).Process in goroutine 785
	github.com/zapier/kubechecks/pkg/events/check.go:284 +0x5d5
@djeebus
Copy link
Collaborator

djeebus commented Oct 21, 2024

Looks like there are some core chunks of the app that are not aware of multiple sources. Might not be too difficult to take into account. 🤞

@djeebus djeebus added the bug Something isn't working label Oct 21, 2024
@djeebus
Copy link
Collaborator

djeebus commented Oct 22, 2024

For some reason I remember this was implemented, at least partially, but that doesn't seem to be true. OK, I built on the work that @sl1pm4t did and got a working branch. I don't have any multi source applications to test on yet, but if you want a sneak preview, you can use the image at ghcr.io/zapier/kubechecks:0.0.0-pr298, and report back if it works as intended. Thanks for the bug report!

@audrey-mux
Copy link
Author

i'll give it a shot

@audrey-mux
Copy link
Author

I tried that build out in my test instance with a new multi-source app - no crash, but i'm seeing permissions errors ( even with a rollback to 1.8.0 ), so i think i can tentatively say its working. but i want to figure out this permission issue before i'd say totally fixed

@audrey-mux
Copy link
Author

audrey-mux commented Oct 23, 2024

Permission issue fixed

given that this is multi-source the helm chart itself is pulling from a remote repo (generally, we've pull them into our monorepo and haven't had an issue), it looks like Kubechecks isn't able to pull this from this remote helm repo

Screenshot 2024-10-23 at 10 28 34

5:19PM INF cloning repo branchName=1.4.3 event_id=50 repo=repo-name-redacted
5:19PM INF cloning git repo branch=1.4.3 clone-url=https://audrey-mux@grafana.github.io/helm-charts temp-dir=/tmp/kubechecks-repo-489429976
5:19PM DBG building command args=["clone","https://audrey-mux@grafana.github.io/helm-charts","/tmp/kubechecks-repo-489429976","--branch","1.4.3"]
5:19PM DBG getting cluster clusterName= clusterServer=https://kubernetes.default.svc
5:19PM DBG generating diff for application... name=cluster-bootstrap-apps
5:19PM DBG for appname cluster-bootstrap-apps, server dest says: https://kubernetes.default.svc and name dest says: 
5:19PM ERR unable to clone repository, Cloning into '/tmp/kubechecks-repo-489429976'...
fatal: repository 'https://grafana.github.io/helm-charts/' not found
 error="exit status 128"
5:19PM ERR Unable to clone repository error="failed to clone repo: failed to clone repository: exit status 128" app_name=<redacted>-beyla app_path= event_id=50 repo=repo-name-redacted workerID=15

and the appset sources block:

      sources:
        - repoURL: https://grafana.github.io/helm-charts
          chart: beyla
          targetRevision: 1.4.3
          helm:
            releaseName: beyla
            valueFiles:
            - $values/monitoring/grafana-cloud/beyla/values.yaml
        - repoURL: https://github.com/org/repo.git
          targetRevision: master
          ref: values

it looks like it think this is a git repo vs a helm repo

@audrey-mux
Copy link
Author

I also tried moving the chart to a git repo instead with a different outcome

      sources:
        - repoURL: https://github.com/org/helmchart.git
          path: tools/helm/beyla
          targetRevision: main
          helm:
            releaseName: beyla
            valueFiles:
            - $values/monitoring/grafana-cloud/beyla/values.yaml
        - repoURL: https://github.com/org/valuesfile.git
          targetRevision: master
          ref: values

I had to drop the values file in the helmchart repo otherwise kubechecks couldn't find it -

from what it looks like in the logs, the first repo (helmchart) is checked out and diffed, and then the second repo (valuesfile) is checked out but fails to get the helm chart since it doesn't exist at that path.

@djeebus
Copy link
Collaborator

djeebus commented Nov 1, 2024

Ah, yup, that makes sense. I took a quick stab at implementing it, but found that I didn't really understand what I was working in. I'm taking a step or two back so I can learn exactly what multisource apps are and all the different ways you can use and abuse them.

@djeebus
Copy link
Collaborator

djeebus commented Nov 8, 2024

OK, made a new strong attempt. want to give the new image a shot? It's still ghcr.io/zapier/kubechecks:0.0.0-pr298, but I'm feeling a lot more confident about the direction. One note: the image requires that it be run in the same cluster as argocd and have access to the argocd-repo-server service.

@audrey-mux
Copy link
Author

Awesome, i'll give it a go.

@audrey-mux
Copy link
Author

awww yeah, looks like that works. thanks for the heads up about the repo-server connection, had to make adjustments to the default argocd network policies but knowing that connection was required saved me a lot of time.

@djeebus
Copy link
Collaborator

djeebus commented Nov 10, 2024

Awesome! Yeah, gonna be fun to try and document this one. I think it leads to more consistent behavior, but it's also tightly couples the app to an internal service. Not sure how I feel about this, but the alternative is forking and copying code, which isn't pretty either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants