Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repo-server has silent unmarshalling errors leading to empty applications #4423

Closed
jessesuen opened this issue Sep 24, 2020 · 3 comments
Closed
Assignees
Labels
bug/priority:high Should be fixed in the next patch release bug Something isn't working
Milestone

Comments

@jessesuen
Copy link
Member

jessesuen commented Sep 24, 2020

Describe the bug

We noticed a situation where when retrieving cached manifests from redis, the cache silently returned an empty datastructure of cached manifests, despite there being content in the redis key. This made it appear that the manifests at the commit SHA had zero manifests, and Argo CD wanted to prune everything.

When investigating the issue, the problem was traced to the github.com/vmihailenco/msgpack library. When connecting directly to redis, it was seen that there was partial text, so there was indeed content in the database for the key. However, when msgpack deserialize it, it returned an empty struct without returning an error, which was interpreted by the repo server as having no manifests.

To Reproduce

This is not easy to reproduce.

Expected behavior

I think we may need to add some corruption verification, or at least a basic struct check that validates what we retrieve from redis is valid. For example, it could be a checksum against the manifests.

Version

v1.7

@jessesuen jessesuen added bug Something isn't working bug/priority:high Should be fixed in the next patch release labels Sep 24, 2020
@jessesuen jessesuen added this to the v1.8 milestone Sep 24, 2020
@jessesuen
Copy link
Member Author

For example, it could be a checksum against the manifests.

@alexmt and I agree we should introduce a new field in the cached manifest datastructure which contains a checksum of the rendered JSON manifests. This checksum would be verified whenever we retrieve manifests it from cache.

/cc @jgwest this might be a good one for you

@jgwest
Copy link
Member

jgwest commented Sep 29, 2020

@jessesuen +1, feel free to assign to me.

jgwest added a commit to jgwest/argo-cd that referenced this issue Oct 29, 2020
jgwest added a commit to jgwest/argo-cd that referenced this issue Oct 29, 2020
alexmt pushed a commit that referenced this issue Nov 2, 2020
…lications (#4423) (#4708)

* fix: Repo-server has silent unmarshalling errors leading to empty applications (#4423)
terrycorley pushed a commit to terrycorley/argo-cd that referenced this issue Nov 3, 2020
…zation-generators

* 'master' of github.com:argoproj/argo-cd:
  fix: RevisionFormField component crashes in 'refs' API returns no tags (argoproj#4735)
  docs: add Opensurvey to USERS.md (argoproj#4727)
  docs: correct parameters usage in CLI (argoproj#4725)
  fix: Repo-server has silent unmarshalling errors leading to empty applications (argoproj#4423) (argoproj#4708)
  fix: inject artificial delay between sync waves to better support health assessments (argoproj#4715)
  fix: exclude files listed under exclusions (argoproj#4686)
  feat: support resource actions on CRDs that use status subresources (argoproj#4690)
  feat: Add autocomplete for repo Revisions (argoproj#4645) (argoproj#4713)
  fix: webhook don't refresh apps pointing to HEAD (argoproj#4717)
  feat: Add support for ExecProvider cluster auth (argoproj#4600) (argoproj#4710)
  fix: adding helm values file in New App (argoproj#4635)
  docs: Instructions on `make verify-kube-connect` step when using k3d (argoproj#4687)
  feat:  Annotation based app paths detection in webhooks (argoproj#4699)
  fix: adding commonAnnotations for Kustomize (argoproj#4613)
@alexmt
Copy link
Collaborator

alexmt commented Nov 3, 2020

Implemented by #4708

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug/priority:high Should be fixed in the next patch release bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants