fix(discovery): plugin registration bugfixes #1650

Merged
andrewazores merged 10 commits into cryostatio:main from the gh1633 branch on Sep 13, 2023

Conversation

andrewazores
Member

@andrewazores andrewazores commented Sep 1, 2023

Welcome to Cryostat! 👋

Before contributing, make sure you have:

  • Read the contributing guidelines
  • Linked a relevant issue which this PR resolves
  • Linked any other relevant issues, PRs, or documentation, if any
  • Resolved all conflicts, if any
  • Rebased your branch PR on top of the latest upstream main branch
  • Attached at least one of the following labels to the PR: [chore, ci, docs, feat, fix, test]
  • Signed all commits using a GPG signature

To recreate commits with a GPG signature, run: git fetch upstream && git rebase --force --gpg-sign upstream/main


Fixes: https://github.com/cryostatio/cryostat/issues/1633
See also cryostatio/cryostat-agent#193
Based on #1636

Description of the change:

Improves error handling and cleanup when plugin registrations fail and are associated with stored credentials.

Motivation for the change:

Together with cryostatio/cryostat-agent#193, this increases the resiliency of the server/agent registration system. Temporary networking failures, registration conflicts, or other bugs are now less likely to leave the server and agent in a state where neither recognizes the other, yet neither is able to clean up and reset the registration status.

How to manually test:

  1. Run CRYOSTAT_IMAGE=quay.io... sh smoketest.sh...
  2. Check that agent instances properly register & publish, and are visible in the Topology view
  3. Use podman kill on some agent instances to prevent a clean shutdown
  4. Use podman run (see smoketest.sh for the exact invocation) to restart some of the killed agent instances
  5. Stop the smoketest, then restart it without clearing databases, and ensure that the agent instances are able to re-register. It may take a couple of minutes for the server to recognize that the old state is stale, clear it, and allow agents to register again.
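
For readers following the test steps, the behaviour being exercised (stale registrations are pruned, their stored credentials are deleted, and the agent can then register again) can be illustrated roughly as below. This is only a minimal sketch under stated assumptions, not the code in this PR: PluginRegistrySketch, Plugin, pruneStale, the conflict-on-same-callback rule, and the example URL are all hypothetical names and behaviours chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch only: these names are illustrative, not Cryostat's actual classes.
final class PluginRegistrySketch {

    /** A registered discovery plugin (for example, an agent) and its stored credential, if any. */
    static final class Plugin {
        final String id;
        final String callbackUrl;
        final String credentialId; // may be null when no credential was stored

        Plugin(String id, String callbackUrl, String credentialId) {
            this.id = id;
            this.callbackUrl = callbackUrl;
            this.credentialId = credentialId;
        }
    }

    private final Map<String, Plugin> pluginsById = new LinkedHashMap<>();
    private final Set<String> storedCredentials = new HashSet<>();

    /** Registers a plugin. Assumption: a second registration for the same callback URL is a conflict. */
    void register(Plugin plugin) {
        for (Plugin existing : pluginsById.values()) {
            if (existing.callbackUrl.equals(plugin.callbackUrl)) {
                throw new IllegalStateException("conflict: " + plugin.callbackUrl);
            }
        }
        pluginsById.put(plugin.id, plugin);
        if (plugin.credentialId != null) {
            storedCredentials.add(plugin.credentialId);
        }
    }

    /** Removes a plugin and deletes the credential stored for it, so nothing is orphaned. */
    boolean deregister(String pluginId) {
        Plugin removed = pluginsById.remove(pluginId);
        if (removed == null) {
            return false;
        }
        if (removed.credentialId != null) {
            storedCredentials.remove(removed.credentialId);
        }
        return true;
    }

    /** Pings every plugin callback and deregisters those that fail; returns the pruned ids. */
    List<String> pruneStale(Predicate<String> callbackIsReachable) {
        List<String> pruned = new ArrayList<>();
        // Iterate over a copy because deregister() mutates the underlying map.
        for (Plugin plugin : new ArrayList<>(pluginsById.values())) {
            if (!callbackIsReachable.test(plugin.callbackUrl)) {
                deregister(plugin.id);
                pruned.add(plugin.id);
            }
        }
        return pruned;
    }

    boolean hasCredential(String credentialId) {
        return storedCredentials.contains(credentialId);
    }

    int pluginCount() {
        return pluginsById.size();
    }

    public static void main(String[] args) {
        PluginRegistrySketch registry = new PluginRegistrySketch();
        // The URL is a made-up placeholder, not a real agent address.
        registry.register(new Plugin("agent-1", "http://agent.example:9977/", "cred-1"));
        // The agent was killed, so its callback no longer answers the ping.
        List<String> pruned = registry.pruneStale(url -> false);
        System.out.println("pruned=" + pruned + " credentialKept=" + registry.hasCredential("cred-1"));
        // With the stale state cleared, the restarted agent can register again.
        registry.register(new Plugin("agent-2", "http://agent.example:9977/", "cred-2"));
        System.out.println("plugins=" + registry.pluginCount());
    }
}
```

In this sketch the credential cleanup lives in deregister, so both an explicit deregistration and a prune after a failed callback ping go through the same path. That mirrors the intent stated in the description, but the real server may structure it differently.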

@github-actions
Contributor

github-actions bot commented Sep 1, 2023

Hi @andrewazores! Add at least one of the required labels to this PR

Required labels are: chore, ci, cleanup, docs, feat, fix, perf, refactor, style, test

@github-actions github-actions bot added the needs-triage Needs thorough attention from code reviewers label Sep 1, 2023
@mergify mergify bot added the safe-to-test label Sep 1, 2023
@andrewazores
Member Author

/request_review

@github-actions
Contributor

github-actions bot commented Sep 7, 2023

This PR/issue depends on:

@andrewazores
Member Author

/build_test

@github-actions
Contributor

github-actions bot commented Sep 7, 2023

ARCH IMAGE
amd64 ghcr.io/cryostatio/cryostat:pr-1650-b15566908b022e3cd352af56e64550f2f3b748a6-linux-amd64
arm64 ghcr.io/cryostatio/cryostat:pr-1650-b15566908b022e3cd352af56e64550f2f3b748a6-linux-arm64

To run smoketest:

# amd64          
CRYOSTAT_IMAGE=ghcr.io/cryostatio/cryostat:pr-1650-b15566908b022e3cd352af56e64550f2f3b748a6-linux-amd64 sh smoketest.sh

# or arm64
CRYOSTAT_IMAGE=ghcr.io/cryostatio/cryostat:pr-1650-b15566908b022e3cd352af56e64550f2f3b748a6-linux-arm64 sh smoketest.sh

smoketest.sh (outdated review thread, resolved)
Member

@tthvo tthvo left a comment

Comments for testing Mergify config :))

@mergify mergify bot removed the review-requested label Sep 7, 2023
@aali309
Contributor

aali309 commented Sep 7, 2023

Comments for testing Mergify config :))

Looks like this worked as expected here.

@andrewazores andrewazores merged commit 0197f7b into cryostatio:main Sep 13, 2023
8 checks passed
@andrewazores andrewazores deleted the gh1633 branch September 13, 2023 13:53
andrewazores added a commit that referenced this pull request Sep 13, 2023
* fix(discovery): delete plugin stored credentials automatically on deregistration/stale prune

* delete any stored credentials on plugin callback ping failure

* bump minimum event loop pool size

* use more specific response code for duplicate matchexpression
Development

Successfully merging this pull request may close these issues.

[Bug] Agent discovery desync
4 participants