ROX-18431: restore deleted tenant through API #1207

Merged
johannes94 merged 22 commits into main from jmalsam/ROX-18431-restore-deleted-tenant-api
Sep 12, 2023

@johannes94 johannes94 commented Aug 17, 2023

Description

This PR adds an endpoint to the fleetmanager admin API that restores a deleted Central.

The restore operation resets the database values of a deleted Central that are filled in during the provisioning and deletion state machine flows. It then sets the Central back to provisioning and lets fleetshard-sync recreate the tenant. Additional logic in fleetshard-sync searches for RDS snapshots associated with cluster-id and restores the DB from the final snapshot if one exists.

The restore operation is currently limited: it only restores the database to a state where the DB password and master password still need to be restored. To get to a working tenant, follow: https://gitlab.cee.redhat.com/stackrox/acs-cloud-service/runbooks/-/blob/3c22117db0652ff52d54179038f625164d352f15/sops/dp-024-lost-tenant-db-password.md#L1

This PR will fix that: #1197

Checklist (Definition of Done)

  • Unit and integration tests added
  • Added test description under Test manual
  • [ ] Documentation added if necessary (i.e. changes to dev setup, test execution, ...) Docs will be updated in SOP repo.
  • CI and all relevant tests are passing
  • Add the ticket number to the PR title if available, i.e. ROX-12345: ...
  • Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.
  • Add secret to app-interface Vault or Secrets Manager if necessary
  • RDS changes were e2e tested manually
  • Check AWS limits are reasonable for changes provisioning new resources

Test manual

Tested on an OSD cluster with all integrations (Route53, AWS RDS Database, RH SSO)

# Populate the RH SSO dynamic client secrets; you can get them from the app-interface Vault of stage:
# the redhatsso-service.clientId and redhatsso-service.clientSecret files
# Populate the Route53 secrets for dev; you can get them from the OSCI Vault:
# aws.route53accesskey and aws.route53secretaccesskey
# Start fleetmanager with external DNS and your cluster config

make binary
make db/teardown db/setup db/migrate

./fleet-manager serve --enable-central-external-certificate --dataplane-cluster-config-file ./dev/config/dataplane-cluster-configuration-infractl-osd.yaml --force-leader --central-idp-client-id ""

# Start fleetshard-sync in another terminal
# Prepare environment
export CLUSTER_NAME=local_cluster
export MANAGED_DB_ENABLED=true
export AWS_AUTH_HELPER=aws-saml
export CREATE_AUTH_PROVIDER=true

# Start fleetshard-sync
./dev/env/scripts/exec_fleetshard_sync.sh

# In another terminal, create a central instance
./scripts/create-central.sh

# Wait for central to be provisioned
# The setup above includes all integrations (Route53, RH SSO, RDS), so it will take a while
# Once central is up, log in via web browser
# Create an API token in the integration tab of central
# We'll use this token later to verify that the DB restore was successful, so store it in a file
echo "$token" > token.txt

# Now delete the central tenant and wait for its deprovisioning flow to complete
export central_id=<central-id>
export OCM_TOKEN=$(ocm token)
./scripts/fmcurl "rhacs/v1/centrals/$central_id?async=true" -XDELETE -v

# While waiting, get an admin token
rhoas login --auth-url=https://auth.redhat.com/auth/realms/EmployeeIDP 
export OCM_TOKEN=$(rhoas authtoken)

# Once the deletion has gone through, restore the central through the admin API
./scripts/fmcurl "rhacs/v1/admin/centrals/$central_id/restore" -XPOST -v

# Monitor the restore operation: central will start in preparing and should go to ready through the FM<>FS interaction.
# Once it's ready, open the UI again and log in to check that it's working
# Then, as a final test, use the API token
export ROX_API_TOKEN=$(cat token.txt)
export ROX_ENDPOINT=acs-<central-id>.rhacs-dev.com:443
roxctl central whoami


# To run tests locally run:
make db/teardown db/setup db/migrate
make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
make verify lint binary test test/integration
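
The "monitor restore operation" step in the test manual above is done by hand. A minimal sketch of how it could be scripted is shown below. The `wait_for_status` and `central_status` helpers are hypothetical and not part of this PR. `central_status` additionally assumes that `./scripts/fmcurl` performs a GET by default and prints the response body to stdout, that the response JSON has a top-level `status` field, and that `jq` is installed; none of this is confirmed by the PR.

```shell
# Poll a status command until it prints the wanted status or attempts run out.
# Usage: wait_for_status <wanted-status> <max-attempts> <sleep-seconds> <command> [args...]
# Returns 0 once the wanted status is seen, 1 if it never shows up.
wait_for_status() {
  wfs_wanted=$1
  wfs_attempts=$2
  wfs_interval=$3
  shift 3
  wfs_i=1
  while [ "$wfs_i" -le "$wfs_attempts" ]; do
    # The remaining arguments are the command that prints the current status.
    # A failing command is treated as "status unknown" and simply retried.
    wfs_status=$("$@" 2>/dev/null) || wfs_status=""
    echo "attempt $wfs_i/$wfs_attempts: status=${wfs_status:-unknown}" >&2
    if [ "$wfs_status" = "$wfs_wanted" ]; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$wfs_i" -lt "$wfs_attempts" ]; then
      sleep "$wfs_interval"
    fi
    wfs_i=$((wfs_i + 1))
  done
  return 1
}

# Hypothetical status command (assumptions: GET endpoint, JSON "status" field, jq available).
central_status() {
  ./scripts/fmcurl "rhacs/v1/centrals/$central_id" | jq -r '.status'
}
```

With `central_id` and `OCM_TOKEN` exported as in the steps above, `wait_for_status ready 60 10 central_status` would poll every 10 seconds for up to 60 attempts and return non-zero if the central never reports `ready`. Passing the status command as an argument keeps the loop independent of the exact endpoint and response shape.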

openshift-ci bot commented Aug 17, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@johannes94 johannes94 marked this pull request as ready for review August 29, 2023 13:42
@johannes94 johannes94 force-pushed the jmalsam/ROX-18431-restore-deleted-tenant-api branch from de505af to 9b6968e Compare September 5, 2023 06:41
@johannes94 johannes94 force-pushed the jmalsam/ROX-18431-restore-deleted-tenant-api branch from 2e2d8bb to b3a23c7 Compare September 12, 2023 07:04
@johannes94 johannes94 merged commit c6a2ebc into main Sep 12, 2023
8 checks passed
@johannes94 johannes94 deleted the jmalsam/ROX-18431-restore-deleted-tenant-api branch September 12, 2023 07:44