Migrate kettle to k8s-infra #787

Closed
1 task done
spiffxp opened this issue Apr 23, 2020 · 33 comments
Labels
area/infra Infrastructure management, infrastructure design, code in infra/ lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@spiffxp
Member

spiffxp commented Apr 23, 2020

Part of migrating away from gcp-project k8s-gubernator: #1308

My suggestions for target:

  • project: kubernetes-public
  • cluster: aaa
  • namespace: kettle

/wg k8s-infra
/area cluster-infra
/sig testing

@k8s-ci-robot k8s-ci-robot added wg/k8s-infra area/infra Infrastructure management, infrastructure design, code in infra/ sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Apr 23, 2020
@spiffxp spiffxp added this to Needs Triage in sig-k8s-infra via automation Apr 28, 2020
@spiffxp spiffxp moved this from Needs Triage to Backlog in sig-k8s-infra Apr 28, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 22, 2020
@spiffxp
Member Author

spiffxp commented Jul 30, 2020

/remove-lifecycle stale
Kettle is still running there

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 30, 2020
@spiffxp
Member Author

spiffxp commented Jul 30, 2020

Migrating kettle most likely looks something like

@spiffxp
Member Author

spiffxp commented Jul 30, 2020

FYI @MushuEE
given that you've been modifying kettle lately, if you happen to see things that could help inform a plan for this, drop 'em here

@MushuEE
Contributor

MushuEE commented Jul 31, 2020

When you say

migrate the bigquery database kettle writes to

is that to a new project? What are the target project and target cluster?

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 29, 2020
@spiffxp
Member Author

spiffxp commented Nov 3, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 3, 2020
@spiffxp spiffxp moved this from Backlog to Backlog (infra to migrate) in sig-k8s-infra Jan 20, 2021
@spiffxp spiffxp added this to the v1.21 milestone Jan 21, 2021
@spiffxp
Member Author

spiffxp commented Jan 21, 2021

/assign @MushuEE @spiffxp
to investigate possible approaches

@spiffxp spiffxp added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jan 22, 2021
@spiffxp spiffxp added this to To Triage in sig-testing issues Feb 9, 2021
ameukam added a commit to ameukam/k8s.io that referenced this issue Mar 31, 2021
Ref: kubernetes#787

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@ameukam ameukam changed the title from "Migrate kettle to wg-k8s-infra" to "Migrate kettle to k8s-infra" Jan 28, 2024
@BenTheElder BenTheElder added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Apr 1, 2024
@BenTheElder
Member

I don't have much access to the k8s-gubernator project [1] currently, so it's a bit difficult to do much to help here.
I'm going to ask around about access.

The deployment details are more or less in the repo at least https://github.com/kubernetes/test-infra/blob/master/kettle/.

Footnotes

  1. https://github.com/kubernetes/test-infra/blob/977565d850778323a56f000343d27885920d32a1/kettle/Makefile#L23

@BenTheElder
Member

BenTheElder commented Apr 1, 2024

@ixdy still works at Google and still had access, despite long since not working in cloud anymore ... @liggitt and I now have owner access to the k8s-gubernator project for continuity until we can migrate it. Thanks Jeff!

This still needs to happen before the prow default cluster shutdown in August, and sooner is better.

@BenTheElder
Member

So we still have one "g8r" cluster on 1.26.11-gke.1055000 with 3 node pools, "pool-1" (e2-highmem-16, 1 node), "pool-highmem" (n1-highmem-8, 2 nodes), "pool-large" (n1-standard-8, 0 nodes).

It is running "kettle" and "kettle-staging" deployments with one pod each.

Each of those has a PD-SSD, 3001 and 201 GB respectively.

There are some bigquery datasets in this project, build/all is 1.67 TB.

@BenTheElder
Member

Given that this data is initially ingested from the prow GCS logs, I think we should probably look at cold-starting a new instance running in AAA, just overriding the cluster/project and deploying with the existing tooling.

There's a lot left to be desired around auto deployment etc., however.

@BenTheElder
Member

I think @dims has this working. One remaining item: when we're confident this is done, let Googlers know and we'll see about turning down the old instance / GCP project ... (FYI @michelle192837 @cjwagner)

@dims
Member

dims commented Apr 19, 2024

@BenTheElder i want to watch it for a week before we can call it done!

@michelle192837
Contributor

Exciting stuff! :D Thanks y'all!

@BenTheElder
Member

[I scaled the old cluster down to zero this week, we'll check back next week]

@dims
Member

dims commented Apr 30, 2024

thanks @BenTheElder

@dims
Member

dims commented May 2, 2024

https://storage.googleapis.com/k8s-triage/index.html is being updated.

and the flakes json looks good as well

❯ gsutil ls -l gs://k8s-metrics
       114  2024-05-02T00:05:31Z  gs://k8s-metrics/build-stats-latest.json
     10040  2024-05-02T00:04:51Z  gs://k8s-metrics/failures-latest.json
    103224  2024-05-02T00:04:20Z  gs://k8s-metrics/flakes-daily-latest.json
    204024  2024-05-02T00:05:48Z  gs://k8s-metrics/flakes-latest.json
         5  2024-05-02T00:04:09Z  gs://k8s-metrics/job-flakes-latest.json
    376585  2024-05-02T00:05:08Z  gs://k8s-metrics/job-health-latest.json
         3  2024-05-02T00:05:20Z  gs://k8s-metrics/pr-consistency-latest.json
     83496  2024-05-02T00:04:36Z  gs://k8s-metrics/presubmit-health-latest.json
         3  2024-05-02T00:06:01Z  gs://k8s-metrics/weekly-consistency-latest.json
                                 gs://k8s-metrics/build-stats/
                                 gs://k8s-metrics/failures/
                                 gs://k8s-metrics/flakes-daily/
                                 gs://k8s-metrics/flakes/
                                 gs://k8s-metrics/istio-job-flakes/
                                 gs://k8s-metrics/job-flakes/
                                 gs://k8s-metrics/job-health/
                                 gs://k8s-metrics/pr-consistency/
                                 gs://k8s-metrics/presubmit-health/
                                 gs://k8s-metrics/weekly-consistency/
TOTAL: 9 objects, 777494 bytes (759.27 KiB)
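
For anyone repeating this check during the watch period, here is a minimal sketch that parses `gsutil ls -l` output like the listing above and flags any `*-latest.json` object that has not been rewritten recently. The function names and the 24-hour threshold are assumptions for illustration; nothing in this thread specifies how often the metrics should refresh.

```python
from datetime import datetime, timedelta, timezone

# Object rows copied from the listing above; prefix rows and the TOTAL line
# are included to show that the parser skips them.
SAMPLE = """\
       114  2024-05-02T00:05:31Z  gs://k8s-metrics/build-stats-latest.json
     10040  2024-05-02T00:04:51Z  gs://k8s-metrics/failures-latest.json
    103224  2024-05-02T00:04:20Z  gs://k8s-metrics/flakes-daily-latest.json
    204024  2024-05-02T00:05:48Z  gs://k8s-metrics/flakes-latest.json
         5  2024-05-02T00:04:09Z  gs://k8s-metrics/job-flakes-latest.json
    376585  2024-05-02T00:05:08Z  gs://k8s-metrics/job-health-latest.json
         3  2024-05-02T00:05:20Z  gs://k8s-metrics/pr-consistency-latest.json
     83496  2024-05-02T00:04:36Z  gs://k8s-metrics/presubmit-health-latest.json
         3  2024-05-02T00:06:01Z  gs://k8s-metrics/weekly-consistency-latest.json
                                 gs://k8s-metrics/build-stats/
                                 gs://k8s-metrics/failures/
TOTAL: 9 objects, 777494 bytes (759.27 KiB)
"""


def parse_listing(text):
    """Return (size, updated, url) tuples for the object rows of `gsutil ls -l` output."""
    objects = []
    for line in text.splitlines():
        parts = line.split()
        # Object rows have exactly three fields: size, UTC timestamp, gs:// URL.
        if len(parts) != 3 or not parts[0].isdigit() or not parts[2].startswith("gs://"):
            continue
        updated = datetime.strptime(parts[1], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        objects.append((int(parts[0]), updated, parts[2]))
    return objects


def stale_objects(objects, now, max_age=timedelta(hours=24)):
    """Return URLs whose last update is older than max_age (threshold is an assumption)."""
    return [url for _, updated, url in objects if now - updated > max_age]


if __name__ == "__main__":
    for url in stale_objects(parse_listing(SAMPLE), datetime.now(timezone.utc)):
        print("stale:", url)
```

In practice the listing text would come from running `gsutil ls -l gs://k8s-metrics` and passing its standard output to `parse_listing`; the sample string is only there so the sketch runs on its own.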

We can turn down the old cluster early next week @BenTheElder

@BenTheElder
Member

SGTM. At some point I'd like to turn down the bigquery datasets and anything else lingering in that project as well.

@BenTheElder
Member

/assign
Will plan to turn down and delete everything in the old project this week.

@ameukam
Member

ameukam commented May 7, 2024

/assign Will plan to turn down and delete everything in the old project this week.

@BenTheElder also update #1308 and close it ? 🥺

sig-k8s-infra automation moved this from In Progress to Done May 14, 2024
sig-testing issues automation moved this from In Progress to Done May 14, 2024
@BenTheElder
Member

remaining follow up will be tracked in #1308


9 participants