From 41b94046f8e4a842ee319c4234d177562bcd85fe Mon Sep 17 00:00:00 2001
From: soulchips
Date: Wed, 26 Jun 2024 12:02:21 -0400
Subject: [PATCH 1/4] adding note about runner GC on task pod

---
 jekyll/_cci2/container-runner.adoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/jekyll/_cci2/container-runner.adoc b/jekyll/_cci2/container-runner.adoc
index 16ac4557f89..7cea96fd852 100644
--- a/jekyll/_cci2/container-runner.adoc
+++ b/jekyll/_cci2/container-runner.adoc
@@ -390,6 +390,8 @@ NOTE: Cluster-wide permissions are used by container runner to autodetect the OS
 Each container runner has a garbage collector which will ensure any pods and secrets with the label `app.kubernetes.io/managed-by=circleci-container-agent` left dangling in the cluster are removed. By default this will remove all jobs older than five hours and five minutes. This can be shortened or lengthened via the `agent.kubeGCThreshold` parameter. However, if you do shorten the garbage collection (GC) frequency, also shorten the max task run time via the `agent.maxRunTime` parameter to be a value smaller than the new GC frequency. Otherwise a running task pod could be removed by the GC.

+In addition, GC may remove some objects sooner than this threshold. Task pods have a liveness probe which checks for a running task-agent process. Once a task completes or fails, the task-agent process will stop running and the liveness probe will fail which will trigger GC.
+
 Container runner will drain and restart cleanly when sent a termination signal. Container runner will not automatically attempt to launch a task that fails to start. This can be done in the CircleCI web app. If the container runner crashes, there is no expectation that in-process or queued tasks are handled gracefully.

From fc26092afbdc80dec0f1c732997dc6cb0c52ed72 Mon Sep 17 00:00:00 2001
From: Akil Aikman
Date: Thu, 27 Jun 2024 15:24:48 -0400
Subject: [PATCH 2/4] Update container-runner.adoc

---
 jekyll/_cci2/container-runner.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/jekyll/_cci2/container-runner.adoc b/jekyll/_cci2/container-runner.adoc
index 7cea96fd852..568de41dec3 100644
--- a/jekyll/_cci2/container-runner.adoc
+++ b/jekyll/_cci2/container-runner.adoc
@@ -388,7 +388,7 @@ NOTE: Cluster-wide permissions are used by container runner to autodetect the OS
 [#garbage-collection]
 == Garbage collection

-Each container runner has a garbage collector which will ensure any pods and secrets with the label `app.kubernetes.io/managed-by=circleci-container-agent` left dangling in the cluster are removed. By default this will remove all jobs older than five hours and five minutes. This can be shortened or lengthened via the `agent.kubeGCThreshold` parameter. However, if you do shorten the garbage collection (GC) frequency, also shorten the max task run time via the `agent.maxRunTime` parameter to be a value smaller than the new GC frequency. Otherwise a running task pod could be removed by the GC.
+Each container runner has a garbage collector which will ensure any pods and secrets with the label `app.kubernetes.io/managed-by=circleci-container-agent` left dangling in the cluster are removed. By default this will remove all jobs older than five hours and five minutes. This can be shortened or lengthened via the `agent.gc.threshold` parameter. However, if you do shorten the garbage collection (GC) frequency, also shorten the max task run time via the `agent.maxRunTime` parameter to be a value smaller than the new GC frequency. Otherwise a running task pod could be removed by the GC.

 In addition, GC may remove some objects sooner than this threshold. Task pods have a liveness probe which checks for a running task-agent process.
Once a task completes or fails, the task-agent process will stop running and the liveness probe will fail which will trigger GC.

From 5a1d97d384a6236f546934d04e96656ed926e782 Mon Sep 17 00:00:00 2001
From: Akil Aikman
Date: Fri, 5 Jul 2024 08:21:55 -0400
Subject: [PATCH 3/4] Update jekyll/_cci2/container-runner.adoc

Co-authored-by: Rosie Yohannan
---
 jekyll/_cci2/container-runner.adoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/jekyll/_cci2/container-runner.adoc b/jekyll/_cci2/container-runner.adoc
index 568de41dec3..654edf826ae 100644
--- a/jekyll/_cci2/container-runner.adoc
+++ b/jekyll/_cci2/container-runner.adoc
@@ -388,7 +388,9 @@ NOTE: Cluster-wide permissions are used by container runner to autodetect the OS
 [#garbage-collection]
 == Garbage collection

-Each container runner has a garbage collector which will ensure any pods and secrets with the label `app.kubernetes.io/managed-by=circleci-container-agent` left dangling in the cluster are removed. By default this will remove all jobs older than five hours and five minutes. This can be shortened or lengthened via the `agent.gc.threshold` parameter. However, if you do shorten the garbage collection (GC) frequency, also shorten the max task run time via the `agent.maxRunTime` parameter to be a value smaller than the new GC frequency. Otherwise a running task pod could be removed by the GC.
+Each container runner has a garbage collector. The garbage collector ensures the removal of any pods and secrets with the label `app.kubernetes.io/managed-by=circleci-container-agent` that are left dangling in the cluster. By default, the garbage collector removes all jobs older than five hours and five minutes. This time limit can be shortened or lengthened via the `agent.gc.threshold` parameter.
However, if you do shorten the garbage collection frequency, you must also shorten the maximum task run time via the `agent.maxRunTime` parameter to be a value smaller than the new garbage collection frequency.
+
+CAUTION: If you change the garbage collection threshold but do **not** keep the max task run time lower than the garbage collection frequency, a running task pod could be removed by the garbage collector.

 In addition, GC may remove some objects sooner than this threshold. Task pods have a liveness probe which checks for a running task-agent process. Once a task completes or fails, the task-agent process will stop running and the liveness probe will fail which will trigger GC.

From 9b533321598243fac92185f23d38003f8bac3c32 Mon Sep 17 00:00:00 2001
From: Akil Aikman
Date: Fri, 5 Jul 2024 08:22:33 -0400
Subject: [PATCH 4/4] Update jekyll/_cci2/container-runner.adoc

Co-authored-by: Rosie Yohannan
---
 jekyll/_cci2/container-runner.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/jekyll/_cci2/container-runner.adoc b/jekyll/_cci2/container-runner.adoc
index 654edf826ae..6cc4ba2c4c3 100644
--- a/jekyll/_cci2/container-runner.adoc
+++ b/jekyll/_cci2/container-runner.adoc
@@ -392,7 +392,7 @@ Each container runner has a garbage collector. The garbage collector ensures the
 CAUTION: If you change the garbage collection threshold but do **not** keep the max task run time lower than the garbage collection frequency, a running task pod could be removed by the garbage collector.

-In addition, GC may remove some objects sooner than this threshold. Task pods have a liveness probe which checks for a running task-agent process. Once a task completes or fails, the task-agent process will stop running and the liveness probe will fail which will trigger GC.
+The garbage collector may remove some objects sooner than the threshold. Task pods have a liveness probe that checks for a running task-agent process.
Once a task completes or fails, the task-agent process will stop running and the liveness probe will fail, which will trigger GC.

 Container runner will drain and restart cleanly when sent a termination signal. Container runner will not automatically attempt to launch a task that fails to start. This can be done in the CircleCI web app.
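
Reviewer note: the CAUTION added in this series says the value of `agent.maxRunTime` must stay below the value of `agent.gc.threshold`. The sketch below shows one way to check a pair of values before applying them. It is an illustration only. The helper names `parse_duration` and `max_run_time_is_safe` are hypothetical, and the `5h5m` duration syntax is an assumption about how these values are written, not something the patch confirms.

```python
import re
from datetime import timedelta

# Assumption: durations are written as concatenated <number><unit> tokens,
# for example "5h5m". The patch does not confirm this syntax.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}
_TOKEN = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> timedelta:
    """Parse a simple duration string such as '5h5m' into a timedelta."""
    total = timedelta()
    pos = 0
    for match in _TOKEN.finditer(text):
        # Reject any characters that sit between recognised tokens.
        if match.start() != pos:
            raise ValueError(f"unrecognised duration: {text!r}")
        total += timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})
        pos = match.end()
    # Reject empty input and trailing characters.
    if pos == 0 or pos != len(text):
        raise ValueError(f"unrecognised duration: {text!r}")
    return total


def max_run_time_is_safe(max_run_time: str, gc_threshold: str) -> bool:
    """Return True when the max task run time is strictly below the GC threshold."""
    return parse_duration(max_run_time) < parse_duration(gc_threshold)


if __name__ == "__main__":
    # The documented default threshold is five hours and five minutes.
    print(max_run_time_is_safe("5h", "5h5m"))
    print(max_run_time_is_safe("2h", "1h30m"))
```

The comparison is strict because the documentation asks for a max run time smaller than the threshold, so equal values are treated as unsafe.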