node-collector schedule prevention on aws fargate #3710

Closed
YevhenVieskov opened this issue Feb 28, 2023 · 5 comments · Fixed by #4459
Labels: kind/feature, lifecycle/active, target/kubernetes

Comments

@YevhenVieskov

YevhenVieskov commented Feb 28, 2023

Description

I have node-collector pods in the Pending state. They are scheduled on AWS Fargate.
I added nodeAffinity to the Kubernetes manifest to prevent the pods from being scheduled on Fargate:

 ---
 apiVersion: helm.toolkit.fluxcd.io/v2beta1
 kind: HelmRelease
 metadata:
   name: trivy-operator
   namespace: trivy-system
 spec:
   chart:
     spec:
       chart: trivy-operator
       version: 0.11.1
       sourceRef:
         kind: HelmRepository
         name: aqua
         namespace: flux-system
       interval: 1m0s
       values:
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: eks.amazonaws.com/compute-type
                   operator: NotIn
                   values:
                   - fargate
         targetNamespaces: "mynamespace"
         operator:
           metricsVulnIdEnabled: true
           scannerReportTTL: "48h"
           scanJobsConcurrentLimit: 5
         trivyOperator:
           scanJobNodeSelector:
             beta.kubernetes.io/instance-type: m5.large
         trivy:
           offlineScan: true

What did you expect to happen?

I expected that node-collector pods would not be scheduled on AWS Fargate.

What happened instead?

node-collector pods are scheduled on AWS Fargate. They are in the Pending state.

How can I prevent node-collector pods from being scheduled on AWS Fargate?

@YevhenVieskov added the kind/bug label Feb 28, 2023
@knqyf263 added the target/kubernetes label Feb 28, 2023
@chen-keinan
Contributor

chen-keinan commented Mar 1, 2023

@YevhenVieskov A node toleration has been added to the node-collector job; it will be released with v0.38.0:

// Tolerations applied to the node-collector job.
tolerations := []corev1.Toleration{
	{
		// No key is set, so this tolerates every NoSchedule taint.
		Effect:   corev1.TaintEffectNoSchedule,
		Operator: corev1.TolerationOperator(corev1.NodeSelectorOpExists),
	},
	{
		// No key is set, so this tolerates every NoExecute taint.
		Effect:   corev1.TaintEffectNoExecute,
		Operator: corev1.TolerationOperator(corev1.NodeSelectorOpExists),
	},
	{
		// Tolerate a not-ready node for 300 seconds.
		Effect:            corev1.TaintEffectNoExecute,
		Key:               "node.kubernetes.io/not-ready",
		Operator:          corev1.TolerationOperator(corev1.NodeSelectorOpExists),
		TolerationSeconds: pointer.Int64(300),
	},
	{
		// Tolerate an unreachable node for 300 seconds.
		Effect:            corev1.TaintEffectNoExecute,
		Key:               "node.kubernetes.io/unreachable",
		Operator:          corev1.TolerationOperator(corev1.NodeSelectorOpExists),
		TolerationSeconds: pointer.Int64(300),
	},
}

Please check again with v0.38.0 and report back whether it solves the problem.
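
For illustration, here is a minimal sketch of how a toleration is matched against a taint. It is not the Trivy or Kubernetes source: the `Taint` and `Toleration` types are simplified local stand-ins for the `corev1` types, and the `dedicated=gpu` taint is a made-up example. It follows the documented rules that an empty effect matches any effect and an empty key with operator `Exists` matches any key.

```go
package main

import "fmt"

// Taint and Toleration are simplified stand-ins for the corev1 types,
// defined locally so the sketch runs without Kubernetes dependencies.
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

// tolerates reports whether one toleration matches one taint.
func tolerates(tol Toleration, taint Taint) bool {
	// An empty effect on the toleration matches every effect.
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	// An empty key on the toleration matches every key.
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return tol.Value == taint.Value
	}
	return false
}

func main() {
	// Same shape as the first toleration in the snippet above.
	anyNoSchedule := Toleration{Operator: "Exists", Effect: "NoSchedule"}

	// Hypothetical taints, used only for this example.
	gpu := Taint{Key: "dedicated", Value: "gpu", Effect: "NoSchedule"}
	soft := Taint{Key: "dedicated", Value: "gpu", Effect: "PreferNoSchedule"}

	fmt.Println(tolerates(anyNoSchedule, gpu))  // true
	fmt.Println(tolerates(anyNoSchedule, soft)) // false: the effect differs
}
```

Note that a toleration only permits a pod to land on a tainted node. It does not, by itself, keep the pod away from any node.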

@YevhenVieskov
Author

YevhenVieskov commented Mar 7, 2023

I updated the trivy-operator version.
I have a label on the Fargate node:
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/fargate_profile/main.tf

 fargate_profiles = {
   default = {
     selectors = [
       {
         namespace = "fargate"
         labels = {
           Application = "backend"
         }
       }
     ]
   }
 }

I added the label kubernetes.io/hostname: "ip-*" to scanJobNodeSelector.

https://artifacthub.io/packages/helm/trivy-operator/trivy-operator/

trivyOperator:
  scanJobNodeSelector:    
    kubernetes.io/hostname: "ip-*"

I can't create taints for a Fargate profile in Terraform. How can I use
the node toleration for the node-collector job?
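
For illustration, a Kubernetes nodeSelector compares label values for exact equality, so a value such as "ip-*" is treated as a literal string rather than a wildcard pattern. The sketch below models that behaviour with plain maps. The function name `matchesNodeSelector` and the hostname are assumptions made up for this example, not part of any Kubernetes or Trivy API.

```go
package main

import "fmt"

// matchesNodeSelector reports whether every key/value pair in selector is
// present, with an identical value, in the node's labels.
func matchesNodeSelector(selector, nodeLabels map[string]string) bool {
	for key, want := range selector {
		got, ok := nodeLabels[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical node labels, used only for this example.
	node := map[string]string{
		"kubernetes.io/hostname": "ip-10-0-1-23.ec2.internal",
	}

	wildcard := map[string]string{"kubernetes.io/hostname": "ip-*"}
	exact := map[string]string{"kubernetes.io/hostname": "ip-10-0-1-23.ec2.internal"}

	fmt.Println(matchesNodeSelector(wildcard, node)) // false: "ip-*" is compared literally
	fmt.Println(matchesNodeSelector(exact, node))    // true
}
```

Under this model a selector containing "ip-*" matches no node unless a node literally carries that label value, which would leave the scan job pods unschedulable.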

@github-actions

github-actions bot commented May 7, 2023

This issue is stale because it has been labeled with inactivity.

@github-actions bot added the lifecycle/stale label May 7, 2023
@knqyf263 added the lifecycle/active label and removed the lifecycle/stale label May 7, 2023
@chen-keinan
Contributor

@YevhenVieskov How do you define the taint on the node?

@chen-keinan
Contributor

Adding support for excluding nodes by label.
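
As a rough sketch of what excluding nodes by label could look like, the code below filters a node list against a set of excluded label pairs. It is not the implementation that was merged. The `Node` type, the `excludeByLabels` function and the node names are hypothetical, and only the `eks.amazonaws.com/compute-type: fargate` label is taken from the manifest earlier in this thread.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a cluster node.
type Node struct {
	Name   string
	Labels map[string]string
}

// excludeByLabels returns the nodes that carry none of the excluded
// label key/value pairs.
func excludeByLabels(nodes []Node, exclude map[string]string) []Node {
	var kept []Node
	for _, n := range nodes {
		skip := false
		for key, value := range exclude {
			if got, ok := n.Labels[key]; ok && got == value {
				skip = true
				break
			}
		}
		if !skip {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	// Hypothetical nodes, used only for this example.
	nodes := []Node{
		{Name: "ec2-node", Labels: map[string]string{"eks.amazonaws.com/compute-type": "ec2"}},
		{Name: "fargate-node", Labels: map[string]string{"eks.amazonaws.com/compute-type": "fargate"}},
		{Name: "unlabeled-node", Labels: map[string]string{}},
	}
	exclude := map[string]string{"eks.amazonaws.com/compute-type": "fargate"}

	// Only nodes without the excluded label pair would get a collector job.
	for _, n := range excludeByLabels(nodes, exclude) {
		fmt.Println(n.Name)
	}
}
```

Filtering on the collector side avoids relying on taints, which the thread notes cannot be set on a Fargate profile through Terraform.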

@chen-keinan added the kind/feature label and removed the kind/bug label Jun 1, 2023