Use system-wide configmap instead of configuration via cli args #446

Closed
pmengelbert opened this issue Oct 28, 2022 · 7 comments · Fixed by #581
Labels
enhancement New feature or request
Comments

@pmengelbert
Contributor

Describe the solution you'd like
Managing CLI args is getting complex. A configmap should replace some of the manager's CLI flags, for example --collector-pull-policy and --collector-arg. The list of flags will only grow, and managing them through the helm chart is becoming more and more difficult.

Because options for the scanner/eraser/collector containers are specified in code (and thus harder to template), they are currently passed in as CLI flags on the manager.

A well-structured and well-documented configmap should be implemented. The manager should read this configuration at boot, and it will also need to respond to changes if the user updates the configmap while the manager is already deployed.

  • Eraser version: v0.4.0
@pmengelbert
Contributor Author

A rough draft of what the configmap could look like:

# eraser-config.yml
---
eraser:
  image: string
  pullPolicy: corev1.PullPolicy
  runtime: string, "containerd" | "dockershim" | "cri-o" # default "containerd"
  imagelist: string # points to imagelist used for deletion; set to empty when using collector
  profile:
    enable: bool
    port: int32
collector:
  image: string
  pullPolicy: corev1.PullPolicy
  runtime: string, "containerd" | "dockershim" | "cri-o" # default "containerd"
  scanDisabled: bool
  profile:
    enable: bool
    port: int32
scanner:
  image: string
  pullPolicy: corev1.PullPolicy
  profile:
    enable: bool
    port: int32
  cpu:
    request: string
    limit: string
  memory:
    request: string
    limit: string
  scanOptions:
    ignoreUnfixed: bool
    deleteIfScanFailed: bool
    securityChecks:
      - string
      - string
    severities:
      - string
      - string
    vulnerabilityTypes:
      - string
      - string

@pmengelbert
Contributor Author

pmengelbert commented Nov 1, 2022

A rough proposal of how these values would propagate to their respective components:

  • Before creating an ImageJob, the ImageJob controller would check the eraser-config (name tbd) configmap in the same namespace as the manager pod.
  • The ImageJob controller would use the configmap to set the image and pullPolicy pieces of the containers it specifies in code.
  • The rest of the values will be propagated to the eraser, collector, and scanner containers by mounting the configmap into a well-known path for each container.
  • The eraser, collector, and scanner containers each have a configuration struct that matches the schema presented above. It is initialized first with default values.
  • After initializing the default config struct, each container will then check the well-known path for a config file. If present, those values will override the default values in the config struct.
  • If values are optional in the configmap, then a value which is not present will mean "use the default value".
  • In the documentation, values will be clearly labeled as required or optional.
  • The default values will be well documented along with the schema.
  • All pull requests that modify the schema will have a blocking requirement of updating the documentation.
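
The "defaults first, then override from the mounted file" steps above can be sketched roughly as follows. This is only an illustration, not the project's implementation: the `Config` and `Profile` types, the function names, and the default port are hypothetical, and JSON stands in for YAML so that the sketch needs nothing outside the Go standard library. It relies on the fact that `json.Unmarshal` only assigns fields that are present in the input, so anything omitted keeps its default.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// Profile and Config are hypothetical types covering a small subset of the
// drafted schema; they are not taken from the eraser code base.
type Profile struct {
	Enable bool  `json:"enable"`
	Port   int32 `json:"port"`
}

type Config struct {
	Runtime string  `json:"runtime"`
	Profile Profile `json:"profile"`
}

// defaultConfig returns the values used when the file omits a field.
// The concrete defaults here are assumptions made for the example.
func defaultConfig() Config {
	return Config{Runtime: "containerd", Profile: Profile{Enable: false, Port: 6060}}
}

// applyOverrides decodes data on top of cfg. Fields absent from data keep
// the values they already had, which is what "use the default" means here.
func applyOverrides(cfg Config, data []byte) (Config, error) {
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

// loadConfig checks a well-known path. A missing file is not an error:
// the defaults are returned unchanged.
func loadConfig(path string) (Config, error) {
	cfg := defaultConfig()
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return cfg, nil
	}
	if err != nil {
		return Config{}, err
	}
	return applyOverrides(cfg, data)
}

func main() {
	// Only profile.enable is supplied; runtime and profile.port stay default.
	cfg, err := applyOverrides(defaultConfig(), []byte(`{"profile": {"enable": true}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Running this prints `{Runtime:containerd Profile:{Enable:true Port:6060}}`: the one supplied value is overridden and the rest fall back to defaults. A real implementation would decode YAML and would also need to decide how to treat unknown keys.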

@pmengelbert
Contributor Author

pmengelbert commented Nov 1, 2022

The ImageJob Controller need not reconcile on updates to the configmap. The configmap will simply be read once before creating the collector pods or the eraser pods.

@pmengelbert pmengelbert changed the title feat: System-wide configmap instead of configuration via cli args Use system-wide configmap instead of configuration via cli args Nov 1, 2022
@sozercan
Member

sozercan commented Nov 1, 2022

https://book.kubebuilder.io/component-config-tutorial/define-config.html is it possible to use this? i am guessing not since this is on the manager level?

@pmengelbert
Contributor Author

I'll have to take a closer look, but it may be possible to use that. If possible, it would be preferable because it's a versioned schema. Looks like you can create a custom config type.

@pmengelbert
Contributor Author

In the course of PR #555, it came up that we should use this configmap to enable the setting of cpu limits/requests on the collector/scanner/eraser pods: #555 (comment)

@pmengelbert
Contributor Author

pmengelbert commented Jan 9, 2023

Open questions

  • How to propagate values to container: mount the whole configmap as a file?
  • Integrate exclusion list configmap, or keep separate?

Proposed Schema

---
runtime: containerd
otlpEndpoint: ""
logLevel: info
scheduling:
  repeatInterval: 24h # to be parsed into time.Duration
  beginImmediately: true
profile:
  enable: false
  port: 6060
imageJob:
  successRatio: 1.0 # float; ok with YAML?
  cleanup:
    delayOnSuccess: 0s # to be parsed into time.Duration
    delayOnFailure: 24h # to be parsed into time.Duration; Go has no "d" unit
pullSecrets: [] # image pull secrets for collector/scanner/eraser
nodeFilter:
  type: exclude # must be either exclude|include
  selectors:
    - eraser.sh/cleanup.filter
components:
  collector:
    enable: false
    image:
      repo: ghcr.io/azure/eraser/collector
      tag: latest
    request:
      cpu: 1000m
      mem: 500Mi
    limit:
      cpu: 1500m
      mem: 2Gi
  scanner:
    enable: false
    image:
      repo: ghcr.io/azure/eraser/trivy-scanner # supply custom image for custom scanner
      tag: latest
    request:
      cpu: 1000m
      mem: 500Mi
    limit:
      cpu: 1500m
      mem: 2Gi
    # The config needs to be passed through to the scanner as yaml, as a
    # single string. Because we allow custom scanner images, the scanner is
    # responsible for defining a schema, parsing, and validating.
    config: |
      # This is the schema for the default 'trivy-scanner'. We should document
      # it because most users will probably be using the default scanner.
      cacheDir: /var/lib/trivy
      dbRepo: ghcr.io/aquasecurity/trivy-db
      deleteFailedImages: true
      vulnerabilities:
        ignoreUnfixed: true
        types:
          - os
          - library
      securityChecks: # need to be documented; determined by trivy, not us
        - vuln
      severities:
        - CRITICAL
  eraser:
    image:
      repo: ghcr.io/azure/eraser/eraser
      tag: latest
    request:
      cpu: 1000m
      mem: 500Mi
    limit:
      cpu: 1500m
      mem: 2Gi
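
Two of the fields in this schema carry constraints that would have to be checked when the config is loaded: the interval and delay values are meant to be parsed into `time.Duration`, and `nodeFilter.type` must be either `exclude` or `include`. A minimal validation sketch, with hypothetical function names and error messages, could look like this. Go's `time.ParseDuration` accepts units up to hours only, so a value such as `1d` is rejected while `24h` is accepted.

```go
package main

import (
	"fmt"
	"time"
)

// parseInterval parses a duration-valued field such as repeatInterval or
// delayOnSuccess. The field name is only used to label the error.
// This helper is hypothetical, not part of the eraser code base.
func parseInterval(field, value string) (time.Duration, error) {
	d, err := time.ParseDuration(value)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", field, err)
	}
	return d, nil
}

// validateFilterType enforces the "exclude|include" constraint noted in the
// schema comment for nodeFilter.type.
func validateFilterType(t string) error {
	switch t {
	case "exclude", "include":
		return nil
	default:
		return fmt.Errorf("nodeFilter.type: %q is not one of exclude|include", t)
	}
}

func main() {
	// "24h" and "0s" parse; "1d" fails because "d" is not a known unit.
	for _, v := range []string{"24h", "0s", "1d"} {
		d, err := parseInterval("example", v)
		fmt.Println(v, d, err)
	}
	fmt.Println(validateFilterType("exclude"), validateFilterType("skip"))
}
```

Whether validation belongs in the manager or in each component is left open here. The scanner's pass-through `config` string, for instance, can only be validated by the scanner itself.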

Checklist of CLI args

manager:

  • [n/a] -collector-arg value
  • -collector-image string
  • [leave it alone] -config string
  • -delete-scan-failed-images
  • -enable-pprof
  • [remove] -eraser-arg value
  • -eraser-image string
  • -filter-nodes string
  • -filter-nodes-selector value
  • [omitted for now] -health-probe-bind-address string
  • -job-cleanup-on-error-delay duration
  • -job-cleanup-on-success-delay duration
  • -job-success-ratio float
  • [??? kubebuilder] -kubeconfig string
  • [??? kubebuilder] -leader-elect
  • -log-level string
  • [??? kubebuilder] -metrics-bind-address string
  • -otlp-endpoint string
  • -pprof-port int
  • -repeat-period duration
  • [n/a] -scanner-arg value
  • -scanner-cpu-limit string
  • -scanner-cpu-request string
  • -scanner-image string
  • -scanner-mem-limit string
  • -scanner-mem-request string
  • -schedule-immediate

collector:

  • [propagated] -enable-pprof
  • [??? kubebuilder] -kubeconfig string
  • [propagated] -log-level string
  • [propagated] -pprof-port int
  • [propagated] -runtime string
  • [propagated] -scan-disabled

trivy-scanner:

  • -cache-dir string
  • -db-repository string
  • -delete-scan-failed-images
  • [propagated] -enable-pprof
  • -ignore-unfixed
  • [??? kubebuilder] -kubeconfig string
  • [propagated] -log-level string
  • [propagated] -pprof-port int
  • [should be removed] -rekor-url string
  • -security-checks string
  • -severity string
  • -vuln-type string

eraser:

  • [propagated] -enable-pprof
  • [propagated] -imagelist string
  • [??? kubebuilder] -kubeconfig string
  • [propagated] -log-level string
  • [propagated] -pprof-port int
  • [propagated] -runtime string
