Add FLP-based deduper options #591

Draft
wants to merge 2 commits into
base: main

jotak
Member

@jotak jotak commented Mar 19, 2024

Description

FLP-based dedup can significantly decrease Loki CPU, memory and storage usage (~50%), at the cost of a minimal loss in data accuracy (e.g. losing the interfaces involved in egress traffic).

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

openshift-ci bot commented Mar 19, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci bot commented Mar 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jotak. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

const (
	FLPDeduperDisabled FLPDeduperMode = "Disabled"
	FLPDeduperDrop     FLPDeduperMode = "Drop"
	FLPDeduperSample   FLPDeduperMode = "Sample"
Member Author

@jpinsonneau one possibility would be to add a "Merge" mode here that relies on Infinispan, as in your PoC.

Contributor

Sounds good, but should we offer all of these, or support just one or two modes in the end?

Member Author

@jotak jotak Mar 20, 2024

I see pros and cons for every mode, and no clear "winner":

  • Disabled makes sure we get every flow, without losing anything
  • Infinispan-based is similar, but with performance impacts (positive on Loki, negative on FLP), and it adds a new component, so it requires more configuration
  • Drop is the best for overall performance, but loses data
  • Sample offers a compromise between Drop and Disabled, providing statistical samples of dropped flows

codecov bot commented Mar 19, 2024

Codecov Report

Attention: Patch coverage is 29.07801%, with 100 lines in your changes missing coverage. Please review.

Project coverage is 66.63%. Comparing base (efc42ca) to head (5e7bbbd).
Report is 4 commits behind head on main.

Files | Patch % | Lines
controllers/flp/flp_pipeline_builder.go | 30.76% | 44 Missing and 1 partial ⚠️
...s/flowcollector/v1beta1/zz_generated.conversion.go | 8.33% | 22 Missing ⚠️
...pis/flowcollector/v1beta1/zz_generated.deepcopy.go | 0.00% | 14 Missing ⚠️
...pis/flowcollector/v1beta2/zz_generated.deepcopy.go | 0.00% | 13 Missing and 1 partial ⚠️
pkg/helper/flowcollector.go | 66.66% | 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #591      +/-   ##
==========================================
- Coverage   67.38%   66.63%   -0.75%     
==========================================
  Files          65       65              
  Lines        7987     8081      +94     
==========================================
+ Hits         5382     5385       +3     
- Misses       2276     2365      +89     
- Partials      329      331       +2     
Flag | Coverage Δ
unittests | 66.63% <29.07%> (-0.75%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@openshift-merge-robot
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


3 participants