limit memory used by FLP #376

Merged
merged 7 commits into from
Feb 8, 2023

KalmanMeth
Collaborator

@KalmanMeth KalmanMeth commented Feb 5, 2023

FLP's memory usage appears to grow linearly with the number of connections being handled concurrently.
For encode_prom, state is saved (both in FLP and in the Prometheus Go client) for each metric/labels combination that is reported, so memory grows linearly with the number of connections currently identified as active. In a configuration with limited memory resources, it is desirable to be able to instruct FLP to limit its memory usage.
The tradeoff is that new connections are not reported once the memory limit is hit.

The conntrack stage's memory usage also grows with the number of connections, so it should likewise allow limiting the number of connections tracked.
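
To make the tradeoff described above concrete, here is a minimal sketch of a size-capped cache: existing keys keep being updated, while new keys are rejected once the cap is reached. The names boundedCache, maxEntries, and update are hypothetical and chosen for illustration; this is not FLP's TimedCache and may differ from what the PR actually implements.

```go
package main

import "fmt"

// boundedCache is a hypothetical stand-in for per-metric/labels state.
// It is an illustration only, not FLP's actual cache implementation.
type boundedCache struct {
	maxEntries int // assumed convention: 0 means "no limit"
	entries    map[string]float64
}

func newBoundedCache(maxEntries int) *boundedCache {
	return &boundedCache{maxEntries: maxEntries, entries: make(map[string]float64)}
}

// update stores value under key and reports whether it was stored.
// Existing keys are always updated; a new key is rejected once the
// cache already holds maxEntries entries.
func (c *boundedCache) update(key string, value float64) bool {
	if _, exists := c.entries[key]; !exists {
		if c.maxEntries > 0 && len(c.entries) >= c.maxEntries {
			return false // limit hit: do not start tracking a new key
		}
	}
	c.entries[key] = value
	return true
}

func main() {
	c := newBoundedCache(2)
	fmt.Println(c.update("conn-a", 1)) // true
	fmt.Println(c.update("conn-b", 1)) // true
	fmt.Println(c.update("conn-c", 1)) // false: cache is full
	fmt.Println(c.update("conn-a", 2)) // true: existing key is still updated
}
```

With this arrangement memory is bounded by the cap, at the cost of silently missing any connection that first appears while the cache is full.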

@openshift-ci

openshift-ci bot commented Feb 5, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from kalmanmeth by writing /assign @kalmanmeth in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@KalmanMeth KalmanMeth linked an issue Feb 5, 2023 that may be closed by this pull request
@codecov-commenter

codecov-commenter commented Feb 5, 2023

Codecov Report

Merging #376 (23d38f3) into main (bb058e1) will increase coverage by 0.19%.
The diff coverage is 88.05%.

@@            Coverage Diff             @@
##             main     #376      +/-   ##
==========================================
+ Coverage   61.71%   61.91%   +0.19%     
==========================================
  Files          91       91              
  Lines        5835     5873      +38     
==========================================
+ Hits         3601     3636      +35     
- Misses       2014     2016       +2     
- Partials      220      221       +1     
Flag Coverage Δ
unittests 61.91% <88.05%> (+0.19%) ⬆️

Impacted Files Coverage Δ
pkg/api/conntrack.go 87.00% <ø> (ø)
pkg/api/encode_prom.go 100.00% <ø> (ø)
pkg/pipeline/encode/encode_prom.go 77.77% <66.66%> (+0.41%) ⬆️
pkg/test/utils.go 73.91% <81.81%> (+1.25%) ⬆️
pkg/pipeline/extract/aggregate/aggregates.go 90.19% <100.00%> (ø)
pkg/pipeline/extract/conntrack/conntrack.go 92.94% <100.00%> (+0.17%) ⬆️
pkg/pipeline/utils/timed_cache.go 95.74% <100.00%> (+0.14%) ⬆️
... and 1 more

Collaborator

@ronensc ronensc left a comment

LGTM

pkg/pipeline/encode/prom_cache_test.go (outdated, resolved)
pkg/pipeline/utils/timed_cache.go (resolved)
@openshift-ci openshift-ci bot removed the lgtm label Feb 6, 2023
@KalmanMeth KalmanMeth changed the title from "add maxMetrics to encode_prom" to "limit memory used by FLP" Feb 7, 2023
if (ct.config.MaxConnectionsTracked > 0) && (ct.config.MaxConnectionsTracked <= ct.connStore.mom.Len()) {
	log.Warningf("too many connections; skipping flow log %v: ", fl)
	ct.metrics.inputRecords.WithLabelValues("discarded").Inc()
	continue
Collaborator

This continue also skips the if ct.shouldOutputFlowLogs block, which I think should still be executed:

if ct.shouldOutputFlowLogs {
	record := fl.Copy()
	addHashField(record, computedHash.hashTotal)
	addTypeField(record, api.ConnTrackOutputRecordTypeName("FlowLog"))
	outputRecords = append(outputRecords, record)
	ct.metrics.outputRecords.WithLabelValues("flowLog").Inc()
}

One option would be to move that "if" before the if !exists check. Maybe you can find a better solution.
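
The reordering suggested here can be sketched as follows: emit the flow log first, then apply the connection limit, so that the continue no longer bypasses the flow log output. The tracker type, its fields, and the string hash are simplified, hypothetical stand-ins for the conntrack code; this is one possible arrangement and not necessarily what was merged.

```go
package main

import "fmt"

// flowLog and tracker are simplified, hypothetical stand-ins for the
// conntrack types quoted in this review thread.
type flowLog map[string]string

type tracker struct {
	maxConnections int            // 0 means "no limit" (assumed convention)
	outputFlowLogs bool           // mirrors ct.shouldOutputFlowLogs
	connections    map[string]int // hash -> number of flow logs seen
	discarded      int            // flow logs not tracked because of the limit
}

func (t *tracker) extract(flowLogs []flowLog) []flowLog {
	var out []flowLog
	for _, fl := range flowLogs {
		hash := fl["src"] + "->" + fl["dst"]
		// Emit the flow log before the limit check, so the "continue"
		// below cannot skip it.
		if t.outputFlowLogs {
			out = append(out, fl)
		}
		if _, exists := t.connections[hash]; !exists {
			if t.maxConnections > 0 && t.maxConnections <= len(t.connections) {
				t.discarded++ // count the flow log as discarded
				continue      // do not start tracking a new connection
			}
		}
		t.connections[hash]++
	}
	return out
}

func main() {
	t := &tracker{maxConnections: 1, outputFlowLogs: true, connections: map[string]int{}}
	out := t.extract([]flowLog{
		{"src": "a", "dst": "b"},
		{"src": "c", "dst": "d"}, // new connection, over the limit
		{"src": "a", "dst": "b"},
	})
	fmt.Println(len(out), len(t.connections), t.discarded) // 3 1 1
}
```

All three flow logs are still output, while only the first connection is tracked and the second is counted as discarded.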

Collaborator Author

Updated the code to address the suggestion.

@@ -67,6 +67,11 @@ func (ct *conntrackImpl) Extract(flowLogs []config.GenericMap) []config.GenericM
 		}
 		conn, exists := ct.connStore.getConnection(computedHash.hashTotal)
 		if !exists {
+			if (ct.config.MaxConnectionsTracked > 0) && (ct.config.MaxConnectionsTracked <= ct.connStore.mom.Len()) {
+				log.Warningf("too many connections; skipping flow log %v: ", fl)
+				ct.metrics.inputRecords.WithLabelValues("discarded").Inc()
Collaborator

+1 for tracking the discarded flow logs

pkg/api/conntrack.go (outdated, resolved)
@@ -23,6 +23,6 @@ type AggregateOperation string
 type AggregateDefinition struct {
 	Name string `yaml:"name,omitempty" json:"name,omitempty" doc:"description of aggregation result"`
 	GroupByKeys AggregateBy `yaml:"groupByKeys,omitempty" json:"groupByKeys,omitempty" doc:"list of fields on which to aggregate"`
-	OperationType AggregateOperation `yaml:"operationType,omitempty" json:"operationType,omitempty" doc:"sum, min, max, avg or raw_values"`
+	OperationType AggregateOperation `yaml:"operationType,omitempty" json:"operationType,omitempty" doc:"sum, min, max, count, avg or raw_values"`
Collaborator

Good catch!

Collaborator

@ronensc ronensc left a comment

LGTM

@openshift-ci openshift-ci bot added the lgtm label Feb 8, 2023
@KalmanMeth KalmanMeth merged commit b6979fe into netobserv:main Feb 8, 2023
Successfully merging this pull request may close these issues.

Allow to limit the memory used by FLP
3 participants