Add ingest stage to provide synthetic workload for benchmarks #395

Merged
KalmanMeth merged 8 commits into netobserv:main on Mar 28, 2023

Conversation

KalmanMeth
Collaborator

@KalmanMeth KalmanMeth commented Feb 27, 2023

This PR introduces the ingest_synthetic stage, which provides a steady stream of flow logs for a configured number of simulated connections. It can be used to control the input flows in order to benchmark stages and measure their memory usage.
To configure the stage, add the following fields to the config file:

parameters:
- name: ingest_syn
  ingest:
    type: synthetic
    synthetic:
      connections: <nnn>
      batchMaxLen: <nnn>
      flowLogsPerMin: <nnn>

@KalmanMeth KalmanMeth requested a review from ronensc March 1, 2023 11:09
@codecov

codecov bot commented Mar 5, 2023

Codecov Report

Merging #395 (1db643a) into main (99a2598) will increase coverage by 0.29%.
The diff coverage is 88.75%.

@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
+ Coverage   64.09%   64.39%   +0.29%     
==========================================
  Files          92       94       +2     
  Lines        6501     6560      +59     
==========================================
+ Hits         4167     4224      +57     
- Misses       2094     2096       +2     
  Partials      240      240              
Flag        Coverage Δ
unittests   64.39% <88.75%> (+0.29%) ⬆️


Impacted Files                            Coverage Δ
pkg/pipeline/pipeline_builder.go          70.60% <0.00%> (-0.46%) ⬇️
pkg/test/utils.go                         72.66% <ø> (-1.26%) ⬇️
pkg/pipeline/utils/connections.go         85.71% <85.71%> (ø)
pkg/pipeline/ingest/ingest_synthetic.go   93.75% <93.75%> (ø)
pkg/config/config.go                      66.66% <100.00%> (+3.33%) ⬆️

... and 1 file with indirect coverage changes


@KalmanMeth KalmanMeth linked an issue Mar 6, 2023 that may be closed by this pull request
Comment on lines 44 to 49
metricsProcessed = operational.DefineMetric(
	"ingest_synthetic_flows_processed",
	"Number of metrics processed",
	operational.TypeCounter,
	"stage",
)
Collaborator

I think the variable name and the help string should be updated to reflect that this counter tracks processed flow logs rather than metrics.

Comment on lines 52 to 53
// IngestSynthetic Ingest generates flow logs according to provided parameters
func (ingestS *IngestSynthetic) Ingest(out chan<- config.GenericMap) {
Collaborator

Suggested change
- // IngestSynthetic Ingest generates flow logs according to provided parameters
- func (ingestS *IngestSynthetic) Ingest(out chan<- config.GenericMap) {
+ // Ingest generates flow logs according to provided parameters
+ func (ingestS *IngestSynthetic) Ingest(out chan<- config.GenericMap) {

Collaborator Author

OK

next := 0

// compute time interval between batches
ticker := time.NewTicker(time.Duration(int(time.Minute*time.Duration(ingestS.params.BatchMaxLen)) / ingestS.params.FlowLogsPerMin))
Collaborator

The computation of the interval between batches looks complex to me.
I don't think I understand it.
What does it mean to create a Duration from BatchMaxLen and multiply it by time.Minute?

time.Duration(ingestS.params.BatchMaxLen)

Collaborator Author

added a comment


// Start collecting flows from the ingester and ensure we have the specified number of distinct connections
connectionMap := make(map[connection]int)
for i := 0; i < (3 * connections); i++ {
Collaborator

Is the hardcoded 3 the batchMaxLen, or just an arbitrary number to make sure we have multiple flow logs per connection?

Collaborator Author

It's just to have (many) more flow logs than connections, and to verify that we accumulate the proper number of connections with multiple flow logs per connection.
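
The check described in this reply can be sketched roughly as follows: collect more flow logs than there are connections, key them by connection in a map, and verify both the number of distinct keys and the per-connection counts. The connection fields and the generate helper are assumptions made for this sketch, not the PR's test code.

```go
package main

import "fmt"

// connection is a simplified stand-in for the map key used in the test;
// the field names are assumptions made for this sketch.
type connection struct {
	srcAddr, dstAddr string
	srcPort, dstPort int
}

// generate returns n flow-log keys that cycle round-robin over the given
// number of distinct connections (a hypothetical substitute for reading
// from the synthetic ingester).
func generate(connections, n int) []connection {
	flows := make([]connection, 0, n)
	for i := 0; i < n; i++ {
		id := i % connections
		flows = append(flows, connection{
			srcAddr: fmt.Sprintf("10.0.0.%d", id),
			dstAddr: "10.0.1.1",
			srcPort: 1000 + id,
			dstPort: 80,
		})
	}
	return flows
}

// countPerConnection tallies how many flow logs were seen per connection.
func countPerConnection(flows []connection) map[connection]int {
	counts := make(map[connection]int)
	for _, f := range flows {
		counts[f]++
	}
	return counts
}

func main() {
	const connections = 4
	// Three times as many flow logs as connections, as in the test loop.
	counts := countPerConnection(generate(connections, 3*connections))
	fmt.Println(len(counts)) // number of distinct connections observed
}
```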

Comment on lines 101 to 120
jsonIngestSynthetic := api.IngestSynthetic{}
// Check both pointers before dereferencing params.Ingest.Synthetic
if params.Ingest != nil && params.Ingest.Synthetic != nil {
	jsonIngestSynthetic = *params.Ingest.Synthetic
}
if jsonIngestSynthetic.Connections == 0 {
	jsonIngestSynthetic.Connections = defaultConnections
}
if jsonIngestSynthetic.FlowLogsPerMin == 0 {
	jsonIngestSynthetic.FlowLogsPerMin = defaultFlowLogsPerMin
}
if jsonIngestSynthetic.BatchMaxLen == 0 {
	jsonIngestSynthetic.BatchMaxLen = defaultBatchLen
}
Collaborator

I wouldn't use "json" as part of the variable name of jsonIngestSynthetic because it's not limited to json only. Maybe "conf" could replace it?
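
For illustration, a self-contained sketch of the defaulting logic with the suggested "conf" naming. The types are simplified stand-ins for the project's config structs and the default values are placeholders, not the PR's actual constants. The sketch joins the two nil checks with &&, so a missing ingest section is never dereferenced.

```go
package main

import "fmt"

// Simplified stand-ins for the project's config types (assumptions).
type ingestSynthetic struct {
	Connections    int
	BatchMaxLen    int
	FlowLogsPerMin int
}

type ingest struct{ Synthetic *ingestSynthetic }

type stageParam struct{ Ingest *ingest }

// Placeholder defaults chosen for this sketch only.
const (
	defaultConnections    = 100
	defaultBatchLen       = 10
	defaultFlowLogsPerMin = 2000
)

// resolveConf copies the user-supplied settings, if any, and replaces
// zero-valued fields with defaults.
func resolveConf(params stageParam) ingestSynthetic {
	conf := ingestSynthetic{}
	// Both pointers must be non-nil before the inner one is dereferenced.
	if params.Ingest != nil && params.Ingest.Synthetic != nil {
		conf = *params.Ingest.Synthetic
	}
	if conf.Connections == 0 {
		conf.Connections = defaultConnections
	}
	if conf.FlowLogsPerMin == 0 {
		conf.FlowLogsPerMin = defaultFlowLogsPerMin
	}
	if conf.BatchMaxLen == 0 {
		conf.BatchMaxLen = defaultBatchLen
	}
	return conf
}

func main() {
	// No ingest section at all: every field falls back to its default.
	fmt.Printf("%+v\n", resolveConf(stageParam{}))
	// Only connections set: the other two fields are defaulted.
	partial := stageParam{Ingest: &ingest{Synthetic: &ingestSynthetic{Connections: 5}}}
	fmt.Printf("%+v\n", resolveConf(partial))
}
```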

Collaborator

@ronensc ronensc left a comment

Thanks for addressing my feedback and adding the comments

type IngestSynthetic struct {
	params           api.IngestSynthetic
	exitChan         <-chan struct{}
	metricsProcessed prometheus.Counter
Collaborator

Please rename metricsProcessed here as well

Comment on lines 79 to 94
for flowsLeft > 0 {
	remainder := nLogs - next
	if subBatchLen > remainder {
		subBatchLen = remainder
	}
	log.Debugf("flowsLeft = %d, remainder = %d, subBatchLen = %d", flowsLeft, remainder, subBatchLen)
	subBatch := flowLogs[next : next+subBatchLen]
	ingestS.sendBatch(subBatch, out)
	ingestS.metricsProcessed.Add(float64(subBatchLen))
	flowsLeft -= subBatchLen
	next += subBatchLen
	if subBatchLen == remainder {
		next = 0
		subBatchLen = flowsLeft
	}
}
Collaborator

I explored a different approach to achieve the goal of this loop:
https://github.com/ronensc/flowlogs-pipeline/blob/4266dc2e507e89c007a521d6eaeb7bc34de64abd/pkg/pipeline/ingest/ingest_synthetic.go#L73-L81

Please let me know if it is indeed equivalent and if you think it is clearer.
Since we loop over the sub-batch anyway in sendBatch() I incorporated it into the loop and removed sendBatch().

Collaborator Author

Your suggestion is much simpler and cleaner. Done.
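
The general shape of such a simplification might look like the sketch below: instead of slicing sub-batches, send flow logs one at a time and wrap the index around the pre-generated slice. This illustrates the idea only and is not necessarily the code at the linked commit. The emitBatch helper and the use of strings in place of config.GenericMap are assumptions.

```go
package main

import "fmt"

// emitBatch is a hypothetical helper. It sends batchLen entries from
// flowLogs to out, starting at index next and wrapping around at the end of
// the slice. It returns the index at which the following batch should start.
func emitBatch(flowLogs []string, next, batchLen int, out chan<- string) int {
	if len(flowLogs) == 0 {
		return next // nothing to send; also avoids a modulo by zero
	}
	for i := 0; i < batchLen; i++ {
		out <- flowLogs[next]
		next = (next + 1) % len(flowLogs)
	}
	return next
}

func main() {
	flowLogs := []string{"a", "b", "c"}
	out := make(chan string, 5) // buffered so the sketch needs no consumer goroutine
	next := emitBatch(flowLogs, 0, 5, out)
	close(out)
	for fl := range out {
		fmt.Print(fl, " ")
	}
	fmt.Println(next)
}
```

Because each flow log is sent individually, a per-log counter increment can sit inside the same loop, which is what makes a separate sendBatch() unnecessary.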

@openshift-ci

openshift-ci bot commented Mar 28, 2023

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:


@KalmanMeth KalmanMeth merged commit a31a0c0 into netobserv:main Mar 28, 2023
Successfully merging this pull request may close these issues.

connection focused flowlogs simulator
2 participants