
Fluent Forward Receiver #1173

Merged: 1 commit merged into open-telemetry:master on Jul 22, 2020
Conversation

@keitwb (Contributor) commented Jun 24, 2020

This can receive logs in the [Fluent Forward Protocol](https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1)
via either TCP or Unix sockets. It supports all of the required features of
the spec.

It currently does not support the optional handshake aspect of the protocol,
but will acknowledge chunk options per the spec.

All log entries received within a short time window of each other will be put
into the same ResourceLogs instance before being emitted to the next consumer in
the pipeline. The Resource object is currently not populated in any way.
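The connection-handling shape described above can be sketched in Go. This is a minimal illustration only, not the receiver's actual code: `listenForward` and `handleConn` are hypothetical names, and a line-based "event" stands in for real msgpack-encoded Forward events.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// listenForward accepts connections on either a TCP address ("127.0.0.1:24224")
// or a Unix socket path, mirroring the two transports the receiver supports.
func listenForward(network, addr string) (net.Listener, error) {
	if network != "tcp" && network != "unix" {
		return nil, fmt.Errorf("unsupported transport: %s", network)
	}
	return net.Listen(network, addr)
}

// handleConn is a stand-in for the real Forward-protocol decoder: here it
// just reads one line per "event" so the per-connection flow is visible.
func handleConn(conn net.Conn, events chan<- string) {
	defer conn.Close()
	sc := bufio.NewScanner(conn)
	for sc.Scan() {
		events <- strings.TrimSpace(sc.Text())
	}
}

func main() {
	ln, err := listenForward("tcp", "127.0.0.1:0") // port 0: pick a free port
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	events := make(chan string, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		handleConn(conn, events)
	}()

	// Simulate a client (e.g. fluent-bit) pushing one event.
	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Fprintln(client, "hello")
	client.Close()

	fmt.Println(<-events) // prints "hello"
}
```

Switching the first two arguments to `"unix"` and a socket path is the only change needed for the Unix-socket transport, which is why the two can share one code path.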

@tigrannajaryan (Member) left a comment:

Good start, thanks.
My primary concern is about the usage of Append vs Resize and performance implications of that.

Review threads (resolved): receiver/fluentforwardreceiver/config.go, cmd/pdatagen/internal/base_structs.go
switch ev := event.(type) {
case *protocol.MessageEvent:
lr := convertMessageEvent(ev)
logSlice.Append(lr)
tigrannajaryan (Member):

Using Append which calls Resize may be slow if we have large batches. Worth profiling for large batches (e.g. for thousands of events).
Alternatively, is it possible to use Resize upfront? Initially we know that the size is going to be in the ballpark of len(events), right?
If you use Resize you can't use Append anymore, you will need to use At() and track the index yourself.
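The trade-off being discussed can be illustrated with plain Go slices rather than the actual pdata API: `appendStyle` and `resizeStyle` are hypothetical names, and `logRecord` is a stand-in struct, but the allocation behavior is the same idea as Append vs Resize + At().

```go
package main

import "fmt"

type logRecord struct{ body string }

// appendStyle grows the output as it goes: the backing array may be
// reallocated several times while the batch grows.
func appendStyle(events []string) []logRecord {
	var out []logRecord
	for _, e := range events {
		out = append(out, logRecord{body: e})
	}
	return out
}

// resizeStyle sizes the output upfront (the Resize + At() pattern): one
// allocation for the whole batch, then each element is populated in place,
// tracking the index yourself.
func resizeStyle(events []string) []logRecord {
	out := make([]logRecord, len(events)) // single allocation
	for i, e := range events {
		out[i].body = e // analogous to slice.At(i), then populate
	}
	return out
}

func main() {
	events := []string{"a", "b", "c"}
	fmt.Println(len(appendStyle(events)), len(resizeStyle(events))) // prints "3 3"
}
```

Both produce the same batch; the difference only shows up in allocation counts, which is why profiling with thousands of events is the suggested way to decide.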

keitwb (Contributor, Author):

I changed Append to be as efficient as normal slice append is in Go.

tigrannajaryan (Member):

It is good that Append is fast; however, convertMessageEvent still returns a pointer, so we make a separate allocation per log record. If we call Resize before the for loop there will be a single allocation for the entire batch.

Again, I do not think having an Append is generally a good idea for slice creation. It allows, and even encourages, code that makes more allocations than the Resize version, which forces the allocations to be combined. I may be missing something, please correct me if I am wrong.

We could merge this PR as is, since Append is likely not going to affect performance much in this case. But I am worried that the Append-based approach will bite us in the future and we will not be able to remove Append for compatibility reasons. Because of that, unless you have other reasons to keep Append, I think it is best to remove it, although its impact in this particular case may be negligible.

keitwb (Contributor, Author):

Append is now only used where the size of events cannot be determined upfront. Without Append in those cases, you would have to parse all of the events into some big slice of intermediate data structures and then do the Resize at the end, which would almost certainly result in worse performance due to all of the allocations for intermediate data structures that get thrown away.

Review thread (resolved): receiver/fluentforwardreceiver/conversion.go
}

func convertMessageEvent(me *protocol.MessageEvent) *pdata.LogRecord {
lr := pdata.NewLogRecord()
tigrannajaryan (Member):

NewLogRecord is slow since it does an allocation.
It is faster to accept as a function parameter the LogRecord returned by At() in the caller and populate it here, instead of creating it, returning it, and Append-ing it. This is the intended way to use the API. The current usage will result in suboptimal performance, and it is enabled by Append, so I would drop Append to avoid provoking such usage.

keitwb (Contributor, Author):

The new implementation of Append will perform roughly the same, since it will only call NewLogRecord once now.

Review threads: receiver/fluentforwardreceiver/protocol/types.go (three threads, resolved), receiver/fluentforwardreceiver/receiver.go (resolved), receiver/fluentforwardreceiver/server.go
@tigrannajaryan self-assigned this Jun 29, 2020
@codecov bot commented Jun 30, 2020

Codecov Report

Merging #1173 into master will decrease coverage by 0.43%.
The diff coverage is 76.95%.


@@            Coverage Diff             @@
##           master    #1173      +/-   ##
==========================================
- Coverage   90.50%   90.07%   -0.44%     
==========================================
  Files         219      231      +12     
  Lines       15521    16033     +512     
==========================================
+ Hits        14047    14441     +394     
- Misses       1060     1136      +76     
- Partials      414      456      +42     
Impacted Files Coverage Δ
exporter/exportertest/sink_exporter.go 96.55% <0.00%> (-3.45%) ⬇️
receiver/fluentforwardreceiver/conversion.go 63.39% <63.39%> (ø)
receiver/fluentforwardreceiver/testdata/parse.go 68.42% <68.42%> (ø)
receiver/fluentforwardreceiver/server.go 79.00% <79.00%> (ø)
receiver/fluentforwardreceiver/heartbeat.go 80.00% <80.00%> (ø)
receiver/fluentforwardreceiver/receiver.go 94.11% <94.11%> (ø)
internal/data/common.go 100.00% <100.00%> (ø)
receiver/fluentforwardreceiver/ack.go 100.00% <100.00%> (ø)
receiver/fluentforwardreceiver/collector.go 100.00% <100.00%> (ø)
receiver/fluentforwardreceiver/factory.go 100.00% <100.00%> (ø)
... and 20 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cc89783...3d46580.

@@ -0,0 +1,25 @@
package v1
tigrannajaryan (Member):

I think the opentelemetry-proto-gen directory should contain generated files only. Can we move this file somewhere else?


// Configures the receiver to use TLS.
// The default value is nil, which will cause the receiver to not use TLS.
// TLSCredentials *configtls.TLSSetting `mapstructure:"tls, omitempty"`
tigrannajaryan (Member):

Is this going to be uncommented?

keitwb (Contributor, Author):

No, removing it for now. We can add it back later if people really want TLS support, but given the target use case of localhost traffic it seems excessive.


@keitwb marked this pull request as ready for review July 13, 2020 15:11
@tigrannajaryan (Member) left a comment:

LGTM, except the comment about tests.

Comment on lines 25 to 27
otlpcommon "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/common/v1"
logsproto "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/logs/v1"
otlpresource "go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen/resource/v1"
tigrannajaryan (Member):

It is not advised to use otlp structs directly in component code or in component tests. OTLP structs may change over time (e.g. by using custom gogoproto data types). We provide wrappers for the purpose of encapsulating and isolating component code from such changes.
I highly advise rewriting these tests not to use otlp structs at all. Work with pdata only.

keitwb (Contributor, Author):

Ok, I added a test util to easily construct logs in a declarative manner using the pdata methods only.
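The spirit of such a declarative helper can be sketched with plain values; `Log` and `Logs` here are hypothetical names, not the actual test util, and real pdata construction is deliberately left out.

```go
package main

import "fmt"

// Log is a hypothetical declarative description of an expected log record:
// tests state plain values and a helper assembles the real structures,
// keeping OTLP structs out of the tests entirely.
type Log struct {
	Body  string
	Attrs map[string]string
}

// Logs groups expected records so a test reads as one declarative literal.
func Logs(logs ...Log) []Log { return logs }

func main() {
	expected := Logs(
		Log{Body: "got a log", Attrs: map[string]string{"fluent.tag": "mytag"}},
		Log{Body: "another", Attrs: map[string]string{"fluent.tag": "mytag"}},
	)
	fmt.Println(len(expected), expected[0].Attrs["fluent.tag"]) // prints "2 mytag"
}
```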

Review thread (resolved): receiver/fluentforwardreceiver/factory.go
@keitwb (Contributor, Author) commented Jul 20, 2020

Also @tigrannajaryan: wherever the size of the log records can be predetermined I got rid of the use of Append, but it is still used in a few places where the event count cannot be determined upfront due to the msgpack protocol (e.g. packed forward events).

@tigrannajaryan (Member) left a comment:

LGTM, great job.
Do you feel that more coverage is needed anywhere?

@tigrannajaryan merged commit d38076e into open-telemetry:master Jul 22, 2020
@bogdandrutu (Member) commented:

@tigrannajaryan please make sure new code is added with the right coverage. It was a hard effort for me to improve it to this level.

@keitwb deleted the fluentbit-forward-receiver branch July 22, 2020 15:14
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023