Add support for reading from UNIX datagram sockets #22699
Pinging @elastic/integrations-services (Team:Services)
	return strings.Contains(err.Error(), windowErrBuffer)
}

func (l *Listener) Run(ctx context.Context) error {
If `Run` returns on error, the input stops hard. Are there some errors we can handle gracefully? For example:
- The unix socket file cannot be opened by the current user => retry with backoff (the user can adjust permissions).
- Another Filebeat instance is listening on the unix socket => retry with backoff (the other instance might go down; HA setup?).
- The UDP port is currently taken. As with the unix socket, if it is taken by another Filebeat instance we might want to retry.
- Do we want to allow the handler to take down the server, e.g. by distinguishing between critical and non-critical errors? Or do we mandate that the handler MUST ONLY return an error if it cannot be recovered from within the handler (IO error)? On an IO error, can we recover by trying to reopen the connection and listen again?
All in all, the knowledge of whether an error is 'fatal' for the input lives in the actual input source implementation, not here. It looks like you just moved the logic from the current UDP server (refactoring). Maybe let us create an issue instead of fixing or improving error handling here. I would presume that we can improve error handling in the TCP and unix streaming inputs as well.
}

l.tg.Go(func(ctx unison.Canceler) error {
	connCtx, connCancel := ctxtool.WithFunc(ctxtool.FromCanceller(ctx), func() {
This looks similar to `Run`. Do we want `Start` to call `Run`, or have a common `func (l *Listener) readLoop(...)`?
What about error handling in `Start`? If the listener fails, would it make sense to retry as in `Run` and only fail initially if the error is really non-recoverable?
> What about error handling in `Start`? If the listener fails, would it make sense to retry as in `Run` and only fail initially if the error is really non-recoverable?

To me, `Start` sounds like a function that tries to start something and returns if it cannot. `Run` is more like constantly retrying to do something.
This is why I wrote "non-recoverable". If the error can be recovered from, I would expect the internal runLoop to keep trying to connect and collect data if possible. Otherwise I would be forced to monitor the health of each input (not really supported yet) and restart Filebeat until it eventually becomes healthy.
If a component implements `Start` and `Run`, I would not expect them to behave differently depending on which method I use (besides `Run` blocking and `Start` returning immediately).
7904cc7 to 29f0f1f
29f0f1f to 8a9f0a8
sock.send(m.encode("utf-8"))
sock = send_over_socket(path,
                        "<13>Oct 11 22:14:15 wopr.mymachine.co postfix/smtpd[2000]:"
                        " 'su root' failed for lonvick on /dev/pts/8 {}\n")
Maybe we want to send more than one event in this test?
Every test sends three events with the same message, each ending with the index of that message. Do you mean that we should send a different message every time we send an event?
Yes. But also consider sending very big messages that might not fit the read buffer. Test the edge cases. Plus, if something fails, it should be more obvious exactly which message failed or was missing.
The test failure looks like the Listener blocks in the waitgroup. A full stack trace for all goroutines is missing, but the streaming listener loop does not seem to unblock, which blocks `Stop` during shutdown.
0a3f972 to 413cd27
jenkins run tests
/test filebeat
## What does this PR do?
This PR adds support for reading from UNIX datagram sockets both from the `unix` input and the `syslog` input. A new option named `socket_type` is added to select the type of the socket. Available options are: `stream` and `datagram`.
## Why is it important?
A few applications which send logs over Unix sockets use datagrams, not streams. From now on, Filebeat can accept input from these applications as well.
Closes elastic#18632
(cherry picked from commit 3a1d1ae)
## What does this PR do?
This PR adds support for reading from UNIX datagram sockets both from the `unix` input and the `syslog` input. A new option named `socket_type` is added to select the type of the socket. Available options are: `stream` and `datagram`.
## Why is it important?
A few applications which send logs over Unix sockets use datagrams, not streams. From now on, Filebeat can accept input from these applications as well.
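As a rough illustration of what the two `socket_type` values correspond to at the socket level, the sketch below maps them onto the network names used by Go's `net` package (`unix` for stream sockets, `unixgram` for datagram sockets). The `networkFor` helper is hypothetical and is not claimed to match how Filebeat parses or validates the option.

```go
package main

import "fmt"

// networkFor maps a socket_type value to the network name that Go's net
// package uses for Unix sockets. The helper itself is hypothetical.
func networkFor(socketType string) (string, error) {
	switch socketType {
	case "stream":
		return "unix", nil
	case "datagram":
		return "unixgram", nil
	default:
		return "", fmt.Errorf("unknown socket_type %q: want stream or datagram", socketType)
	}
}

func main() {
	for _, t := range []string{"stream", "datagram", "raw"} {
		network, err := networkFor(t)
		fmt.Printf("%s -> %q, err: %v\n", t, network, err)
	}
}
```

Rejecting unknown values early keeps a typo in the configuration from silently selecting the wrong socket type.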
## Checklist
- [ ] I have made corresponding change to the default configuration files
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
## Related issues
Closes #18632