[filebeat][streaming] - Added support for TLS & forward proxy configs for websockets #41934
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations) |
This pull request is now in conflicts. Could you fix it? 🙏
@@ -349,6 +349,32 @@ The minimum time to wait between retries. This ensures that retries are spaced o

The maximum time to wait between retries. This prevents the retry mechanism from becoming too slow, ensuring that the client does not wait indefinitely between retries. This is crucial in systems where timeouts or user experience are critical. For example, `wait_max` might be set to 10 seconds, meaning that even if the calculated backoff is greater than this, the client will wait at most 10 seconds before retrying.

[float]
==== `handshake_timeout`
This specifies the time to wait for the `websocket` handshake and the `http.Upgrade()` operation to complete. This timeout occurs at the application layer and not at the TCP layer. The `default value` is `20` seconds. This setting is specific to the `websocket` streaming input type only.
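To illustrate the `wait_max` cap described in the hunk above, here is a minimal Go sketch. It assumes the backoff doubles on each attempt starting from `wait_min`; the hunk only says "calculated backoff", so the doubling strategy and the `cappedBackoff` name are assumptions for illustration, not the input's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// cappedBackoff returns the wait before retry number attempt (0-based).
// It doubles waitMin on every attempt (an assumed strategy, not confirmed
// by the documentation) and never returns more than waitMax.
func cappedBackoff(attempt int, waitMin, waitMax time.Duration) time.Duration {
	wait := waitMin
	for i := 0; i < attempt; i++ {
		wait *= 2
		// Returning as soon as the cap is reached also avoids overflow
		// for very large attempt counts.
		if wait >= waitMax {
			return waitMax
		}
	}
	if wait > waitMax {
		return waitMax
	}
	return wait
}

func main() {
	// With wait_min = 1s and wait_max = 10s the wait grows until it hits the cap.
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println(attempt, cappedBackoff(attempt, time.Second, 10*time.Second))
	}
}
```

With `wait_max` at 10 seconds, every attempt whose computed backoff would exceed the cap waits exactly 10 seconds, which matches the example in the documentation text.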
- This specifies the time to wait for the `websocket` handshake and the `http.Upgrade()` operation to complete. This timeout occurs at the application layer and not at the TCP layer. The `default value` is `20` seconds. This setting is specific to the `websocket` streaming input type only.
+ This specifies the time to wait for the WebSocket handshake and protocol upgrade to complete. This timeout occurs at the application layer and not at the TCP layer. The `default value` is `20` seconds. This setting is specific to the `websocket` streaming input type only.
Referring to a call is unhelpful to users; discuss the intent, not the mechanism, in user-facing documentation. Also, it was incorrectly referring to `http.Upgrade`, which does not exist. It is used by `websocket.Upgrader.Upgrade`, but we don't call that except in testing, instead using `websocket.Dialer.DialContext` (should we refer to upgrading at all? I think this is an implementation detail and only tenuously relevant). Note that the lib's default is 45s; do we want to match that or is there a reason to use 20s?
I'm curious, is it meaningful to have a handshake timeout that is less than the TCP timeout? Looking at the implementation of the client, the handshake timeout is used to construct a deadline context that must lose the race to return from `DialContext`, and this context is passed into the `net.Dialer.DialContext` call, meaning that obtaining the netconn must fit within this deadline. Where does the `timeout` below come in?
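The interaction described in this comment can be sketched with the standard library alone. This is a minimal sketch, not the gorilla or Filebeat code: `effectiveTimeout` and `dialBounded` are hypothetical names, and the context timeout stands in for the handshake timeout. It relies on the documented behaviour that `net.Dialer.DialContext` honours whichever comes first, the context deadline or the dialer's own `Timeout`.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// effectiveTimeout reports how long obtaining the connection may take when a
// handshake timeout (applied as a context deadline) and a net.Dialer timeout
// both apply: the earlier limit wins. Zero means "no limit" for either value.
func effectiveTimeout(handshake, dial time.Duration) time.Duration {
	switch {
	case handshake == 0:
		return dial
	case dial == 0:
		return handshake
	case handshake < dial:
		return handshake
	default:
		return dial
	}
}

// dialBounded dials addr with both limits applied: the handshake timeout
// becomes a context deadline, and that context is passed to
// net.Dialer.DialContext, which also has its own Timeout.
func dialBounded(addr string, handshake, dial time.Duration) (net.Conn, error) {
	ctx := context.Background()
	if handshake != 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, handshake)
		defer cancel()
	}
	d := net.Dialer{Timeout: dial}
	return d.DialContext(ctx, "tcp", addr)
}

func main() {
	// A 20s handshake timeout with a 90s dial timeout leaves only 20s to connect.
	fmt.Println(effectiveTimeout(20*time.Second, 90*time.Second))

	// Dial a loopback listener so the example does not depend on external hosts.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()

	conn, err := dialBounded(ln.Addr().String(), 20*time.Second, 90*time.Second)
	if err != nil {
		fmt.Println("dial:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```

Under these assumptions, a handshake timeout shorter than the dial timeout makes the dial timeout unreachable, which is the concern raised above.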
@efd6,
What I was going for initially was faster failures at the application layer to achieve better program responsiveness. But then I realised that we might want to wait for TCP retries to overcome any intermittent network issues, and setting a handshake timeout lower than the TCP timeout defeats that purpose.
The `timeout` option below actually controls the `net.Dialer.DialContext` timeout, which by default is set to 90s, half of the OS default for the TCP timeout window.
So my question is: should we extend the DialContext and handshake timeouts to mirror the TCP timeout value of the OS?
As for the documentation, I'll update it.
I suggest taking a look at the gorilla code to see how the timeout is propagated.
@efd6, I saw that the gorilla code just creates a context deadline out of the provided timeout in `DialContext`. So in this case, having the handshake value < the dial context timeout doesn't make much sense by default. We can keep them at the same value; then, if users feel they need to change it, they can.
I would not bother adding the handshake timeout since we already provide a mechanism to hand in a dial timeout which achieves the same end.
@efd6, I've addressed the PR suggestions. I was wondering, should we backport this to 8.17? I know the FF has already happened, but this might help out a lot of users currently facing issues.
Type of change
Proposed commit message
Added support for functional TLS config & proxy configs for websockets by implementing a custom transport object and NetDialContext() and leveraging the httpcommon library. Added supporting tests to accompany
Note:
Tried using in-memory certs and keys, but our current httpcommon library always asks for a file.
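For readers unfamiliar with the two settings named in the commit message, the following is a rough standard-library sketch of building a TLS config from a CA file and a forward-proxy function. It does not use httpcommon and is not the code in this PR; `clientSettings`, `newClientSettings`, and `proxyHost` are hypothetical names. Values of these two types are what gorilla's `websocket.Dialer` accepts in its `TLSClientConfig` and `Proxy` fields.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// clientSettings groups the TLS and proxy settings. The type and its field
// names are hypothetical and exist only for this illustration.
type clientSettings struct {
	TLS   *tls.Config
	Proxy func(*http.Request) (*url.URL, error)
}

// newClientSettings loads a PEM CA bundle from caFile and parses proxyURL.
// An empty caFile keeps the system roots; an empty proxyURL falls back to
// http.ProxyFromEnvironment.
func newClientSettings(caFile, proxyURL string) (*clientSettings, error) {
	s := &clientSettings{
		TLS:   &tls.Config{MinVersion: tls.VersionTLS12},
		Proxy: http.ProxyFromEnvironment,
	}
	if caFile != "" {
		pem, err := os.ReadFile(caFile)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("no certificates found in CA file")
		}
		s.TLS.RootCAs = pool
	}
	if proxyURL != "" {
		u, err := url.Parse(proxyURL)
		if err != nil {
			return nil, fmt.Errorf("parse proxy URL: %w", err)
		}
		s.Proxy = http.ProxyURL(u)
	}
	return s, nil
}

// proxyHost reports which proxy host, if any, would be used for target.
func proxyHost(s *clientSettings, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := s.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	// The proxy and target addresses are placeholders.
	s, err := newClientSettings("", "http://proxy.example:3128")
	if err != nil {
		fmt.Println("settings:", err)
		return
	}
	host, err := proxyHost(s, "https://stream.example/ws")
	fmt.Println(host, err)
}
```

The CA bundle is read from a file here, in line with the note that the current library expects file paths rather than in-memory material.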
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Disruptive User Impact
No end-user impact as this is all additive.