Kafka sink silently discards events on connection errors #21031
Comments
Thanks for this report, @frankh! It does sound like there might be a bug with nacking in the sink for connection issues. Can you share:
I'm not able to fully reproduce it. I've set it up locally, and killing Kafka consistently results in a 400 error being returned to the HTTP client.
However, this is still a bug: the response should be a 500 (server error) or 503 (service unavailable), not a 400 (bad request). That means the sink is setting the batch status to Rejected, not Errored.
Edit: It looks like Vector does report that it's returning 400s for these requests, so it is not silently discarding events, which is good news. Oddly, our load balancer metrics at the time don't show a 400 error, but that may be a bug on our end.
I've made a fix to the Kafka sink so these will be correctly reported as 500 errors: #21036
I'm not 100% sure exactly when Rejected vs. Errored should be sent, but based on the HTTP source's response codes I assume Rejected means the event itself is bad, and Errored means the sink failed for reasons unrelated to the event content.
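To make the Rejected vs. Errored distinction from the comment above concrete, here is a minimal sketch of that reading. It is not Vector's actual code: `BatchStatus` here is a local stand-in that only borrows the variant names from this thread, and `SinkFailure`, `status_for`, and `http_code_for` are hypothetical names made up for illustration. The 400 and 500 codes come from the discussion above; the 200 for a delivered batch is an assumption.

```rust
/// Local stand-in for the batch status discussed in this thread.
/// Only the variant names are taken from the discussion; this is not Vector's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BatchStatus {
    Delivered,
    Errored,
    Rejected,
}

/// Hypothetical classification of why a sink write failed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SinkFailure {
    /// The broker was unreachable or the connection dropped.
    Connection,
    /// The event itself is bad and would fail again on retry.
    InvalidEvent,
}

/// Map a write result to a batch status, following the assumption that
/// Rejected is about the event and Errored is about the sink.
fn status_for(result: Result<(), SinkFailure>) -> BatchStatus {
    match result {
        Ok(()) => BatchStatus::Delivered,
        Err(SinkFailure::Connection) => BatchStatus::Errored,
        Err(SinkFailure::InvalidEvent) => BatchStatus::Rejected,
    }
}

/// Map a batch status to the HTTP code seen by the client of the source.
/// 200 for Delivered is an assumption; 500 and 400 follow the thread.
fn http_code_for(status: BatchStatus) -> u16 {
    match status {
        BatchStatus::Delivered => 200,
        BatchStatus::Errored => 500,
        BatchStatus::Rejected => 400,
    }
}
```

Under this reading, a Kafka connection failure would surface to the HTTP client as a 500 rather than a 400, so a client that retries on server errors would resend the batch instead of treating it as a bad request.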
Fixed in #21036.
Problem
I've noticed that the vector_component_discarded_events_total metric shows some events being discarded by our Kafka sink, despite the fact that we have acknowledgements enabled.
From correlating logs with the times the discards happen, it looks like this occurs every time there is an intermittent connection failure with Kafka.
Silent errors on acknowledged sinks are unacceptable and, sadly, a complete blocker for us using Vector. Is there any way to have the sink NACK these events?
Configuration
Version
0.40.0
Debug Output
No response
Example Data
No response
Additional Context
No response
References
Related: