azure_blob failed to upload logs even when acknowledgements are enabled #19741

Open
tgib23 opened this issue Jan 30, 2024 · 1 comment
Labels
sink: azure_blob Anything `azure_blob` sink related type: bug A code related bug.

Comments

tgib23 commented Jan 30, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

I'm testing Vector's azure_blob sink to upload logs from Kafka. I noticed ERROR logs like the ones below, and when they occur, the corresponding logs are missing from Azure Blob. I have acknowledgements enabled, but Vector simply carries on uploading new logs. As a result, the blob /container/topics/audit-logs-v2/year=2023/month=12/day=31/vector-1703996260-6d0af193-9f7d-4671-91e6-be18466bd55a.log.gz was never uploaded, and we lost those logs.

2023-12-31T04:17:58.524831Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector::sinks::util::retries: Unexpected error type; dropping the request. error=failed to execute `reqwest` request internal_log_rate_limit=true
2023-12-31T04:17:58.524869Z  WARN sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector::sinks::util::adaptive_concurrency::controller: Unhandled error response. error=failed to execute `reqwest` request internal_log_rate_limit=true
2023-12-31T04:17:58.532435Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(Error { context: Full(Custom { kind: Io, error: reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("xxx.blob.core.windows.net")), port: None, path: "/container/topics/audit-logs-v2/year=2023/month=12/day=31/vector-1703996260-6d0af193-9f7d-4671-91e6-be18466bd55a.log.gz", query: None, fragment: None }, source: hyper::Error(Io, Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) } }, "failed to execute `reqwest` request") }) request_id=8679 error_type="request_failed" stage="sending" internal_log_rate_limit=true
2023-12-31T04:17:58.532465Z ERROR sink{component_kind="sink" component_id=azure_blob_basic component_type=azure_blob}:request{request_id=8679}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=337172 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true
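
To gauge how much data is affected, the "Events dropped" lines above can be tallied with a small script. The following is a minimal sketch in Python, assuming the internal log format matches the lines shown here; `DROPPED_RE` and `dropped_events` are hypothetical names for this example and are not part of Vector.

```python
import re

# Pattern for the "Events dropped" internal log line shown above.
# It captures the component_id, the intentional flag, and the count.
DROPPED_RE = re.compile(
    r"component_id=(?P<component>\S+).*?"
    r"component_events_dropped: Events dropped "
    r"intentional=(?P<intentional>true|false) count=(?P<count>\d+)"
)


def dropped_events(lines):
    """Sum unintentionally dropped events per component_id."""
    totals = {}
    for line in lines:
        match = DROPPED_RE.search(line)
        if match and match.group("intentional") == "false":
            component = match.group("component")
            totals[component] = totals.get(component, 0) + int(match.group("count"))
    return totals


# The last log line from the report above, copied verbatim.
sample = (
    '2023-12-31T04:17:58.532465Z ERROR sink{component_kind="sink" '
    'component_id=azure_blob_basic component_type=azure_blob}:'
    'request{request_id=8679}: '
    'vector_common::internal_event::component_events_dropped: '
    'Events dropped intentional=false count=337172 '
    'reason="Service call failed. No retries or retries exhausted." '
    'internal_log_rate_limit=true'
)
print(dropped_events([sample]))  # {'azure_blob_basic': 337172}
```

Because these lines carry internal_log_rate_limit=true, some occurrences may be suppressed, so the total is probably best treated as a lower bound.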

I think this happens because Vector mishandles the event: the acknowledgement is still processed even when this ERROR occurs. The end-to-end acknowledgements documentation says:

Additionally, Vector ensures that the batch notifier for an event is always updated, whether or not the event made it to a sink. This ensures that if an event is intentionally dropped (for example, by using a filter transform) or even unintentionally dropped (maybe Vector had a bug, uh oh!), we still update the batch notifier to indicate the processing status of the event.

https://vector.dev/docs/about/under-the-hood/architecture/end-to-end-acknowledgements/
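
The quoted passage says the notifier is always updated, which is not the same as saying every event was delivered. The toy model below, in Python, illustrates that distinction only. It is not Vector's implementation: `EventStatus`, `BatchNotifier`, and their methods are hypothetical names, and how Vector classifies dropped events or what a source does with a failed batch is deliberately not modelled.

```python
from enum import Enum


class EventStatus(Enum):
    DELIVERED = "delivered"
    ERRORED = "errored"


class BatchNotifier:
    """Toy notifier: collects exactly one status per event in a batch."""

    def __init__(self, expected):
        self.expected = expected  # number of events in the batch
        self.statuses = []

    def update(self, status):
        # Called once per event, whether or not the event reached the sink.
        self.statuses.append(status)

    def resolved(self):
        # The batch is resolved once every event has reported a status.
        return len(self.statuses) == self.expected

    def batch_status(self):
        # Returns None until resolved; a single failure fails the batch.
        if not self.resolved():
            return None
        if all(s is EventStatus.DELIVERED for s in self.statuses):
            return EventStatus.DELIVERED
        return EventStatus.ERRORED


notifier = BatchNotifier(expected=2)
notifier.update(EventStatus.DELIVERED)
notifier.update(EventStatus.ERRORED)  # e.g. the upload request failed
print(notifier.resolved(), notifier.batch_status())
```

In this model the batch resolves even though one event failed, and the failure is visible in the reported status. Whether the Kafka source then re-reads the affected messages is the behaviour this issue is questioning.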

Configuration

sources: {
  kafka_sink_basic: {
    auto_offset_reset: "latest",
    type: "kafka",
    bootstrap_servers: bootstrap_url,
    group_id: "vector-basic",
    topics: [ "audit-logs" ],
  },
},
sinks: {
  azure_blob_basic: {
    type: "azure_blob",
    connection_string: "${CON_STRING}",
    endpoint: endpoint_url,
    container_name: "container",
    inputs: [ 'kafka_sink_basic' ],
    blob_prefix: "topics/audit-logs-v2/year=%Y/month=%m/day=%d/vector-",
    batch: {
      max_bytes: config.basicMaxBytes,
      timeout_secs: config.basicTimeoutSecs,
    },
    encoding: {
      codec: "raw_message",
    },
    buffer: {
      when_full: "block",
    },
    acknowledgements: {
      enabled: true,
    },
    compression: "gzip",
  },
},

Version

0.34.1

Debug Output

No response

Example Data

No response

Additional Context

No response

References

No response

@tgib23 tgib23 added the type: bug A code related bug. label Jan 30, 2024
@jszwedko jszwedko added the sink: azure_blob Anything `azure_blob` sink related label Jan 30, 2024
@jszwedko
Member

Related #10870
