-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of custom endpoints in AWS input #41504
Conversation
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
It looks like Definitely a good question on who "Wins" if both are provided |
That's to cover the bucket polling use case, but there's also the subsequent check covering SQS: } else if c.QueueURL != "" && c.AWSConfig.Endpoint != "" {
//nolint:staticcheck // haven't migrated to the new interface yet
o.EndpointResolver = s3.EndpointResolverFromURL(c.AWSConfig.Endpoint)
} and there's a similar check in |
That check was added in this PR though? |
x-pack/filebeat/input/awss3/input.go
Outdated
// For backwards compat: | ||
// If the endpoint does not start with S3, we will use the endpoint resolver to make all SDK requests use the specified endpoint | ||
// If the endpoint does start with S3, we will use the default resolver uses the endpoint field but can replace s3 with the desired service name like sqs | ||
if !strings.HasPrefix(endpointUri.Hostname(), "s3") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should check to see if s3 is in the URL at all -- as we probably want the same behavior (avoid the EndpointResolver for AWS URLs) for vpce endpoints like vpce.s3....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that check was removed then it seems like the only fundamental change in this function is adding Source: awssdk.EndpointSourceCustom,
to the existing resolver. The big question for me is still whether this whole block can be removed in lieu of the existing endpoint handling in the config modifiers (or vice versa).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is what results in a mutable endpoint when using an endpoint with s3
in the leaf of the endpoint url.
I dont think we should remove the check, I do think it should check for s3
in other parts of the domain name thugh.
I think that both svc.EndpointResolverFromURL
and o.endpoint = ...
have the same end effect and the awssdk.EndpointResolverWithOptionsFunc
force returns the exact endpoint
value.
So my guess is as long as those two use-cases remain distinct (when to allow a mutable endpoint hostname), the logic can be combined?
It is added in this PR but it's derived from an identical check in the 8.14 fix https://github.com/elastic/beats/pull/39709/files ( |
Following offline discussion with @strawgate, simplified the custom endpoint resolvers. There were multiple redundant ones, and the only important question was which ones should set an immutable hostname. For now, the compromise that works for most scenarios is: leave the hostname mutable in SQS mode, make it immutable in bucket polling mode. This fixes the major past bugs, and leaves room to make bucket polling mode more permissive in the future as we confirm new working configurations (or to just expose the immutable field directly in the config so users can adjust it if needed). |
I am running a quick test collecting logs from a non-AWS bucket using the aws-s3 input in S3 polling mode. I did something similar in the past; see the zmoog/public-notes#46 public note. However, since Akamai acquired Linode, I want to rerun it. I logged into https://cloud.linode.com/object-storage/buckets and created a new s3-compatible bucket named I used the "IT, Milan" region, so the bucket URL is https://mbranca-test.it-mil-1.linodeobjects.com Relevant docs: I also created access keys to get access/secret keys and the S3 endpoint hostname:
A quick test with the official AWS CLI: $ aws configure --profile linode
AWS Access Key ID [None]: [redacted]
AWS Secret Access Key [None]: [redacted]
Default region name [None]: it-mil-1
Default output format [None]: json
aws s3 ls s3://mbranca-test --profile linode --endpoint-url https://it-mil-1.linodeobjects.com There are no errors and no output (the bucket is empty). $ echo "this is a test" > test.log
$ aws s3 cp test.log s3://mbranca-test --profile linode --endpoint-url https://it-mil-1.linodeobjects.com
upload: ./test.log to s3://mbranca-test/test.log
$ aws s3 ls s3://mbranca-test --profile linode --endpoint-url https://it-mil-1.linodeobjects.com
2024-11-05 22:28:44 15 test.log Let's try with Filebeat using the following config: # filebeat.yml
filebeat.inputs:
- type: aws-s3
access_key_id: [redacted]
secret_access_key: [redacted]
non_aws_bucket_name: mbranca-test
endpoint: 'https://it-mil-1.linodeobjects.com'
number_of_workers: 5
bucket_list_interval: 60s I started Filebeat, and here is the log message from Linode S3-compatible object storage using the aws-s3 input (S3 polling mode) @strawgate, do you have any suggestions for making sure the backport of your original fix is successful? |
/test |
Fix custom endpoint selection in the S3/SQS input (#39718) by porting @strawgate's 8.14 fix (#39709) to main. In addition to the previous fixes, this simplifies the logic for detecting queue region, since the 8.14 version still had some broken cases caused by requiring over-strict endpoint matching, and it was concluded (talking to @strawgate) that there's no advantage to rejecting standard region format from queue URLs just because the endpoint URL is different (if there is a genuine mismatch in the queue and endpoint we'll learn it from the connection attempt, not from `getRegionFromQueueURL`). (cherry picked from commit cf13781)
Fix custom endpoint selection in the S3/SQS input (#39718) by porting @strawgate's 8.14 fix (#39709) to main. In addition to the previous fixes, this simplifies the logic for detecting queue region, since the 8.14 version still had some broken cases caused by requiring over-strict endpoint matching, and it was concluded (talking to @strawgate) that there's no advantage to rejecting standard region format from queue URLs just because the endpoint URL is different (if there is a genuine mismatch in the queue and endpoint we'll learn it from the connection attempt, not from `getRegionFromQueueURL`). (cherry picked from commit cf13781)
I'd run a test with just S3 bucket polling from S3, SQS + S3 from S3, and a test with a third-party S3 compatible solution. |
…#41537) * Fix handling of custom endpoints in AWS input (#41504) Fix custom endpoint selection in the S3/SQS input (#39718) by porting @strawgate's 8.14 fix (#39709) to main. In addition to the previous fixes, this simplifies the logic for detecting queue region, since the 8.14 version still had some broken cases caused by requiring over-strict endpoint matching, and it was concluded (talking to @strawgate) that there's no advantage to rejecting standard region format from queue URLs just because the endpoint URL is different (if there is a genuine mismatch in the queue and endpoint we'll learn it from the connection attempt, not from `getRegionFromQueueURL`). (cherry picked from commit cf13781) * Update CHANGELOG.next.asciidoc auto-merge picked up an unrelated change --------- Co-authored-by: Fae Charlton <fae.charlton@elastic.co>
Fix custom endpoint selection in the S3/SQS input (#39718) by porting @strawgate's 8.14 fix (#39709) to main. In addition to the previous fixes, this simplifies the logic for detecting queue region, since the 8.14 version still had some broken cases caused by requiring over-strict endpoint matching, and it was concluded (talking to @strawgate) that there's no advantage to rejecting standard region format from queue URLs just because the endpoint URL is different (if there is a genuine mismatch in the queue and endpoint we'll learn it from the connection attempt, not from `getRegionFromQueueURL`). (cherry picked from commit cf13781) Co-authored-by: Fae Charlton <fae.charlton@elastic.co>
Fix custom endpoint selection in the S3/SQS input (#39718) by porting @strawgate's 8.14 fix (#39709) to main.
In addition to the previous fixes, this simplifies the logic for detecting queue region, since the 8.14 version still had some broken cases caused by requiring over-strict endpoint matching, and it was concluded (talking to @strawgate) that there's no advantage to rejecting standard region format from queue URLs just because the endpoint URL is different (if there is a genuine mismatch in the queue and endpoint we'll learn it from the connection attempt, not from
getRegionFromQueueURL
).Checklist
I have made corresponding changes to the documentationI have made corresponding change to the default configuration filesCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Related issues