allow for json/ndjson content type with charset #32767

Merged 6 commits on Aug 25, 2022
Changes from 4 commits
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -59,6 +59,7 @@ https://github.com/elastic/beats/compare/v8.2.0\...main[Check the HEAD diff]
- Fix handling of Checkpoint event for R81. {issue}32380[32380] {pull}32458[32458]
- Fix a hang on `apt-get update` stage in packaging. {pull}32580[32580]
- gcp-pubsub input: Restart Pub/Sub client on all errors. {issue}32550[32550] {pull}32712[32712]
+ - Fix not parsing as json when `json` and `ndjson` content types have charset information in `aws-s3` input {pull}32767[32767]

*Heartbeat*

19 changes: 12 additions & 7 deletions x-pack/filebeat/input/awss3/input_integration_test.go
@@ -18,6 +18,7 @@ import (
"path"
"path/filepath"
"runtime"
+ "strings"
"testing"
"time"

@@ -88,7 +89,6 @@ file_selectors:
-
regex: 'events-array.json$'
expand_event_list_from_field: Events
- content_type: application/json
include_s3_metadata:
- last-modified
- x-amz-version-id
@@ -97,7 +97,6 @@ file_selectors:
- Content-Type
-
regex: '\.(?:nd)?json(\.gz)?$'
- content_type: application/json
-
regex: 'multiline.txt$'
parsers:
@@ -117,7 +116,6 @@ file_selectors:
-
regex: 'events-array.json$'
expand_event_list_from_field: Events
- content_type: application/json
include_s3_metadata:
- last-modified
- x-amz-version-id
@@ -126,7 +124,6 @@ file_selectors:
- Content-Type
-
regex: '\.(?:nd)?json(\.gz)?$'
- content_type: application/json
-
regex: 'multiline.txt$'
parsers:
@@ -328,11 +325,19 @@ func uploadS3TestFiles(t *testing.T, region, bucket string, filenames ...string)
t.Fatalf("Failed to open file %q, %v", filename, err)
}

+ contentType := ""
+ if strings.HasSuffix(filename, "ndjson") || strings.HasSuffix(filename, "ndjson.gz") {
+ contentType = "let-CI-fail-" + contentTypeNDJSON + "; charset=UTF-8"
Contributor

what is the reason for this let-CI-fail?

Author

@aspacca, Aug 23, 2022

I wanted to assess whether the integration tests run in CI by forcing a failure.

They don't: CI is green anyway.

(With `let-CI-fail-` as a prefix, the check at https://github.com/elastic/beats/pull/32767/files#diff-f345fd6a1f5ea9523117d4ead2e5f1d13fb82eb1c65a089fd34fcdd514916a96R156 will be false.)

+ } else if strings.HasSuffix(filename, "json") || strings.HasSuffix(filename, "json.gz") {
+ contentType = "let-CI-fail-" + contentTypeJSON + "; charset=UTF-8"
+ }

// Upload the file to S3.
result, err := uploader.Upload(context.Background(), &s3.PutObjectInput{
- Bucket: aws.String(bucket),
- Key: aws.String(filepath.Base(filename)),
- Body: bytes.NewReader(data),
+ Bucket: aws.String(bucket),
+ Key: aws.String(filepath.Base(filename)),
+ Body: bytes.NewReader(data),
+ ContentType: aws.String(contentType),
})
if err != nil {
t.Fatalf("Failed to upload file %q: %v", filename, err)
2 changes: 1 addition & 1 deletion x-pack/filebeat/input/awss3/s3_objects.go
@@ -153,7 +153,7 @@ func (p *s3ObjectProcessor) ProcessS3Object() error {

// Process object content stream.
switch {
- case contentType == contentTypeJSON || contentType == contentTypeNDJSON:
+ case strings.HasPrefix(contentType, contentTypeJSON) || strings.HasPrefix(contentType, contentTypeNDJSON):
err = p.readJSON(reader)
default:
err = p.readFile(reader)
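
To illustrate the change in `s3_objects.go`, here is a minimal standalone sketch of how a prefix match treats a `Content-Type` value that carries a charset parameter, and why the `let-CI-fail-` prefix used in the integration test defeats it. The constant values are assumptions (the diff references `contentTypeJSON` and `contentTypeNDJSON` without showing their definitions), and both helper functions are hypothetical names. `isJSONByMediaType` is not part of this PR; it is a possible stricter alternative built on the standard library's `mime.ParseMediaType`, shown only for comparison.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// Assumed values: the diff uses these constants but does not show
// where they are defined.
const (
	contentTypeJSON   = "application/json"
	contentTypeNDJSON = "application/x-ndjson"
)

// isJSONByPrefix mirrors the prefix check introduced in the diff.
func isJSONByPrefix(contentType string) bool {
	return strings.HasPrefix(contentType, contentTypeJSON) ||
		strings.HasPrefix(contentType, contentTypeNDJSON)
}

// isJSONByMediaType is a hypothetical alternative, not taken from the PR.
// It parses the header value and compares only the media type, ignoring
// parameters such as charset.
func isJSONByMediaType(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mediaType == contentTypeJSON || mediaType == contentTypeNDJSON
}

func main() {
	samples := []string{
		"application/json",
		"application/json; charset=UTF-8",
		"application/x-ndjson; charset=UTF-8",
		"let-CI-fail-application/json; charset=UTF-8",
		"application/jsonp",
		"text/plain",
	}
	for _, ct := range samples {
		fmt.Printf("%q prefix=%t mediatype=%t\n",
			ct, isJSONByPrefix(ct), isJSONByMediaType(ct))
	}
}
```

The two approaches agree on every sample except `application/jsonp`: the prefix check accepts any value that merely begins with the JSON media type, while the parsed comparison requires an exact media type. Whether that difference matters for objects stored in S3 is a judgment call for the maintainers.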