[Filebeat] Fix s3 input parsing json file without expand_event_list_from_field #19962

Merged: 3 commits, Jul 22, 2020

@kaiyan-sheng kaiyan-sheng commented Jul 15, 2020

What does this PR do?

This PR fixes the s3 input when parsing JSON files without the expand_event_list_from_field config parameter.
During testing I found that the offset was not working properly for s3 input events; this PR also fixes that.
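
The description does not say how the offset is computed. As an illustration only, assuming the offset is meant to be the byte position at which each event starts in the decompressed object, the bookkeeping could look like the Python sketch below. The function name events_with_offsets is hypothetical and this is not the Go code changed in this PR.

```python
def events_with_offsets(data: bytes):
    """Yield (offset, line) pairs for newline-delimited data.

    Assumption: offset is the byte position where the line starts.
    """
    offset = 0
    for raw in data.splitlines(keepends=True):
        line = raw.rstrip(b"\r\n")
        if line:  # skip blank lines but still count their bytes
            yield offset, line.decode("utf-8")
        offset += len(raw)


sample = (
    b'{"id": "0001"}\n'
    b'{"id": "0002"}\n'
    b'{"id": "0003"}\n'
)
offsets = [off for off, _ in events_with_offsets(sample)]
```

The key point is that the counter advances by the full length of each raw line, including its line ending, so successive events get distinct, increasing offsets.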

Why is it important?

Some logs, such as Cloudflare's, are newline-delimited JSON with one object per line:

{"id": "0001", "hey": "there", "how": {"are": "you"}}
{"id": "0002", "hope": "you", "are": {"doing": "well"}}
{"id": "0003", "I": "am", "doing": {"O": "K"}}

instead of a single object with a top-level list field, such as Records:

{
  "Records": [
    {
      "eventVersion": "1.05",
      "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROAJ3UFBPIHDEKKCOI2G:AssumeRoleSession",
        "arn": "arn:aws:sts::123456789012:assumed-role/CloudHealth/AssumeRoleSession",
        "accountId": "123456789012"
      }
    }
  ]
}
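
The difference between the two shapes above can be illustrated with a short Python sketch. It is not the Go implementation from this PR; split_events and its expand_field argument are hypothetical names that only mirror the role of expand_event_list_from_field. When no field is given, the text is treated as a stream of concatenated JSON objects and decoded one value at a time.

```python
import json


def split_events(text, expand_field=None):
    """Return the JSON events contained in text.

    expand_field set:   text is one object; events are the list under that key.
    expand_field unset: text is a sequence of JSON objects (for example one
                        per line), decoded one after another.
    """
    if expand_field is not None:
        return list(json.loads(text)[expand_field])

    decoder = json.JSONDecoder()
    events, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        events.append(obj)
    return events


jsonl = (
    '{"id": "0001", "hey": "there", "how": {"are": "you"}}\n'
    '{"id": "0002", "hope": "you", "are": {"doing": "well"}}\n'
    '{"id": "0003", "I": "am", "doing": {"O": "K"}}\n'
)
wrapped = '{"Records": [{"eventVersion": "1.05"}]}'

line_events = split_events(jsonl)
record_events = split_events(wrapped, expand_field="Records")
```

Calling json.loads on the whole jsonl string would fail with an "Extra data" error, which is the kind of failure a per-object decode loop avoids.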

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Create a test log file:

$ cat s3filebeat.log 
{"id": "0001", "hey": "there", "how": {"are": "you"}}
{"id": "0002", "hope": "you", "are": {"doing": "well"}}
{"id": "0003", "I": "am", "doing": {"O": "K"}}

Gzip the log file:

gzip s3filebeat.log 
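
For scripted test setups, the two steps above can also be sketched in Python using only the standard library. The helper names write_gzipped_jsonl and read_gzipped_jsonl are hypothetical, and the file is written to a temporary directory rather than the working directory.

```python
import gzip
import json
import tempfile
from pathlib import Path

RECORDS = [
    {"id": "0001", "hey": "there", "how": {"are": "you"}},
    {"id": "0002", "hope": "you", "are": {"doing": "well"}},
    {"id": "0003", "I": "am", "doing": {"O": "K"}},
]


def write_gzipped_jsonl(path, records):
    """Write one JSON object per line into a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_gzipped_jsonl(path):
    """Read the file back, parsing each non-empty line as JSON."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "s3filebeat.log.gz"
    write_gzipped_jsonl(target, RECORDS)
    round_tripped = read_gzipped_jsonl(target)
```

Reading the file back before uploading is a cheap check that the fixture really contains three separate JSON objects.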

Upload the file to an S3 bucket, setting the content-encoding and content-type properties:

$ aws --profile PROFILE s3api put-object --body ./s3filebeat.log.gz --bucket lucas-test-filebeat-s3 --content-encoding gzip --content-type application/json --key s3filebeat.log.gz
{
    "ETag": "\"955ed9f01b6ee38dbba167daab9ebbbb\""
}

Alternatively, upload the file to the S3 bucket manually and set the properties there.

Set the Filebeat input to s3 in filebeat.yml:

filebeat.inputs:
- type: s3
  queue_url: https://sqs.us-east-1.amazonaws.com/428152502467/test-fb-ks
  credential_profile_name: elastic-beats

Run ./filebeat -e

Related issues

botelastic bot added the needs_team label Jul 15, 2020
kaiyan-sheng self-assigned this Jul 15, 2020
kaiyan-sheng added the Team:Platforms label Jul 15, 2020
@elasticmachine

Pinging @elastic/integrations-platforms (Team:Platforms)

botelastic bot removed the needs_team label Jul 15, 2020
kaiyan-sheng added the in progress label Jul 15, 2020

lag13 commented Jul 17, 2020

I just tested this and it works for my use case! Thanks again for working on this @kaiyan-sheng

elasticmachine commented Jul 21, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Pull request #19962 updated]

  • Start Time: 2020-07-21T17:46:10.899+0000

  • Duration: 51 min 24 sec

Test stats 🧪

  • Failed: 0

  • Passed: 2434

  • Skipped: 385

  • Total: 2819

kaiyan-sheng added the [zube]: In Review, review, and needs_backport labels and removed the [zube]: In Progress and in progress labels Jul 21, 2020
zube bot removed the needs_backport label Jul 21, 2020
kaiyan-sheng merged commit 2bf84dd into elastic:master Jul 22, 2020
kaiyan-sheng deleted the s3_json branch July 22, 2020 13:11
kaiyan-sheng added a commit that referenced this pull request Jul 22, 2020
…without expand_event_list_from_field (#20135)

* [Filebeat] Fix s3 input parsing json file without expand_event_list_from_field (#19962)

* Fix s3 input parsing json file without expand_event_list_from_field

(cherry picked from commit 2bf84dd)

* update changelog
kaiyan-sheng added a commit that referenced this pull request Jul 22, 2020
…rom_field (#19962) (#20134)

* Fix s3 input parsing json file without expand_event_list_from_field

(cherry picked from commit 2bf84dd)
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
…rom_field (elastic#19962)

* Fix s3 input parsing json file without expand_event_list_from_field
zube bot removed the [zube]: Done label Oct 21, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…n file without expand_event_list_from_field (elastic#20135)

* [Filebeat] Fix s3 input parsing json file without expand_event_list_from_field (elastic#19962)

* Fix s3 input parsing json file without expand_event_list_from_field

(cherry picked from commit 9cf6b12)

* update changelog
Labels
review, Team:Platforms, v7.9.0, v7.10.0

Successfully merging this pull request may close these issues.

Filebeat S3 input plugin cannot parse jsonl file with content-type set as application/json