Migrate S3 Input to Filebeat Input V2 #20005

Merged
merged 17 commits into from
Oct 2, 2020

@kaiyan-sheng (Contributor) commented Jul 16, 2020

Note: This PR is based on #19756 to finish migrating s3 input to use Filebeat input V2.

  • Refactoring

What does this PR do?

Move s3 input to input v2 API.

This change splits the internal s3Input into s3Input and s3Collector. The s3Input is responsible for configuration only.
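
The split described above might look roughly like the following. This is only a sketch under assumptions: the `config` fields, `newInput`, `createCollector`, and `regionFromQueueURL` are hypothetical names chosen for illustration and are not taken from the Beats code base. The idea shown is that the input type validates and holds configuration, while the collector holds whatever is needed at run time.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// config holds user-facing settings. The field is illustrative only,
// not the actual Beats config struct.
type config struct {
	QueueURL string
}

// s3Input is responsible for configuration only.
type s3Input struct {
	cfg config
}

// s3Collector carries the runtime state derived from the configuration.
type s3Collector struct {
	cfg    config
	region string
}

// newInput validates the configuration and stores it.
func newInput(cfg config) (*s3Input, error) {
	if cfg.QueueURL == "" {
		return nil, errors.New("queue_url is required")
	}
	return &s3Input{cfg: cfg}, nil
}

// createCollector builds a ready-to-run collector from the stored config.
func (in *s3Input) createCollector() (*s3Collector, error) {
	region, err := regionFromQueueURL(in.cfg.QueueURL)
	if err != nil {
		return nil, err
	}
	return &s3Collector{cfg: in.cfg, region: region}, nil
}

// regionFromQueueURL assumes a host of the form sqs.<region>.amazonaws.com.
func regionFromQueueURL(queueURL string) (string, error) {
	u, err := url.Parse(queueURL)
	if err != nil {
		return "", err
	}
	parts := strings.Split(u.Hostname(), ".")
	if len(parts) < 4 || parts[0] != "sqs" {
		return "", fmt.Errorf("unexpected SQS host %q", u.Hostname())
	}
	return parts[1], nil
}

func main() {
	in, err := newInput(config{QueueURL: "https://sqs.us-east-1.amazonaws.com/428152502467/test-fb-ks"})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	c, err := in.createCollector()
	if err != nil {
		fmt.Println("collector error:", err)
		return
	}
	fmt.Println("collector region:", c.region)
}
```

Keeping validation in the input and runtime state in the collector means a test can exercise either half without touching the other.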

The unit tests have been modified, but the integration tests need some more work.

Why is it important?

Update to v2 input API.

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Manual testing:

  1. Create an SQS queue and an S3 bucket in the same AWS region using the Amazon SQS console.
  2. Configure the SQS queue by replacing the access policy attached to the queue with the following queue policy:
{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": [
        "SQS:SendMessage"
      ],
      "Resource": "<SQS-queue-ARN>",
      "Condition": {
        "ArnLike": { "aws:SourceArn": "arn:aws:s3:*:*:<bucket-name>" }
      }
    }
  ]
}
  3. Using the Amazon S3 console, add a notification configuration requesting Amazon S3 to publish events of the s3:ObjectCreated:* type to your Amazon SQS queue.
  4. Upload an object to the S3 bucket and verify the event notification in the Amazon SQS console.
  5. Change filebeat.yml:
filebeat.inputs:
- type: s3
  queue_url: https://sqs.us-east-1.amazonaws.com/428152502467/test-fb-ks
  credential_profile_name: elastic-beats
  6. Start Filebeat:
./filebeat -e

@botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) Jul 16, 2020
@kaiyan-sheng requested a review from urso July 16, 2020 19:58
@kaiyan-sheng self-assigned this Jul 16, 2020
@kaiyan-sheng added the Team:Platforms label (Label for the Integrations - Platforms team) Jul 16, 2020
@botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) Jul 16, 2020
@elasticmachine (Collaborator) commented Jul 17, 2020

💔 Build Failed

Build stats

  • Build Cause: [Pull request #20005 updated]

  • Start Time: 2020-10-02T18:25:01.784+0000

  • Duration: 69 min 27 sec

Test stats 🧪

Test Results
Failed 0
Passed 1370
Skipped 130
Total 1500

Steps errors

  • Name: Extract

    • Description: tar -xpf source.tgz

    • Duration: 1 min 32 sec

    • Start Time: 2020-10-02T19:01:43.146+0000

  • Name: Error signal

    • Description: untar: step failed with error script returned exit code 1

    • Duration: 0 min 0 sec

    • Start Time: 2020-10-02T19:02:15.615+0000

Log output (last 100 lines)

[2020-10-02T19:32:41.773Z] Unable to find image 'alpine:3.4' locally
[2020-10-02T19:32:42.348Z] 3.4: Pulling from library/alpine
[2020-10-02T19:32:42.611Z] c1e54eec4b57: Pulling fs layer
[2020-10-02T19:32:42.874Z] c1e54eec4b57: Verifying Checksum
[2020-10-02T19:32:42.874Z] c1e54eec4b57: Download complete
[2020-10-02T19:32:43.138Z] c1e54eec4b57: Pull complete
[2020-10-02T19:32:43.138Z] Digest: sha256:b733d4a32c4da6a00a84df2ca32791bb03df95400243648d8c539e7b4cce329c
[2020-10-02T19:32:43.138Z] Status: Downloaded newer image for alpine:3.4
[2020-10-02T19:32:45.018Z] + python .ci/scripts/pre_archive_test.py
[2020-10-02T19:32:47.607Z] Copy ./x-pack/filebeat/build into build/x-pack/filebeat/build
[2020-10-02T19:32:47.621Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20005/src/github.com/elastic/beats/build
[2020-10-02T19:32:47.634Z] WARNING: Unknown parameter(s) found for class type 'hudson.tasks.junit.pipeline.JUnitResultsStep': id,stashedTestReports
[2020-10-02T19:32:47.638Z] Recording test results
[2020-10-02T19:32:49.763Z] Stashed 4 file(s)
[2020-10-02T19:32:49.774Z] Archiving artifacts
[2020-10-02T19:32:50.589Z] + python .ci/scripts/search_system_tests.py
[2020-10-02T19:32:50.614Z] [INFO] system-tests='build/x-pack/filebeat/build/system-tests'. If no empty then let's create a tarball
[2020-10-02T19:32:50.961Z] + tar --version
[2020-10-02T19:32:51.305Z] + tar --exclude=x-pack-filebeat--system-tests-linux.tgz -czf x-pack-filebeat--system-tests-linux.tgz build/x-pack/filebeat/build/system-tests
[2020-10-02T19:33:13.494Z] Archiving artifacts
[2020-10-02T19:33:26.173Z] Client: Docker Engine - Community
[2020-10-02T19:33:26.174Z]  Version:           19.03.13
[2020-10-02T19:33:26.174Z]  API version:       1.40
[2020-10-02T19:33:26.174Z]  Go version:        go1.13.15
[2020-10-02T19:33:26.174Z]  Git commit:        4484c46d9d
[2020-10-02T19:33:26.174Z]  Built:             Wed Sep 16 17:02:36 2020
[2020-10-02T19:33:26.174Z]  OS/Arch:           linux/amd64
[2020-10-02T19:33:26.174Z]  Experimental:      false
[2020-10-02T19:33:26.174Z] 
[2020-10-02T19:33:26.174Z] Server: Docker Engine - Community
[2020-10-02T19:33:26.174Z]  Engine:
[2020-10-02T19:33:26.174Z]   Version:          19.03.13
[2020-10-02T19:33:26.174Z]   API version:      1.40 (minimum version 1.12)
[2020-10-02T19:33:26.174Z]   Go version:       go1.13.15
[2020-10-02T19:33:26.174Z]   Git commit:       4484c46d9d
[2020-10-02T19:33:26.174Z]   Built:            Wed Sep 16 17:01:06 2020
[2020-10-02T19:33:26.174Z]   OS/Arch:          linux/amd64
[2020-10-02T19:33:26.174Z]   Experimental:     false
[2020-10-02T19:33:26.174Z]  containerd:
[2020-10-02T19:33:26.174Z]   Version:          1.3.7
[2020-10-02T19:33:26.174Z]   GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
[2020-10-02T19:33:26.174Z]  runc:
[2020-10-02T19:33:26.174Z]   Version:          1.0.0-rc10
[2020-10-02T19:33:26.174Z]   GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
[2020-10-02T19:33:26.174Z]  docker-init:
[2020-10-02T19:33:26.174Z]   Version:          0.18.0
[2020-10-02T19:33:26.174Z]   GitCommit:        fec3683
[2020-10-02T19:33:38.883Z] [INFO] unstashV2: JOB_GCS_BUCKET is set. bucket param got precedency instead.
[2020-10-02T19:33:38.899Z] [INFO] unstashV2: JOB_GCS_CREDENTIALS is set. credentialsId param got precedency instead.
[2020-10-02T19:33:38.967Z] [Google Cloud Storage Plugin] Found 1 files to download from pattern: gs://beats-ci-temp/Beats/beats/PR-20005-13/source/source.tgz
[2020-10-02T19:33:38.986Z] [Google Cloud Storage Plugin] Downloading: Beats/beats/PR-20005-13/source/source.tgz to local path: /var/lib/jenkins/workspace/Beats_beats_PR-20005/source.tgz
[2020-10-02T19:33:52.360Z] + tar --version
[2020-10-02T19:33:52.672Z] + tar -xpf source.tgz
[2020-10-02T19:34:05.286Z] + rm source.tgz
[2020-10-02T19:34:05.452Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20005/src/github.com/elastic/beats
[2020-10-02T19:34:05.461Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20005/src/github.com/elastic/beats/uncategorized-1601665072727
[2020-10-02T19:34:05.527Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20005/src/github.com/elastic/beats/x-pack-filebeat-build-1601667169280
[2020-10-02T19:34:05.887Z] + cat
[2020-10-02T19:34:05.888Z] + /usr/local/bin/runbld ./runbld-test-reports --job-name elastic+beats+pull-request
[2020-10-02T19:34:05.888Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-10-02T19:34:14.073Z] runbld>>> runbld started
[2020-10-02T19:34:14.073Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-10-02T19:34:15.471Z] runbld>>> The following profiles matched the job 'elastic+beats+pull-request' in order of occurrence in the config (last value wins).
[2020-10-02T19:34:15.471Z] runbld>>> Matches in the system config:
[2020-10-02T19:34:15.471Z] runbld>>> - Matched ^elastic\+beats
[2020-10-02T19:34:15.471Z] runbld>>> - Matched ^elastic\+beats\+pull-request
[2020-10-02T19:34:16.874Z] runbld>>> Debug logging enabled.
[2020-10-02T19:34:16.874Z] runbld>>> Storing result
[2020-10-02T19:34:16.874Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-10-02T19:34:16.874Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20201002193416-C89F6F31
[2020-10-02T19:34:16.874Z] runbld>>> Adding system facts.
[2020-10-02T19:34:18.273Z] runbld>>> Adding vcs info for the latest commit:  170133f68419cd61158129db8b545f99955249d8
[2020-10-02T19:34:18.273Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-10-02T19:34:18.273Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-10-02T19:34:18.537Z] Processing JUnit reports with runbld...
[2020-10-02T19:34:18.537Z] + echo 'Processing JUnit reports with runbld...'
[2020-10-02T19:34:18.800Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-10-02T19:34:18.800Z] runbld>>> DURATION: 44ms
[2020-10-02T19:34:18.800Z] runbld>>> STDOUT: 40 bytes
[2020-10-02T19:34:18.800Z] runbld>>> STDERR: 49 bytes
[2020-10-02T19:34:18.800Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-10-02T19:34:18.800Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-20005
[2020-10-02T19:34:19.753Z] runbld>>> Storing build metadata: 
[2020-10-02T19:34:19.753Z] runbld>>> Adding test report.
[2020-10-02T19:34:19.753Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-20005/src/github.com/elastic/beats
[2020-10-02T19:34:20.705Z] runbld>>> Found 4 test output files
[2020-10-02T19:34:20.968Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 1500 Skipped: 122
[2020-10-02T19:34:21.233Z] runbld>>> Storing result
[2020-10-02T19:34:21.233Z] runbld>>> FAILURES: 0
[2020-10-02T19:34:21.497Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-10-02T19:34:21.497Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20201002193416-C89F6F31
[2020-10-02T19:34:21.760Z] runbld>>> Email notification disabled by environment variable.
[2020-10-02T19:34:21.760Z] runbld>>> Slack notification disabled by environment variable.
[2020-10-02T19:34:27.866Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-20005
[2020-10-02T19:34:28.107Z] [INFO] getVaultSecret: Getting secrets
[2020-10-02T19:34:28.196Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-10-02T19:34:28.814Z] + chmod 755 generate-build-data.sh
[2020-10-02T19:34:28.814Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20005/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20005/runs/13 FAILURE 4166771
[2020-10-02T19:34:28.814Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20005/runs/13/steps/?limit=10000 -o steps-info.json
[2020-10-02T19:34:29.725Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20005/runs/13/tests/?status=FAILED -o tests-errors.json

@kaiyan-sheng changed the title from "Fix s3_integration_test.go in Filebeat s3 input" to "Migrate S3 Input to Filebeat Input V2" Jul 17, 2020
@kaiyan-sheng marked this pull request as ready for review July 17, 2020 23:09
@elasticmachine (Collaborator)

Pinging @elastic/integrations-platforms (Team:Platforms)

svcSQS := sqs.New(awsConfig)
input.sqs = svcSQS

Having to modify the object under test for the test is kind of a red flag to me. It should be possible to create and provide a fully initialized object. Is there something missing in the construction of the collector that forces us to do this here?

kaiyan-sheng (Contributor, Author) replied:

@urso I moved the code for creating a fully initialized object outside of the runTest function. Please let me know if this looks less suspicious to you now :)
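
One common way to address this kind of review concern is to pass dependencies into a constructor, so the test receives a fully initialized object and never assigns fields afterwards. The sketch below is only an illustration of that pattern, not the code in this PR: `sqsReceiver`, `newCollector`, `poll`, and `fakeSQS` are hypothetical names, and the interface is far smaller than a real SQS client.

```go
package main

import (
	"errors"
	"fmt"
)

// sqsReceiver is a hypothetical, minimal stand-in for the SQS client.
type sqsReceiver interface {
	ReceiveMessages() ([]string, error)
}

// collector holds everything it needs from construction time on.
type collector struct {
	queueURL string
	sqs      sqsReceiver
}

// newCollector returns a fully initialized collector or an error,
// so callers (including tests) never set fields after the fact.
func newCollector(queueURL string, client sqsReceiver) (*collector, error) {
	if queueURL == "" {
		return nil, errors.New("queue URL must not be empty")
	}
	if client == nil {
		return nil, errors.New("SQS client must not be nil")
	}
	return &collector{queueURL: queueURL, sqs: client}, nil
}

// poll asks the client for messages and reports how many arrived.
func (c *collector) poll() (int, error) {
	msgs, err := c.sqs.ReceiveMessages()
	if err != nil {
		return 0, fmt.Errorf("receiving from %s: %w", c.queueURL, err)
	}
	return len(msgs), nil
}

// fakeSQS is a test double that returns canned messages.
type fakeSQS struct {
	messages []string
}

func (f fakeSQS) ReceiveMessages() ([]string, error) {
	return f.messages, nil
}

func main() {
	c, err := newCollector(
		"https://sqs.us-east-1.amazonaws.com/428152502467/test-fb-ks",
		fakeSQS{messages: []string{"event-1", "event-2"}},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	n, err := c.poll()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("messages received:", n)
}
```

With this shape, an integration test would pass a real client and a unit test a fake, and both go through the same constructor.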

@urso mentioned this pull request Jul 21, 2020
2 tasks
@kaiyan-sheng added the needs_backport label (PR is waiting to be backported to other branches.) Sep 29, 2020
@kaiyan-sheng added the test-plan label (Add this PR to be manual test plan) Oct 2, 2020
@kaiyan-sheng merged commit 7e36f5c into elastic:master Oct 2, 2020
@kaiyan-sheng deleted the filebeat-input-v2-s3 branch October 2, 2020 21:08
@kaiyan-sheng added v7.10.0 and removed needs_backport (PR is waiting to be backported to other branches.) labels Oct 2, 2020
@andresrc added the test-plan-added label (This PR has been added to the test plan) Oct 3, 2020
v1v added a commit to v1v/beats that referenced this pull request Oct 5, 2020
…-matches-found

* upstream/master: (21 commits)
  Skip filestream flaky tests (elastic#21490)
  Ignore unsupported metrics in the azure module (elastic#21486)
  Do not run symlink tests on Windows (elastic#21472)
  Map `cloud.account.id` to azure sub id (elastic#21483)
  Add support for app_state metricset (elastic#20639)
  Include original error when metricbeat fails to connect with Kafka (elastic#21484)
  Prompt only when agent is already enrolled (elastic#21473)
  Fix leftover delpoyment example (elastic#21474)
  Bump version to ECS 1.6 in modules without ECS updates (elastic#21455)
  Clarify input type configuration options (elastic#19284)
  Increase index pattern size check to 10MiB (elastic#21487)
  Migrate S3 Input to Filebeat Input V2 (elastic#20005)
  [libbeat] Add configurable exponential backoff for disk queue write errors (elastic#21493)
  Revert "Revert "[JJBB] Set shallow cloning to 10 (elastic#21409)" (elastic#21447)" (elastic#21467)
  Fix format of debug messages in tlscommon (elastic#21482)
  [CI] Change x-pack/auditbeat build events (comments, labels) (elastic#21463)
  [CI] changeset from elastic#20603 was not added to CI2.0 (elastic#21464)
  Add new log file reader for filestream input (elastic#21450)
  [CI] Send slack message with build status (elastic#21428)
  Remove duplicated sources url in dependencies report (elastic#21462)
  ...
kaiyan-sheng added a commit that referenced this pull request Oct 5, 2020
*moving s3 input to v2 input API
Co-authored-by: urso <steffen.siering@elastic.co>

(cherry picked from commit 7e36f5c)
v1v added a commit to v1v/beats that referenced this pull request Oct 5, 2020
* upstream/master: (26 commits)
  [Ingest Manager] Send updating state (elastic#21461)
  [Filebeat][New Fileset] Cisco Umbrella support (elastic#21504)
  [Ingest Manager] Download asc from artifact store specified in spec (elastic#21488)
  Implementation of fileProspector (elastic#21479)
  [Metricbeat] Add latency config option into aws module (elastic#20875)
  Skip filestream flaky tests (elastic#21490)
  Ignore unsupported metrics in the azure module (elastic#21486)
  Do not run symlink tests on Windows (elastic#21472)
  Map `cloud.account.id` to azure sub id (elastic#21483)
  Add support for app_state metricset (elastic#20639)
  Include original error when metricbeat fails to connect with Kafka (elastic#21484)
  Prompt only when agent is already enrolled (elastic#21473)
  Fix leftover delpoyment example (elastic#21474)
  Bump version to ECS 1.6 in modules without ECS updates (elastic#21455)
  Clarify input type configuration options (elastic#19284)
  Increase index pattern size check to 10MiB (elastic#21487)
  Migrate S3 Input to Filebeat Input V2 (elastic#20005)
  [libbeat] Add configurable exponential backoff for disk queue write errors (elastic#21493)
  Revert "Revert "[JJBB] Set shallow cloning to 10 (elastic#21409)" (elastic#21447)" (elastic#21467)
  Fix format of debug messages in tlscommon (elastic#21482)
  ...
@zube zube bot removed the [zube]: Done label Jan 1, 2021
@kaiyan-sheng kaiyan-sheng mentioned this pull request Jan 11, 2022
27 tasks
Labels
Team:Platforms Label for the Integrations - Platforms team test-plan Add this PR to be manual test plan test-plan-added This PR has been added to the test plan v7.10.0
5 participants