[Filebeat] httpjson - request tracer fails with long URL #35157

Closed
andrewkroh opened this issue Apr 20, 2023 · 7 comments
Labels
discuss (Issue needs further discussion.), Filebeat, Team:Security-Service Integrations (Security Service Integrations Team)

Comments

@andrewkroh
Member

When the request tracer feature is used with input ID substitution, the resulting file name can become longer than the maximum allowed file name length.

An input ID can be set to any value by a user, and for stateful inputs (e.g. those with a cursor) Filebeat also appends the URL. So the substituted ID value can get very long.

If the file name exceeds the maximum file name length, you end up with no tracer logs and an error from the logger that is written directly to stderr (bypassing the Beat logger).

This is probably a rare edge case, but given that users might not be able to control the URL (and Filebeat forces it into the input ID) I think consideration should be given to guarding against this problem.

Observed error:

2023-04-20 15:55:54.014538 -0400 EDT m=+1.159689793 write error: error getting log file info: stat ../../logs/httpjson/http-request-trace-httpjson-foo-eb837d4c-5ced-45ed-b05c-de658135e248_https_api.io_v1_reporting_issues_?page=1&perPage=10&sortBy=issueTitle&order=asc&groupBy=issue&key=aHR0cC1yZXF1ZXN0LXRyYWNlLWh0dHBqc29uLWZvby1lYjgzN2Q0Yy01Y2VkLTQ1ZWQtYjA1Yy1kZTY1ODEzNWUyNDhfaHR0cHNfYXBpLnNueWsuaW9fdjFfcmVwb3J0aW5nX2lzc3Vlc18_cGFnZT0xJnBlclBhZ2U9MTAmc29ydEJ5PWlzc3VlVGl0bGUmb3JkZXI9YXNjJmdyb3VwQnk9aXNzdWUubmRqc29u.ndjson: file name too long

Config used:

filebeat.inputs:
  - type: httpjson
    id: httpjson-foo-eb837d4c-5ced-45ed-b05c-de658135e248
    config_version: 2
    publisher_pipeline.disable_host: true
    interval: 1m
    request.url: https://api.io/v1/reporting/issues/?page=1&perPage=10&sortBy=issueTitle&order=asc&groupBy=issue&key=aHR0cC1yZXF1ZXN0LXRyYWNlLWh0dHBqc29uLWZvby1lYjgzN2Q0Yy01Y2VkLTQ1ZWQtYjA1Yy1kZTY1ODEzNWUyNDhfaHR0cHNfYXBpLnNueWsuaW9fdjFfcmVwb3J0aW5nX2lzc3Vlc18/cGFnZT0xJnBlclBhZ2U9MTAmc29ydEJ5PWlzc3VlVGl0bGUmb3JkZXI9YXNjJmdyb3VwQnk9aXNzdWUubmRqc29u
    response.decode_as: application/json
    cursor:
      last_cursor:
        value: '[[.last_response.body]]'
    request.tracer.filename: ../../logs/httpjson/http-request-trace-*.ndjson

output.console.pretty: true

@andrewkroh added the discuss, Filebeat, and Team:Security-External Integrations labels on Apr 20, 2023
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@andrewkroh
Member Author

This is not a full solution to the problem, but it does avoid the URL being included (like in the case of Fleet). It reads the ID from the config rather than getting it from the input v2 runner.

diff --git a/x-pack/filebeat/input/httpjson/config.go b/x-pack/filebeat/input/httpjson/config.go
index 74043594a6..4d0cf6c4f8 100644
--- a/x-pack/filebeat/input/httpjson/config.go
+++ b/x-pack/filebeat/input/httpjson/config.go
@@ -14,6 +14,7 @@ import (
 )
 
 type config struct {
+	ID       string          `config:"id"`
 	Interval time.Duration   `config:"interval" validate:"required"`
 	Auth     *authConfig     `config:"auth"`
 	Request  *requestConfig  `config:"request" validate:"required"`
diff --git a/x-pack/filebeat/input/httpjson/input.go b/x-pack/filebeat/input/httpjson/input.go
index 6e1d3e8ca3..5634a9ed9a 100644
--- a/x-pack/filebeat/input/httpjson/input.go
+++ b/x-pack/filebeat/input/httpjson/input.go
@@ -114,7 +114,12 @@ func run(
 	stdCtx := ctxtool.FromCanceller(ctx.Cancelation)
 
 	if config.Request.Tracer != nil {
-		id := sanitizeFileName(ctx.ID)
+		id := ctx.ID
+		if config.ID != "" {
+			// If the user explicitly configured an ID use it.
+			id = config.ID
+		}
+		id = sanitizeFileName(id)
 		config.Request.Tracer.Filename = strings.ReplaceAll(config.Request.Tracer.Filename, "*", id)
 	}
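
The selection logic in the diff above can be exercised on its own with a small sketch. Everything here is illustrative: `tracerID` is a hypothetical wrapper, `sanitize` is a stand-in written to roughly match the file name seen in the observed error rather than a copy of httpjson's `sanitizeFileName`, and the `id::url` shape of the runner-supplied ID is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitize is a stand-in for sanitizeFileName (assumption, not the Beats
// source): it turns ":" and path separators into "_" and collapses repeats.
func sanitize(name string) string {
	name = strings.ReplaceAll(name, ":", string(filepath.Separator))
	name = filepath.Clean(name)
	return strings.ReplaceAll(name, string(filepath.Separator), "_")
}

// tracerID (hypothetical) mirrors the diff: prefer the ID the user
// configured, and fall back to the ID supplied by the input runner.
func tracerID(configID, runnerID string) string {
	id := runnerID
	if configID != "" {
		// If the user explicitly configured an ID, use it.
		id = configID
	}
	return sanitize(id)
}

func main() {
	// Assumed shape of a runner ID for a stateful input: "<id>::<url>".
	runnerID := "httpjson-foo::https://api.io/v1/issues/?page=1"
	pattern := "http-request-trace-*.ndjson"

	withConfig := strings.ReplaceAll(pattern, "*", tracerID("httpjson-foo", runnerID))
	withoutConfig := strings.ReplaceAll(pattern, "*", tracerID("", runnerID))

	fmt.Println(withConfig)
	fmt.Println(withoutConfig)
}
```

With a configured ID the URL never reaches the file name; without one the runner ID, URL included, is still used, which is why this change alone does not bound the length.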
 

@elasticmachine
Collaborator

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@efd6
Contributor

efd6 commented Jun 18, 2024

It's worse than this. I recently saw a support case where input ID elaboration produced a base path that was short enough to be written into the zip, but too long to be extracted without significant effort.

@efd6
Contributor

efd6 commented Oct 1, 2024

This should be fixed by #40909.

@narph
Contributor

narph commented Oct 9, 2024

The PR is merged. Can we close this one, @efd6?

@efd6
Contributor

efd6 commented Oct 9, 2024

It was reverted in #40980 and then reinstated with #40996. This can be closed now.

5 participants