Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x-pack/filebeat/input/httpjson: add transaction tracer #32412

Merged
merged 6 commits into from
Jul 28, 2022

Conversation

efd6
Copy link
Contributor

@efd6 efd6 commented Jul 20, 2022

What does this PR do?

Why is it important?

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works — tested manually
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@efd6 efd6 self-assigned this Jul 20, 2022
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jul 20, 2022
@efd6 efd6 force-pushed the 32402-httpjson branch from c4fdfa8 to 3e1a23a Compare July 20, 2022 07:18
@efd6 efd6 marked this pull request as ready for review July 20, 2022 07:51
@efd6 efd6 requested review from a team as code owners July 20, 2022 07:51
@elasticmachine
Copy link
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@efd6 efd6 requested review from belimawr and rdner and removed request for a team July 20, 2022 07:51
@elasticmachine
Copy link
Collaborator

elasticmachine commented Jul 20, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-07-28T12:01:48.667+0000

  • Duration: 132 min 10 sec

Test stats 🧪

Test Results
Failed 0
Passed 22536
Skipped 1937
Total 24473

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@efd6
Copy link
Contributor Author

efd6 commented Jul 20, 2022

/test

x-pack/filebeat/input/httpjson/request.go Outdated Show resolved Hide resolved
@@ -12,6 +12,8 @@ import (
"strings"
"time"

"gopkg.in/natefinch/lumberjack.v2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a logging library in Beats: https://github.com/elastic/elastic-agent-libs/tree/main/logp that uses go.uber.org/zap under the hood.

If possible I'd prefer to stick with those ones (preferably elastic-agent-libs/logp) to avoid introducing a new dependency and "logging" library.

What are the advantages of using lumberjack over the ones we already have in Beats?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another alternative is https://pkg.go.dev/github.com/elastic/elastic-agent-libs/file#Rotator. It gives you an io.WriteCloser interface and manages rolling the underlying files.

But it looks like the library got polluted since I last looked with a hardcoded file extension 😢 . https://github.com/elastic/elastic-agent-libs/blob/a09d65fd12dec3efe53c44f05562431f96b1028b/file/rotator.go#L416 IMO is should have used filepath.Ext to get the extension from the configured filename.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the advantages of using lumberjack over the ones we already have in Beats?

Simplicity; this does exactly what is needed and is transparently simple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewkroh's suggestion looks great and simple. The file extension is a pitty, but we can easily fix that as well. I glanced over the code and it seems to be possible to make the extension configurable without changing the public API.

@mergify
Copy link
Contributor

mergify bot commented Jul 20, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 32402-httpjson upstream/32402-httpjson
git merge upstream/main
git push upstream 32402-httpjson

@efd6 efd6 force-pushed the 32402-httpjson branch from 3e1a23a to a15efe7 Compare July 20, 2022 22:00
@efd6
Copy link
Contributor Author

efd6 commented Jul 21, 2022

Failure is unrelated. It appears to have come in with https://beats-ci.elastic.co/blue/organizations/jenkins/Beats%2Fbeats/detail/main/666/pipeline. Which does touch code that could explain it — it shows up first then — though I don't know enough about that part of libbeat to be confident of that and it passed the build at the time of the PR. @belimawr Do you think that could be the reason?

@belimawr
Copy link
Contributor

Failure is unrelated. It appears to have come in with https://beats-ci.elastic.co/blue/organizations/jenkins/Beats%2Fbeats/detail/main/666/pipeline. Which does touch code that could explain it — it shows up first then — though I don't know enough about that part of libbeat to be confident of that and it passed the build at the time of the PR. @belimawr Do you think that could be the reason?

I'll take a look at it right now.

@belimawr
Copy link
Contributor

@efd6, yes, there is a bug on my bugfix 🤣

I'm writing the fix-fix PR right now. Another race condition. Thanks a lot for spotting that!!!

@belimawr
Copy link
Contributor

@efd6, yes, there is a bug on my bugfix rofl

I'm writing the fix-fix PR right now. Another race condition. Thanks a lot for spotting that!!!

Here is the fix PR: #32433

@mergify
Copy link
Contributor

mergify bot commented Jul 21, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 32402-httpjson upstream/32402-httpjson
git merge upstream/main
git push upstream 32402-httpjson

@efd6 efd6 force-pushed the 32402-httpjson branch from 3d3dbd0 to 6f01b63 Compare July 21, 2022 23:30
andrewkroh
andrewkroh previously approved these changes Jul 22, 2022
@mergify
Copy link
Contributor

mergify bot commented Jul 22, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 32402-httpjson upstream/32402-httpjson
git merge upstream/main
git push upstream 32402-httpjson

@andrewkroh andrewkroh dismissed their stale review July 22, 2022 16:21

Something is not working as expected. I'm not seeing any requests logged.

@@ -42,6 +47,7 @@ func (c *httpClient) do(stdCtx context.Context, req *http.Request) (*http.Respon
return nil, fmt.Errorf("failed to execute http client.Do: %w", err)
}
defer resp.Body.Close()
c.tracer.trace(req, resp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the request might need to be dumped before it is executed in order to access the body before it is drained.

Copy link
Contributor Author

@efd6 efd6 Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add this and run the tests you will see a log file with the request and response logging in "test.log".

diff --git a/x-pack/filebeat/input/httpjson/input_test.go b/x-pack/filebeat/input/httpjson/input_test.go
index 69617d82a5..d48f8d5ceb 100644
--- a/x-pack/filebeat/input/httpjson/input_test.go
+++ b/x-pack/filebeat/input/httpjson/input_test.go
@@ -583,6 +583,7 @@ func TestInput(t *testing.T) {
        for _, testCase := range testCases {
                tc := testCase
                t.Run(tc.name, func(t *testing.T) {
+                       tc.baseConfig["request.tracer.filename"] = "test.log"
                        tc.setupServer(t, tc.handler, tc.baseConfig)
 
                        cfg := conf.MustNewConfigFrom(tc.baseConfig)

No body though, that will need to be fixed.

@efd6 efd6 force-pushed the 32402-httpjson branch from 89b9639 to 5c91223 Compare July 23, 2022 00:53
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing against a mock server that responds with an HTTP 503 in a particular case. When this happens the tracer logs "no response (2022/07/25 11:28:46 no response 1658762511476.5).

One idea I had was too see if there is a different hook point such that we could implement a http.RoundTripper for capturing the request/response. I think this could also give us visibility into requests made by retryablehttp... 🤔

@efd6
Copy link
Contributor Author

efd6 commented Jul 26, 2022

That is possible though it adds a fair bit of complexity/indirection. I've fixed the loss of non-erroring non-200 status logging.

@andrewkroh
Copy link
Member

I think the complexity is not too bad. I had to see for myself (it's just a hack), but have a look at 93d5f2e which builds upon your change and uses the config.Request.Tracer. The main benefit is that we can observe all requests and responses, which I think is really important when trying to understand what's happening during a debug session.

In my hack I did choose a structured logger. I figured if I get a big trace log from a user that I'm trying to help, then this format will be easier to analyze in Elasticsearch.

One feature that I thought would be nice to have is a trace.id. I set this up such that you can pass a trace.id to be logged in the http.Request.Context(). But I wasn't sure where is best to inject this. I think I would want to group the initial request, pagination requests, and chaining requests with the same trace.id. Then you can group all the data associated with one "iteration" of the input.

@efd6
Copy link
Contributor Author

efd6 commented Jul 27, 2022

That's approximately the level of complexity that I was thinking it would involve when you include the code in go-examples. I agree that if the intention is for looking at large dumps then a structured logger would be better. Are you happy for me to pull the code you have in go-examples into beats?

One comment on the httplog, the logging of the body is not currently ECS compliant since it dumps the complete serialised request, not the body (same for the response). To fix this we'd drop the use of httputil and just make a local drainBody to get those bytes. WDYT?

@andrewkroh
Copy link
Member

Yeah, you are welcome to pull in any of that code. I did realize that I was including more than what is expected in those ECS body.content fields. I guess it would be easiest to create non-ECS fields like http.{request,response}.Dump (or some other name) than it would be reimplement parts of the httputil helper functions.

@efd6
Copy link
Contributor Author

efd6 commented Jul 28, 2022

The helper is tiny so I'll make use of a proverb.

… req/resp logging

Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
@efd6 efd6 force-pushed the 32402-httpjson branch from 4245175 to 4b72b04 Compare July 28, 2022 03:12
@efd6
Copy link
Contributor Author

efd6 commented Jul 28, 2022

Initially in an internal package in httpjson, but I can imagine that this will have general utility, so it can be moved out later when needed elsewhere.

@efd6
Copy link
Contributor Author

efd6 commented Jul 28, 2022

E2E test failure is unrelated.

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making it ECS compliant.

return rt.txBaseID + "-" + strconv.FormatUint(count, 10)
}

var noResponse = zap.NamedError("error.message", errors.New("unexpected nil response"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked the output of this? I wanted to make sure that it behaves as expected because I recall that ecszap automatically takes regular zap.Error() calls and writes them to error.message and wanted to make sure this did not result in an extra message like error.message.message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two types of error messages:

{"log.level":"debug","@timestamp":"2022-07-28T21:08:54.725+0930","message":"HTTP request","transaction.id":"63GBEPIAVC2HE-1","url.original":"http://127.0.0.1:57587","url.scheme":"http","url.path":"","url.domain":"127.0.0.1","url.port":"57587","url.query":"","http.request.method":"POST","user_agent.original":"Elastic-Filebeat/8.4.0 (darwin; amd64; unknown; 0001-01-01 00:00:00 +0000 UTC)","error.message":[{"error":"failed to read request body: faux"},{"error":"failed to dump request: faux"}],"ecs.version":"1.6.0"}

and

{"log.level":"debug","@timestamp":"2022-07-28T21:12:09.042+0930","message":"HTTP response error","transaction.id":"R3VDV93NVC2HE-1","error":{"message":"unexpected nil response"},"ecs.version":"1.6.0"}

So the noResponse error looks like it is working as intended. I'll fix the other.

New version:
Single

{"log.level":"debug","@timestamp":"2022-07-28T21:24:05.124+0930","message":"HTTP request","transaction.id":"N34CJAOTVG2HE-3","url.original":"http://127.0.0.1:59402","url.scheme":"http","url.path":"","url.domain":"127.0.0.1","url.port":"59402","url.query":"","http.request.method":"GET","user_agent.original":"Elastic-Filebeat/8.4.0 (darwin; amd64; unknown; 0001-01-01 00:00:00 +0000 UTC)","event.original":"GET / HTTP/1.1\r\nHost: 127.0.0.1:59402\r\nUser-Agent: Elastic-Filebeat/8.4.0 (darwin; amd64; unknown; 0001-01-01 00:00:00 +0000 UTC)\r\nAccept: application/json\r\nAccept-Encoding: gzip\r\n\r\n","error.message":"failed to read request body: faux","ecs.version":"1.6.0"}

Double

{"log.level":"debug","@timestamp":"2022-07-28T21:24:02.121+0930","message":"HTTP request","transaction.id":"F31B1AOTVG2HE-2","url.original":"http://127.0.0.1:59398","url.scheme":"http","url.path":"","url.domain":"127.0.0.1","url.port":"59398","url.query":"","http.request.method":"GET","user_agent.original":"Elastic-Filebeat/8.4.0 (darwin; amd64; unknown; 0001-01-01 00:00:00 +0000 UTC)","error.message":["failed to read request body: faux","failed to dump request: faux"],"ecs.version":"1.6.0"}

Round trip

{"log.level":"debug","@timestamp":"2022-07-28T21:24:03.123+0930","message":"HTTP response error","transaction.id":"N34CJAOTVG2HE-2","error":{"message":"unexpected nil response"},"ecs.version":"1.6.0"}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"New version" looks good.

@efd6 efd6 merged commit 765648f into elastic:main Jul 28, 2022
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Filebeat] httpjson - add option for capturing a trace data
4 participants