Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libbeat] Fix encoding and file offset issues in the disk queue #26484

Merged
merged 8 commits into from
Jun 28, 2021

Conversation

faec
Copy link
Contributor

@faec faec commented Jun 24, 2021

What does this PR do?

This PR fixes #26335 (incorrect decoding of multi-byte characters) by changing the disk queue serialization format to CBOR rather than JSON. It also fixes some file position calculations that weren't properly updated along with the schema change in #22970.

There are no user-visible changes in this PR other than the queue working correctly on new segment files; the underlying schema is altered, but files created by 7.13 or earlier will deserialize with JSON as before.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@faec faec added bug Team:Elastic-Agent Label for the Agent team backport-v7.14.0 Automated backport with mergify labels Jun 24, 2021
@faec faec self-assigned this Jun 24, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jun 24, 2021
@elasticmachine
Copy link
Collaborator

elasticmachine commented Jun 24, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #26484 updated

  • Start Time: 2021-06-24T19:20:08.446+0000

  • Duration: 147 min 39 sec

  • Commit: 9f88e81

Test stats 🧪

Test Results
Failed 0
Passed 47774
Skipped 5274
Total 53048

Trends 🧪

Image of Build Times

Image of Tests

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 47774
Skipped 5274
Total 53048

Copy link
Contributor

@fearful-symmetry fearful-symmetry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only partly understand the surrounding code, but the PR itself LGTM.

// A test to make sure serialization works correctly on multi-byte characters.
func TestSerializeMultiByte(t *testing.T) {
asciiOnly := "{\"name\": \"Momotaro\"}"
multiBytes := "{\"name\": \"桃太郎\"}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's create a table driven test and separate the 2 cases into separate tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -173,6 +173,9 @@ outerLoop:
// This can only happen if the queue is being closed; abort.
break
}
// We're creating a new segment file, set the initial bytes written
// to the header size.
curSegmentResponse.bytesWritten = wl.currentSegment.headerSize()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a bug fix. Did we introduce the problem when moving to absolute adresses, or does it fix an issue we have had before? In the later case we might want to add a changelog entry (and maybe a separate PR).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was just an oversight in the previous (absolute address) PR, so it hasn't hit a release.

@faec faec merged commit a3b447a into elastic:master Jun 28, 2021
@faec faec deleted the disk-queue-encoding branch June 28, 2021 15:06
mergify bot pushed a commit that referenced this pull request Jun 28, 2021
faec added a commit that referenced this pull request Jun 28, 2021
…) (#26528)

(cherry picked from commit a3b447a)

Co-authored-by: Fae Charlton <fae.charlton@elastic.co>
mdelapenya added a commit to mdelapenya/beats that referenced this pull request Jun 29, 2021
* master:
  Osquerybeat: set the raw index name to supress the timestamp suffix (elastic#26545)
  [Heartbeat] add screenshots config to synthetics (elastic#26455)
  [Elastic Agent] Use http2 to connect to Fleet Server. (elastic#26474)
  Remove all docs about  Beats central management (elastic#26399)
  update data.json for gcp billing (elastic#26506)
  Skip x-pack metricbeat tests (elastic#26537)
  [Elastic Agent] Fix issue with FLEET_CA not being used with Fleet Server in container (elastic#26529)
  Add changelog entry for  elastic#26224 (elastic#26531)
  Add inttests for the x-pack/metricbeat on a PR/branches basis (elastic#26526)
  Suppress too many errors (elastic#26224)
  Fix master's linting issue (elastic#26517)
  [libbeat] Fix encoding and file offset issues in the disk queue (elastic#26484)
  Add log_group_name_prefix config option for aws-cloudwatch input (elastic#26187)
  Update shared-deduplication.asciidoc (elastic#26492)
  Add Recorded Future support to threatintel module (elastic#26481)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v7.14.0 Automated backport with mergify bug Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Diskqueue garbles data containing multi-byte characters
4 participants