Backport #16482 to 7.17: Bugfix for BufferedTokenizer to completely consume lines in case of lines bigger than sizeLimit #16577
Non-clean backport of #16482

The differences are:
- `data.convertToString().split(context, delimiter, MINUS_ONE);` instead of `data.convertToString().split(delimiter, -1);`
- `org.logstash.RubyTestBase`, which was introduced in "Refactor: drop redundant (jruby-complete.jar) dependency" #13159
- `Arrays.asList` vs `List.of`
- the `assertThrows` method from JUnit5 is not available in JUnit4, so it is reimplemented in the test

Fixes the behaviour of the tokenizer so that it works properly when buffer full conditions are met.

Updates `BufferedTokenizerExt` so that it can accumulate token fragments coming from different data segments. When a "buffer full" condition is matched, it records this state in a local field so that, on the next data segment, it can consume all the token fragments up to the next token delimiter. The accumulation variable changes from a RubyArray containing strings to a StringBuilder that contains the head token, while the remaining token fragments are stored in the input array. Furthermore, it translates the `buftok_spec` tests into JUnit tests.
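The accumulation behaviour described above can be illustrated with a small plain-Java sketch. This is not the code from the PR: the class name `SketchTokenizer`, its fields, and the use of `String`/`List` in place of JRuby types are all assumptions made for illustration, and only the first fragment of each segment is checked against the size limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified stand-in for the tokenizer described above.
class SketchTokenizer {
    private final String delimiter;
    private final int sizeLimit;
    // Head token: the unterminated fragment carried over between calls.
    private final StringBuilder head = new StringBuilder();
    // Fragments left over from a segment that triggered "buffer full".
    private final List<String> pending = new ArrayList<>();
    // Set when an oversized line was detected and is still being consumed.
    private boolean bufferFull = false;

    SketchTokenizer(String delimiter, int sizeLimit) {
        this.delimiter = delimiter;
        this.sizeLimit = sizeLimit;
    }

    List<String> extract(String data) {
        // Limit -1 keeps trailing empty strings, so a trailing delimiter is visible.
        List<String> parts =
            new ArrayList<>(Arrays.asList(data.split(Pattern.quote(delimiter), -1)));
        if (bufferFull) {
            if (pending.isEmpty()) {
                // Still inside the oversized line: drop its continuation.
                parts.remove(0);
                if (parts.isEmpty()) {
                    return new ArrayList<>(); // no delimiter yet, keep discarding
                }
            } else {
                // Glue the last saved fragment to the first new one.
                String last = pending.remove(pending.size() - 1);
                parts.set(0, last + parts.get(0));
            }
            bufferFull = false;
        }
        pending.addAll(parts);

        if (head.length() + pending.get(0).length() > sizeLimit) {
            bufferFull = true;   // remember the state for the next segment
            head.setLength(0);   // discard what was accumulated so far
            pending.remove(0);   // discard the oversized fragment
            throw new IllegalStateException("input buffer full");
        }

        // Complete the head token, then keep the last fragment as the new head.
        pending.set(0, head.toString() + pending.get(0));
        head.setLength(0);
        head.append(pending.remove(pending.size() - 1));
        List<String> tokens = new ArrayList<>(pending);
        pending.clear();
        return tokens;
    }

    public static void main(String[] args) {
        SketchTokenizer t = new SketchTokenizer("\n", 5);
        System.out.println(t.extract("ab\ncd"));    // [ab]
        System.out.println(t.extract("e\n"));       // [cde]
        try {
            t.extract("xxxxxxxx");                  // longer than sizeLimit
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());     // input buffer full
        }
        System.out.println(t.extract("yyyy"));      // []
        System.out.println(t.extract("zz\nok\n"));  // [ok]
    }
}
```

In this sketch the rest of the oversized line (`yyyy` and `zz`) is dropped up to the next delimiter, and tokenization resumes with `ok`.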