[CI] LogStructureFinderManagerTests#testFindCharsetGivenBinary fails reproducibly #33227

Closed
cbuescher opened this issue Aug 29, 2018 · 1 comment · Fixed by #33234
Labels: :ml (Machine learning), >test-failure (Triaged test failures from CI)

Comments

@cbuescher
Member

Build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+periodic/2636/console

I can reproduce locally on 6.x with:

./gradlew :x-pack:plugin:ml:log-structure-finder:test \
  -Dtests.seed=BB9C1AA2C3B090DF \
  -Dtests.class=org.elasticsearch.xpack.ml.logstructurefinder.LogStructureFinderManagerTests \
  -Dtests.method="testFindCharsetGivenBinary" \
  -Dtests.security.manager=true \
  -Dtests.locale=id \
  -Dtests.timezone=America/Punta_Arenas \
  -Dcompiler.java=10 \
  -Druntime.java=8

Error:

ERROR   0.31s | LogStructureFinderManagerTests.testFindCharsetGivenBinary <<< FAILURES!
   > Throwable #1: java.lang.UnsupportedOperationException
   >    at __randomizedtesting.SeedInfo.seed([BB9C1AA2C3B090DF:4A943652575C61C1]:0)
   >    at sun.nio.cs.ext.ISO2022_CN.newEncoder(ISO2022_CN.java:76)
   >    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:282)
   >    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:273)
   >    at java.lang.StringCoding.encode(StringCoding.java:338)
   >    at java.lang.String.getBytes(String.java:918)
   >    at org.elasticsearch.xpack.ml.logstructurefinder.LogStructureFinderManager.findCharset(LogStructureFinderManager.java:166)
   >    at org.elasticsearch.xpack.ml.logstructurefinder.LogStructureFinderManagerTests.testFindCharsetGivenBinary(LogStructureFinderManagerTests.java:42)
   >    at java.lang.Thread.run(Thread.java:748)
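
The trace above shows `String.getBytes` failing inside `ISO2022_CN.newEncoder`, which suggests that this charset can decode but not encode on the affected JDK. A minimal sketch of how such a charset can be detected up front with the standard `Charset.canEncode()` call follows. The class name `EncodableCharsetProbe` and the helper `encodeOrNull` are hypothetical and are not taken from the Elasticsearch code base.

```java
import java.nio.charset.Charset;

public class EncodableCharsetProbe {

    /** Returns true if the named charset is available in this runtime and supports encoding. */
    static boolean canEncode(String name) {
        return Charset.isSupported(name) && Charset.forName(name).canEncode();
    }

    /**
     * Hypothetical helper: encodes the text, or returns null for a decode-only
     * charset instead of letting getBytes throw UnsupportedOperationException.
     */
    static byte[] encodeOrNull(String text, Charset charset) {
        if (charset.canEncode() == false) {
            return null;
        }
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String name = "ISO-2022-CN";
        if (Charset.isSupported(name)) {
            Charset cs = Charset.forName(name);
            System.out.println(name + " canEncode: " + cs.canEncode());
            System.out.println("encoding succeeded: " + (encodeOrNull(" ", cs) != null));
        } else {
            // Some trimmed-down runtimes do not ship the extended charsets.
            System.out.println(name + " is not available in this runtime");
        }
    }
}
```

Checking `canEncode()` before calling `getBytes` avoids relying on the exception for control flow. Whether a given charset is available at all depends on the runtime, hence the `isSupported` guard.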
cbuescher added the >test-failure (Triaged test failures from CI) and :ml (Machine learning) labels Aug 29, 2018
@elasticmachine
Collaborator

Pinging @elastic/ml-core

droberts195 self-assigned this Aug 29, 2018
droberts195 added a commit to droberts195/elasticsearch that referenced this issue Aug 29, 2018
Some character sets cannot be encoded and this was tripping
up the binary data check in the ML log structure character
set finder.

The fix is to assume that if ICU4J identifies that some bytes
correspond to a character set that cannot be encoded and those
bytes contain zeroes then the data is binary rather than text.

Fixes elastic#33227
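
The rule in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual fix: the class `BinaryCheckSketch` and the method `looksBinary` are hypothetical names, the detected charset is passed in directly rather than obtained from ICU4J, and the way an encodable charset is checked (by testing whether its encoding of a space contains a zero byte) is an assumption about how the binary data check might work.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BinaryCheckSketch {

    /** Returns true if any byte in the array is zero. */
    static boolean containsZeroByte(byte[] bytes) {
        for (byte b : bytes) {
            if (b == 0) {
                return true;
            }
        }
        return false;
    }

    /**
     * Hypothetical helper: decides whether a sample should be treated as binary,
     * given the charset a detector reported for it.
     */
    static boolean looksBinary(byte[] sample, Charset detected) {
        if (containsZeroByte(sample) == false) {
            return false;
        }
        // Rule from the commit message: a charset that cannot be encoded,
        // combined with zero bytes in the sample, means binary data.
        if (detected.canEncode() == false) {
            return true;
        }
        // Assumption: for an encodable charset, zero bytes are plausible text
        // only if the charset's encoding of a space contains a zero byte
        // (as in UTF-16 or UTF-32).
        return containsZeroByte(" ".getBytes(detected)) == false;
    }

    public static void main(String[] args) {
        byte[] withZeroes = {0x00, 0x01, 0x7f, 0x00};
        System.out.println(looksBinary(withZeroes, StandardCharsets.UTF_8));

        byte[] utf16Text = "hello".getBytes(StandardCharsets.UTF_16BE);
        System.out.println(looksBinary(utf16Text, StandardCharsets.UTF_16BE));
    }
}
```

The `canEncode()` test comes before any call to `getBytes`, so a decode-only charset is never asked for an encoder.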
droberts195 added two more commits that referenced this issue Aug 29, 2018, each with the same commit message and ending with "Fixes #33227".