the given string is not compatible to the expected encoding "US_ASCII" #2749

Closed
gogainda opened this issue Oct 1, 2022 · 1 comment

gogainda commented Oct 1, 2022

given code:

puts      [49, 46, 50].pack('c*').encode("us-ascii")
          .encode(xml: :text)
          .gsub("\u{000D}", '&#xD;') # Carriage Return
          .gsub("\u{000A}", '&#xA;') # Line Feed
          .gsub("\u{0085}", '&#x85;') # Next Line
          .gsub("\u{2028}", '&#x2028;') #

fails with:

the given string is not compatible to the expected encoding "US_ASCII", did you forget to convert it? (java.lang.IllegalArgumentException)
	from com.oracle.truffle.api.strings.InternalErrors.wrongEncoding(InternalErrors.java:91)
	from com.oracle.truffle.api.strings.AbstractTruffleString.looseCheckEncoding(AbstractTruffleString.java:354)
	from com.oracle.truffle.api.strings.TruffleString$ByteIndexOfStringNode.indexOfString(TruffleString.java:3876)
	from com.oracle.truffle.api.strings.TruffleStringFactory$ByteIndexOfStringNodeGen.execute(TruffleStringFactory.java:3488)
	from com.oracle.truffle.api.strings.TruffleString$ByteIndexOfStringNode.execute(TruffleString.java:3841)
	from org.truffleruby.core.string.StringNodes$StringByteIndexNode.stringByteIndex(StringNodes.java:3878)
	from org.truffleruby.core.string.StringNodesFactory$StringByteIndexNodeFactory$StringByteIndexNodeGen.execute(StringNodesFactory.java:12579)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:37)
	from org.truffleruby.language.control.IfElseNode.execute(IfElseNode.java:45)
	from org.truffleruby.language.control.IfElseNode.execute(IfElseNode.java:45)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:37)
	from org.truffleruby.language.RubyMethodRootNode.execute(RubyMethodRootNode.java:65)
<internal:core> core/truffle/string_operations.rb:288:in `byte_index'
	from <internal:core> core/truffle/string_operations.rb:108:in `gsub_string_matches'
	from <internal:core> core/truffle/string_operations.rb:74:in `gsub_internal_matches'
	from <internal:core> core/string.rb:850:in `gsub'
	from test-str.rb:1:in `<main>'

eregon commented Oct 3, 2022

Thanks for the report.
Here is a slightly clearer repro:

s = [49, 46, 50].pack('c*').encode("us-ascii")
s = s.encode(xml: :text)
s = s.gsub("\u{000D}", '&#xD;') # OK
s = s.gsub("\u{000A}", '&#xA;') # OK
s = s.gsub("\u{0085}", '&#x85;') # fails
s = s.gsub("\u{2028}", '&#x2028;') # fails
puts s

It fails on lines 5 and 6.
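
For comparison, here is a minimal sketch of what the failing calls are expected to do, assuming CRuby's behavior as the reference. The names `pattern`, `compat`, and `result` are illustrative only. The patterns on lines 3 and 4 are ASCII-only, while "\u{0085}" and "\u{2028}" are non-ASCII UTF-8 strings searched for in a US-ASCII receiver, which is presumably what triggers the encoding check in the stack trace:

```ruby
s = [49, 46, 50].pack('c*').encode("us-ascii") # the bytes of "1.2", tagged US-ASCII
s = s.encode(xml: :text)                       # nothing to escape here

pattern = "\u{0085}"                           # UTF-8 and not ASCII-only

# A 7-bit receiver is compatible with a UTF-8 pattern,
# so the search should be allowed rather than raise.
compat = Encoding.compatible?(s, pattern)

# The pattern cannot occur in an ASCII-only string, so gsub
# should find no match and return the text unchanged.
result = s.gsub(pattern, '&#x85;')
puts result
```

Under that assumption, `gsub` simply finds no match, so the exception above is an implementation bug rather than a problem in the calling code.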

@eregon eregon self-assigned this Oct 3, 2022
@eregon eregon added the bug label Oct 3, 2022
graalvmbot pushed a commit that referenced this issue Oct 5, 2022
graalvmbot pushed a commit that referenced this issue Oct 5, 2022
graalvmbot pushed a commit that referenced this issue Oct 8, 2022
…dex} (#2749)

PullRequest: truffleruby/3504
(cherry picked from commit bf6272d)
eregon added a commit to ruby/spec that referenced this issue Nov 7, 2022
john-heinnickel pushed a commit to thermofisher-jch/truffleruby that referenced this issue Aug 16, 2023
john-heinnickel pushed a commit to thermofisher-jch/truffleruby that referenced this issue Aug 16, 2023
seven1m pushed a commit to seven1m/ruby_spec that referenced this issue Sep 2, 2023