Observed
When an IO object's encoding is set to Encoding::UTF_16 and #write is then called with a UTF-8-encoded string, TruffleRuby writes the string's UTF-8 bytes unchanged instead of transcoding them.
Expected
In CRuby and JRuby the string is transcoded to UTF-16.
Repro
This test passes on CRuby and JRuby, but fails on TR head (v23.0.0-dev-dd8609b3):
    #! /usr/bin/env ruby

    require "minitest/spec"
    require "minitest/autorun"
    require "stringio"

    puts RUBY_DESCRIPTION

    describe "encoding" do
      let(:utf8_str) { "hello" }

      # this test passes on all platforms
      it "UTF-8 string is transcoded correctly by String#encode" do
        expected = [
          254, 255, # BOM
          0, 104, 0, 101, 0, 108, 0, 108, 0, 111, # double-width "hello"
        ]
        assert_equal(expected, utf8_str.encode(Encoding::UTF_16).bytes)
      end

      # this test fails on TruffleRuby
      describe "given an IO with UTF-16 encoding" do
        let(:io) { StringIO.new.set_encoding(Encoding::UTF_16) }

        it "#write accepts a UTF-8-encoded string and transcodes it" do
          io.write(utf8_str)
          result = io.string

          expected = [
            254, 255, # BOM
            0, 104, 0, 101, 0, 108, 0, 108, 0, 111, # double-width "hello"
          ]
          assert_equal(Encoding::UTF_16, io.external_encoding)
          assert_equal(expected, result.bytes)
        end
      end
    end
The failure is:
1) Failure:
encoding::given an IO with UTF-16 encoding#test_0001_#write accepts a UTF-8-encoded string and transcodes it [./repro-truffle-encoding.rb:36]:
--- expected
+++ actual
@@ -1 +1 @@
-[254, 255, 0, 104, 0, 101, 0, 108, 0, 108, 0, 111]
+[104, 101, 108, 108, 111]
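A possible workaround, sketched here under assumptions and not verified on TruffleRuby, is to transcode explicitly with String#encode before writing, since the first test above shows that String#encode behaves the same on all platforms. The helper name `write_transcoded` is hypothetical and not part of StringIO.

```ruby
require "stringio"

# Hypothetical helper (not part of StringIO): transcode the string to the
# IO's external encoding explicitly instead of relying on #write to do it.
def write_transcoded(io, str)
  target = io.external_encoding
  # Only transcode when the IO has an external encoding that differs
  # from the string's own encoding.
  data = target && str.encoding != target ? str.encode(target) : str
  io.write(data)
end

io = StringIO.new.set_encoding(Encoding::UTF_16)
write_transcoded(io, "hello")
p io.string.bytes
```

On CRuby this should print the same bytes as the `expected` array in the repro. Note that encoding to Encoding::UTF_16 prepends a BOM on every call, so a sketch like this only suits a single write per stream.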
StringIO has a weirdly defined notion of encoding and transcoding; for example, it accepts an encoding in the constructor but seemingly ignores it (see #2793).
Thanks for the report. It looks like there should still be some transcoding here.
eregon changed the title from "IO#write does not transcode strings like CRuby and JRuby" to "StringIO#write does not transcode strings like CRuby and JRuby" on Jan 23, 2023