I see the problem. StringIO#ungetbyte is treating the byte as a character rather than as a raw byte. Since we're commonly working in UTF-8, this single byte is converted into a multibyte UTF-8 sequence and prepended to the string.
To illustrate this, consider the string "\u01A9". It is encoded as the byte sequence 0xC6 0xA9. When a byte is read we get 0xC6, but if we try to unget that byte we prepend 0xC3 0x86 to the string, because that's the UTF-8 encoding of \u00C6.
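A minimal sketch of the round trip described above, in case it helps reproduce this. The variable names are just for illustration, and the last assertion describes the behaviour I would expect from a raw-byte ungetbyte (as in MRI); an implementation with this bug should fail it.

```ruby
require "stringio"

# "\u01A9" is two bytes in UTF-8: 0xC6 0xA9. Use .dup so the buffer is writable.
io = StringIO.new("\u01A9".dup)

first = io.getbyte   # reads the first raw byte
io.ungetbyte(first)  # should push back that single raw byte

# Reading everything back should give the original two bytes.
round_trip = io.read.bytes

# For comparison: treating 0xC6 as the code point U+00C6 and encoding it
# as UTF-8 yields two bytes, which is the sequence seen in the bug.
as_character = 0xC6.chr(Encoding::UTF_8).bytes

raise "unexpected first byte" unless first == 0xC6
raise "unexpected character encoding" unless as_character == [0xC3, 0x86]
raise "ungetbyte did not restore the raw byte" unless round_trip == [0xC6, 0xA9]
```

If the last line raises, ungetbyte is going through the character path rather than the byte path.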
This particular piece of our library looks like it can be simplified considerably, since ungetbyte only accepts a single number and masks it down to a single byte.
This results in several tests failing in my project.