unicode: unicode.Is and bytes.Buffer.WriteRune get confused by negative runes #43254
Comments
Change https://golang.org/cl/280493 mentions this issue:
Change https://golang.org/cl/280492 mentions this issue:
If you all agree that the functions should handle this case, I uploaded some CLs to fix the instances I found.
Is and isExcludingLatin did not handle negative runes when dispatching to is16. TestNegativeRune covers this along with the existing uint32 casts in IsGraphic, etc.

(For tests, I picked the smallest non-Latin-1 code point in each range.)

Updates #43254

Change-Id: I17261b91f0d2b5b5125d19219411b45c480df74f
Reviewed-on: https://go-review.googlesource.com/c/go/+/280493
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
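To illustrate why the uint32 casts mentioned in that commit message matter, here is a minimal sketch. The two helper functions are hypothetical and are not taken from the unicode package; they only contrast a signed range check with one that converts to uint32 first.

```go
package main

import (
	"fmt"
	"unicode"
)

// signedInLatin1 is a hypothetical range check that compares the rune as a
// signed value, so every negative rune passes it.
func signedInLatin1(r rune) bool {
	return r <= unicode.MaxLatin1
}

// unsignedInLatin1 converts to uint32 before comparing. A negative rune
// becomes a value above 0x7FFFFFFF and therefore fails the check.
func unsignedInLatin1(r rune) bool {
	return uint32(r) <= unicode.MaxLatin1
}

func main() {
	for _, r := range []rune{'A', 0xFF, 0x100, -1} {
		fmt.Printf("%d: signed=%t unsigned=%t\n",
			r, signedInLatin1(r), unsignedInLatin1(r))
	}
}
```

The signed comparison lets -1 through to whatever fast path follows it, while the unsigned comparison rejects it along with every other out-of-range value.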
Updates #43254

Change-Id: I7d4bf3b99cc36ca2156af5bb01a1c595419d1d3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/280492
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Rob Pike <r@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Thank you @davidben for the report and for mailing the CLs. I’ve just merged both; shall we then close this issue?
We might also need to add release notes. I’ll mark the CLs as requiring them.
Thanks!
Change https://golang.org/cl/317469 mentions this issue:
CL 317273 accidentally grouped a fix for bufio, bytes, strings packages into a single entry, but they should be separate ones. Fix that, and document these negative rune handling fixes.

The list of fixed functions in package unicode was computed by taking the functions covered by the new TestNegativeRunes test, and including those that fail when tested with Go 1.16.3.

For #44513.
Updates #43254.

Change-Id: I6f387327f83ae52543526dbdcdd0bb5775c678bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/317469
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Trust: Alexander Rakoczy <alex@golang.org>
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Alexander Rakoczy <alex@golang.org>
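The approach described in that commit message (run each classification function against negative runes and note which ones misbehave) could be sketched roughly as follows. This is not the actual TestNegativeRunes test; the `checks` table is an illustrative subset chosen here, and the sample values are assumptions picked so that their low 16 bits alias real code points.

```go
package main

import (
	"fmt"
	"math"
	"unicode"
)

// check pairs a name with a unicode classification function.
type check struct {
	name string
	fn   func(rune) bool
}

// checks is an illustrative subset, not the list from the release notes.
var checks = []check{
	{"IsLetter", unicode.IsLetter},
	{"IsUpper", unicode.IsUpper},
	{"IsDigit", unicode.IsDigit},
	{"IsSpace", unicode.IsSpace},
	{"IsPrint", unicode.IsPrint},
	{"IsGraphic", unicode.IsGraphic},
}

// negativeSamples returns negative runes; two of them share their low
// 16 bits with 'A' and U+4E00 respectively.
func negativeSamples() []rune {
	return []rune{-1, 'A' - 0x10000, 0x4E00 - 0x10000, math.MinInt32}
}

// acceptsNegative returns the names of the checks that report true for at
// least one of the given samples.
func acceptsNegative(samples []rune) []string {
	var bad []string
	for _, c := range checks {
		for _, r := range samples {
			if c.fn(r) {
				bad = append(bad, c.name)
				break
			}
		}
	}
	return bad
}

func main() {
	fmt.Println("accept a negative rune:", acceptsNegative(negativeSamples()))
}
```

On a Go release that includes the fixes from this issue, the reported list should be empty. Running the same program against an older toolchain is one way to see which functions were affected.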
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
https://play.golang.org/p/9ZkvjGuE1so
What did you expect to see?
unicode.Is and related functions should return false on negative values, as they do for other invalid runes.
bytes.Buffer.WriteRune should write a replacement character, as it does for other invalid runes. In particular, a UTF-32 decoder could easily construct a negative rune before checking. (Looks like x/text/encoding/unicode/utf32 does that and then relies on RuneLen noticing.)
What did you see instead?
unicode.Is thinks some negative values are printable, and bytes.Buffer.WriteRune accidentally runs a single-byte fast path.
I wasn't sure at first whether this was a bug, but most functions seem to check for negative values or cast to uint32, so I think these should as well. (If they expect the rune to be a real code point, that should probably be documented.)
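For readers who cannot open the playground link, the following is a minimal sketch of the scenario described above. It is not the contents of that link. `decodeUTF32BE` is a hypothetical, deliberately naive decoder written for this example (it is not the x/text implementation), and the input bytes are chosen here so that the resulting negative rune has U+4E00 in its low 16 bits.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unicode"
	"unicode/utf8"
)

// decodeUTF32BE is a hypothetical naive decoder: it converts the 32-bit
// unit to a rune before validating it, so a unit with the high bit set
// becomes a negative rune.
func decodeUTF32BE(b []byte) rune {
	return rune(binary.BigEndian.Uint32(b))
}

// encode writes r through bytes.Buffer.WriteRune and returns the bytes
// that ended up in the buffer.
func encode(r rune) []byte {
	var buf bytes.Buffer
	buf.WriteRune(r)
	return buf.Bytes()
}

func main() {
	// 0xFFFF4E00 reinterpreted as int32 is negative.
	r := decodeUTF32BE([]byte{0xFF, 0xFF, 0x4E, 0x00})
	fmt.Println("rune:", r, "valid:", utf8.ValidRune(r))
	fmt.Println("unicode.Is(unicode.Han, r):", unicode.Is(unicode.Han, r))
	fmt.Printf("WriteRune bytes: % x\n", encode(r))
}
```

With the fixes from this issue applied (to my understanding, Go 1.17 and later), `unicode.Is` should report false here and `WriteRune` should emit the three-byte encoding of U+FFFD rather than a single byte.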