Need some way to parse &[u8] as UTF-8 with replacement chars #9516
Comments
Could we remove the current condition API in favour of this? It doesn't really seem to offer anything but failing, and we already have an API returning an
@thestinger I would be quite pleased if we could do that. Using the condition is rather awkward, and I would rather just have replacement chars myself.
This is a duplicate of #8968, no?
And also the e-mail thread here I think is relevant: https://mail.mozilla.org/pipermail/rust-dev/2013-September/005503.html
This is an implementation of a mostly complete
@kballard I guess my thinking was that a maximally expressive condition-based solution would let one express a replacement char approach using conditions. But I don't actually favor doing that at this point. Or at least, I don't favor making that the only way to accomplish this, since I suspect a more specialized approach will be much nicer to use (both in terms of programmer convenience and in terms of efficiency). So okay, this is not a duplicate of #8968.
@blake2-ppc: the fatal mode is already handled by the
Well, the current functions in `str` don't handle decoding buffers in chunks like the proposed encodings API. But yes, it does handle one-off decoding with 'fatal' error handling.
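To illustrate the chunked case mentioned above, here is a minimal sketch written against present-day Rust rather than the 2013 API under discussion. `LossyDecoder`, `feed`, and `finish` are hypothetical names, not anything proposed in this thread; the sketch only assumes that `std::str::from_utf8` reports errors through `Utf8Error::valid_up_to` and `Utf8Error::error_len`. An incomplete sequence at the end of a chunk is held back and retried with the next chunk, while a definitely invalid sequence becomes U+FFFD.

```rust
use std::str;

/// Hypothetical streaming decoder (illustration only): it keeps the bytes
/// of an incomplete trailing sequence between calls to `feed`.
struct LossyDecoder {
    pending: Vec<u8>,
}

impl LossyDecoder {
    fn new() -> Self {
        LossyDecoder { pending: Vec::new() }
    }

    /// Decode one chunk and append the result to `out`.
    fn feed(&mut self, chunk: &[u8], out: &mut String) {
        // Prepend whatever was left over from the previous chunk.
        let mut buf = std::mem::take(&mut self.pending);
        buf.extend_from_slice(chunk);

        let mut rest: &[u8] = &buf;
        loop {
            match str::from_utf8(rest) {
                Ok(s) => {
                    out.push_str(s);
                    break;
                }
                Err(e) => {
                    let (valid, after) = rest.split_at(e.valid_up_to());
                    // `valid` is the prefix that `from_utf8` already accepted.
                    out.push_str(str::from_utf8(valid).unwrap());
                    match e.error_len() {
                        // Invalid sequence of length `n`: replace it and continue.
                        Some(n) => {
                            out.push('\u{FFFD}');
                            rest = &after[n..];
                        }
                        // Input ended mid-sequence: wait for the next chunk.
                        None => {
                            self.pending = after.to_vec();
                            break;
                        }
                    }
                }
            }
        }
    }

    /// Signal end of input; a dangling partial sequence becomes U+FFFD.
    fn finish(self, out: &mut String) {
        if !self.pending.is_empty() {
            out.push('\u{FFFD}');
        }
    }
}
```

A caller would create one decoder per stream, call `feed` for each buffer as it arrives, and call `finish` once at the end. A "fatal" variant would return an error at the `Some(n)` branch instead of pushing a replacement character.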
I think we need to improve/extend our API along the lines suggested here for 1.0. Nominating for P-backcompat-lang. (Arguably we could avoid the backwards compatibility hazard by offering a fail-only method for 1.0 and then adding an alternative entry point post-1.0 that provides the more flexible API supporting replacement characters. But I think this case is important enough that we should try to get the primary API method to provide both choices up front. That, or put dynamic fluid support in for representing the state of that choice.)
We need some way to interpret a `&[u8]` as UTF-8, using the replacement character for invalid sequences instead of conditions. This would ideally be provided in the form of an `Iterator<char>`.
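For readers on present-day Rust, the request can be sketched with the standard library's `String::from_utf8_lossy`, which replaces invalid sequences with U+FFFD. This is not the API that existed when the issue was filed, and `lossy_chars` is a hypothetical helper name used only for illustration. It collects into a `Vec<char>` instead of returning an iterator, to keep the example free of borrowing concerns.

```rust
use std::borrow::Cow;

/// Hypothetical helper (illustration only): decode `bytes` as UTF-8,
/// substituting U+FFFD for every invalid sequence.
fn lossy_chars(bytes: &[u8]) -> Vec<char> {
    // `from_utf8_lossy` borrows when the input is already valid UTF-8
    // and allocates only when a replacement is needed.
    let text: Cow<str> = String::from_utf8_lossy(bytes);
    text.chars().collect()
}
```

For example, `lossy_chars(b"ab\xFFcd")` yields the characters `a`, `b`, U+FFFD, `c`, `d`, since the lone `0xFF` byte can never appear in well-formed UTF-8.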