Reading from and writing to the Windows console should be lossy #116871

Open
ChrisDenton opened this issue Oct 18, 2023 · 2 comments · May be fixed by #134534
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` O-windows Operating system: Windows T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@ChrisDenton
Member

The Windows console works with text, whereas our `Read` and `Write` traits deal with arbitrary bytes. We currently use the Unicode Windows APIs and assume the bytes are in fact UTF-8. If they're not, we return an error. Similarly, we expect that the input from the console will be actual Unicode text (I'd be surprised if it ever wasn't, but I guess it's possible), and again we error if it's not. See here, here and here.
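To make the behaviour described above concrete, here is a simplified sketch of the strict conversions. It is not the actual standard library code: the function names `encode_for_console_strict` and `decode_from_console_strict` are hypothetical, and the real implementation has more details (buffering, partial writes) than shown here.

```rust
use std::io;

/// Simplified model of the strict write path (hypothetical helper):
/// the bytes must be valid UTF-8 before they can be converted to
/// UTF-16 for the Unicode console API.
fn encode_for_console_strict(bytes: &[u8]) -> io::Result<Vec<u16>> {
    let text = std::str::from_utf8(bytes)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    Ok(text.encode_utf16().collect())
}

/// Simplified model of the strict read path (hypothetical helper):
/// the UTF-16 units from the console must be valid Unicode, so an
/// unpaired surrogate is reported as an error.
fn decode_from_console_strict(units: &[u16]) -> io::Result<String> {
    String::from_utf16(units)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

In this model a single stray byte such as `0xFF`, or a lone surrogate such as `0xD800`, turns the whole call into an `InvalidData` error.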

Seeing as the console is a text interface, rather than a binary one, why not lossily replace any invalid Unicode with the replacement character?
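A minimal sketch of what the lossy alternative could look like, using only standard library conversions. The function names are hypothetical and this is an illustration of the idea, not a proposed patch.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Lossy write path (hypothetical helper): invalid UTF-8 sequences
/// become U+FFFD before the text is converted to UTF-16.
fn encode_for_console_lossy(bytes: &[u8]) -> Vec<u16> {
    String::from_utf8_lossy(bytes).encode_utf16().collect()
}

/// Lossy read path (hypothetical helper): unpaired surrogates in the
/// console input become U+FFFD instead of producing an error.
fn decode_from_console_lossy(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        .map(|r| r.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

With these, the bytes `[0x66, 0xFF, 0x6F]` would be written as `f`, U+FFFD, `o`, and the input units `[0x68, 0xD800, 0x69]` would be read as `h`, U+FFFD, `i`.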

@ChrisDenton ChrisDenton added O-windows Operating system: Windows T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` labels Oct 18, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 18, 2023
@ChrisDenton ChrisDenton removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 18, 2023
@ChrisDenton
Member Author

ChrisDenton commented Oct 25, 2023

This was very briefly touched on in a recent libs api meeting though it wasn't the main topic of conversation.

One issue that was brought up was that maybe we do want to return an error in order to help catch programmer mistakes. I think if that's true then this shouldn't be a Windows-only thing; it should be a check available when writing to a tty on all platforms.
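As a rough illustration of a platform-independent check, the sketch below validates UTF-8 only when the destination is a terminal. `write_checked` and `print_checked` are hypothetical names, not existing or proposed std APIs, and the sketch assumes each call receives whole characters (it does not handle a multi-byte sequence split across two writes).

```rust
use std::io::{self, IsTerminal, Write};

/// Hypothetical helper: reject non-UTF-8 bytes only when the
/// destination is a terminal; otherwise pass the bytes through.
fn write_checked<W: Write>(out: &mut W, is_tty: bool, bytes: &[u8]) -> io::Result<()> {
    if is_tty {
        // Validate before writing so nothing is emitted on failure.
        std::str::from_utf8(bytes)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    }
    out.write_all(bytes)
}

/// Hypothetical convenience wrapper for stdout, using the
/// `IsTerminal` trait to detect a tty on any platform.
fn print_checked(bytes: &[u8]) -> io::Result<()> {
    let stdout = io::stdout();
    let is_tty = stdout.is_terminal();
    write_checked(&mut stdout.lock(), is_tty, bytes)
}
```

Passing `is_tty` as a parameter keeps the check testable; redirecting output to a file or pipe would leave arbitrary bytes untouched.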

There's no particular reason to treat Windows as special here. In a glorious UTF-8 everywhere future (coming soon) we'd just use the normal WriteFile function for writing bytes to the console (just like a normal file), rather than converting to UTF-16 and using WriteConsoleW. There would be no reason to do a UTF-8 check because this is handled by the console (almost certainly by doing the replacement thing for us but it could error if it wants to).

@QuineDot

> I think if that's true then this shouldn't be a Windows-only thing; it should be a check available when writing to a tty on all platforms.

Don't do that on all platforms without introducing a new type or introducing new methods and macros to do so. I write OsString to StdOut (as bytes) even when it's a tty in order to print out what the user has passed in losslessly. And yes, I work on platforms where the default locale and definitely the filenames are not UTF-8.
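For readers unfamiliar with the pattern, a minimal Unix-only sketch of writing an `OsStr` to a writer as raw bytes might look like the following. `write_os_str` is a hypothetical helper name; `OsStrExt::as_bytes` is the existing Unix extension trait method.

```rust
use std::ffi::OsStr;
use std::io::{self, Write};
#[cfg(unix)]
use std::os::unix::ffi::OsStrExt;

/// Hypothetical helper: write the raw bytes of an `OsStr` without any
/// UTF-8 validation or replacement, so non-UTF-8 arguments and
/// filenames reach the terminal unchanged.
#[cfg(unix)]
fn write_os_str<W: Write>(out: &mut W, s: &OsStr) -> io::Result<()> {
    out.write_all(s.as_bytes())
}
```

A mandatory UTF-8 check on tty writes would turn this call into an error for such input, and a lossy conversion would replace the offending bytes with U+FFFD.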

> In a glorious UTF-8 everywhere future (coming soon)

I don't see Unix disallowing the scenario above ever, really, but if it happens I imagine it's decades off.

@ChrisDenton ChrisDenton linked a pull request Dec 19, 2024 that will close this issue