string.printable.isprintable() returns False #67206
string.printable includes all whitespace characters. However, the only whitespace character that is printable is the space (0x20). By definition, the only ASCII characters considered printable are the alphanumeric characters, the punctuation characters, and the space.
Source: 7.2 POSIX Locale: "Conforming systems shall provide a POSIX locale, also known as the C locale."
7.3.1 LC_CTYPE: space, cntrl, graph, print
LC_CTYPE Category in the POSIX Locale: # "print" is by default "alnum", "punct", and the <space>
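The behaviour in the issue title can be reproduced with a few lines. This is only a minimal sketch using the standard `string` module and `str.isprintable()`; the variable name `non_printable` is an illustrative choice, not something from the report.

```python
import string

# Collect the characters of string.printable that str.isprintable() rejects.
non_printable = [ch for ch in string.printable if not ch.isprintable()]

# str.isprintable() is True only if every character is printable,
# so a single rejected character makes the whole check fail.
print(string.printable.isprintable())
print(non_printable)
```

On CPython 3 the rejected characters should be the five whitespace characters other than the space, which matches the claim that the space is the only printable whitespace character.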
Here is a simple fix for the issue, plus a test.
This is a bit of a conundrum. Our (string module) definition of printable is very clear, and it includes the other whitespace characters. We could document that this does not match the POSIX definition of printable. It also does not match the RFC 5822 definition of printable (for example), which does *not* include whitespace characters (not even space), but the POSIX definition is a more likely source of confusion. isprintable is a newer function than string.printable, and serves a different purpose. I suppose that when PEP 3138 was written and implemented, the disconnect between the two definitions was not noticed. For backward compatibility reasons I suspect we are stuck with the discrepancy, but perhaps others will think it worth the pain of changing string.printable. I kind of doubt it, though.
C standard defines locale-specific *printing characters* that are [ -~]
There is isgraph() function that returns zero for the space but POSIX definition is aligned with the ISO C standard.
I don't know what RFC 5822 has to do with this issue but the rfc
Tests from bpo-9770 show the relation between C character classes and
set(string.printable) == set(C['graph']) + set(C['space'])
where C['space'] is '\t\n\v\f\r ' (the standard C whitespace).
It is a documented behavior [2]:
This is a combination of digits, ascii_letters, punctuation,
where *whitespace* is C['space'].
In Python 2, *printable* is locale-dependent and it coincides with the
Unlike other string constants, *printable* differs from C['print'] on
str.isprintable [3] obeys C['print'] (in ASCII range) and considers SP
---
It might be too late to change string.printable to correspond to C
I've uploaded a documentation patch that mentions that string.printable
[1] http://bugs.python.org/review/9770/diff/12212/Lib/test/test_curses_ascii.py
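The set relation quoted in this comment can be checked directly. The sketch below is an assumption-laden illustration rather than the test from bpo-9770: `c_print`, `c_graph` and `c_space` are hypothetical names standing in for the C locale classes, built by hand from the ASCII range [ -~] and the standard C whitespace.

```python
import string

# Documented composition of string.printable.
composed = (string.digits + string.ascii_letters
            + string.punctuation + string.whitespace)

# Hand-built stand-ins for the C locale character classes (assumed, ASCII only).
c_print = {chr(i) for i in range(0x20, 0x7F)}  # the range [ -~]
c_graph = c_print - {" "}                      # printing characters minus the space
c_space = set("\t\n\v\f\r ")                   # the standard C whitespace

# string.printable is graph plus space, which is a superset of print.
same_as_graph_plus_space = set(string.printable) == c_graph | c_space
extra = set(string.printable) - c_print

# In the ASCII range, str.isprintable() agrees with the hand-built print class.
ascii_isprintable = {chr(i) for i in range(128) if chr(i).isprintable()}
```

Here `extra` holds the characters that make `string.printable` differ from the print class, and `ascii_isprintable` shows that `str.isprintable()` follows that class for ASCII input.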
Reproduced on 3.11.
Still reproducible on main.
I agree that it would be a bad idea to change
…the POSIX sense (pythonGH-128820) (cherry picked from commit d906bde) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
We documented the behaviour (and backported those notes to 3.12 and 3.13), hence I will close this issue as |
Linked PRs
string.printable is not printable in the POSIX sense #128820
string.printable is not printable in the POSIX sense (GH-128820) #128867
string.printable is not printable in the POSIX sense (GH-128820) #128868