Feature Request: Add the Control Pictures Unicode block #219
Comments
@aaronbell Thanks for looking into the control pictures. I think you'll have an artistic and practical decision to make with those, balancing your font's visual identity, readability, and familiarity. The diagonal sets of letters seem to be the most common representation and the one used by the Unicode Consortium: https://unicode.org/charts/PDF/U2400.pdf and Segoe UI Symbol. Visual Studio Code, on the other hand, uses horizontal sets of letters: I'm not sure which would be easiest to read in a Terminal app at small sizes. Alternatively, it seems square graphic symbols were once standardized for all of these, but are not commonly used anymore (probably because the C0 control characters were originally designed with punched cards and serial terminals in mind). I think if Windows Terminal allowed me to set different fonts for specific Unicode ranges I would try to get these legacy graphic symbols in a font by themselves and use those. They seem more readable at my typical Terminal font size, and even though I never encountered these graphic representations before, I could get used to them easily enough. |
Thanks for the info! I had actually originally designed the control characters to be full height, 1/3 width, but found that they were really difficult to read, especially when you have a bunch in a row (like your original sample image). As such, the diagonal form actually makes a lot of sense. I think, though, that I’ll continue to experiment with making them a bit bigger (wider maybe?) to see what kind of results I can get. The graphical versions are a good idea too! Interestingly, many of those forms are already present in the geometric shapes block: http://www.unicode.org/charts/PDF/U25A0.pdf so I think it would be relatively straightforward to bring them in. I think it would make the most sense to add them as a stylistic set of the base versions so folks like you who want to use them have that option available :) |
Here's the background story on LF, NL and NEL. U+2400 to U+241F are pictures of all the C0 control characters from ASCII, mapped directly from their corresponding U+0000...U+001F control characters. C1 control characters, which NEL is from, are a whole other set of codes only available in ANSI (extended ASCII / high-ASCII) and Unicode, and as far as I know those have no representation in the control pictures range. My understanding is that all the remaining pictures ␢ ␣ ␤ ␥ ␦ in U+2422 to U+2426 are general-purpose representations of hidden characters for CUI apps that wish to show simpler symbols, for example for a text editor aimed at less technical people. "NL" is documented by Unicode as being a symbol for New Line, which isn't the same as the NEL (Next Line) from C1 control characters. NEL (Next Line), on the other hand, is a C1 control character for compatibility with IBM mainframes' EBCDIC encoding, which had something similar to LF but not exactly, and therefore needed a separate control character for text exchange. |
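To make the direct mapping concrete, here is a minimal Python sketch of it. The function name `to_control_pictures` is made up for illustration and is not part of any tool discussed in this thread. It assumes the layout described above (U+0000...U+001F map to U+2400...U+241F by adding 0x2400) plus DEL mapping to U+2421 as shown in the Unicode chart; C1 characters such as NEL are left untouched since they have no picture in this block.

```python
CONTROL_PICTURES_BASE = 0x2400  # U+2400 SYMBOL FOR NULL
SYMBOL_FOR_DELETE = "\u2421"    # picture for DEL (U+007F)


def to_control_pictures(text: str) -> str:
    """Replace C0 control characters and DEL with their Control Pictures.

    Hypothetical helper for illustration only. C1 controls (U+0080..U+009F),
    including NEL, pass through unchanged because the block has no glyphs
    for them.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x1F:
            # C0 controls map one-to-one onto U+2400..U+241F.
            out.append(chr(CONTROL_PICTURES_BASE + cp))
        elif cp == 0x7F:
            out.append(SYMBOL_FOR_DELETE)
        else:
            out.append(ch)
    return "".join(out)
```

For example, `to_control_pictures("\x1b[31mred\x1b[0m\r\n")` returns `␛[31mred␛[0m␍␊`, which is the kind of output a terminal debugging view could show if the font has legible glyphs for the block.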
Wow, I'm really liking how these are shaping up here in #219 (comment). You're right, they felt a little cramped in the first pass you made 😄 |
Thanks @DHowett! Do we dare put the graphical versions as default? :D |
Aw, alas. I use them for debugging Terminal, so I would need to learn a whole new language if we do that! 😁 I'm not against it, for sure. haha |
I’ll give you a sample version to mull over ;) |
@aaronbell @DHowett Hey! No fair! Why the microsofties-only version?! 😭 Looking at #219 (comment), it seems 2-letter symbols are going to be more readable than 3-letter ones. (I still like the graphic symbols, but understand many users might not want to have to learn another set of symbols.) Since the 2-letter abbreviations for C0 codes are less common but nonetheless standardized (https://en.wikipedia.org/wiki/ISO_2047), what about a variant that uses the 2-letter versions for the whole set? This would make their size more consistent than a mix of 3-letter and 2-letter, and probably would help readability at small sizes. Users used to the more common 3-letter versions can probably figure out the shortened ones without too much trouble. |
@PhMajerus Don't worry, I won't exclude you :). |
@PhMajerus @DHowett Alright, sorry for the delay :) Here's a demo version of the font, named Cascadia CTRL: A couple of notes:
IIRC, Windows Terminal (and I think VSCode) let you set stylistic sets. Give it a try and see what you think! |
@aaronbell Thanks for doing all 3 variants, they all look great and I feel each have their benefits. I'm really curious to try the all 2 letters variant in Terminal as the 3 letter ones are a bit too small for me at my usual font size. |
Implementing them would be a tricky prospect, at least in an easily discoverable way. Not every font will offer alternate Stylistic Sets, so unless Cascadia Code is treated as a special case, with extra settings showing up when it's the chosen font, the only way to implement it would be to allow setting a stylistic set for all fonts that include them. And then, these sets don't include names, so showing the user what these stylistic sets are used for would be impossible. |
@mdtauk My putting them in stylistic sets is purely for testing purposes—so that they are somewhat accessible for y'all to take a look at. I expect that for the final version, we'll lock to one of the three approaches. |
@PhMajerus - Here's the setup for VSCode at least: microsoft/vscode#80577 |
That is fair enough, but there is little reason not to include stylistic sets for the sake of a more complete typeface - even if Terminal doesn't provide a user facing way to change it |
@aaronbell Ah, sorry, I didn't realize stylistic sets were selected by ligatures options. Thanks for the info. After trying all 3, first: they all look great! I really wanted to try to get used to the graphical symbols (ss20), hoping they would be the most readable and faster to scan through once used to them, but testing them mixed with other characters, it quickly becomes apparent that they probably only work well when shown in ASCII-only strings, as then the set of other characters present is very limited and does not include any graphical character that could be confused with them. I find the 2 letters (ss19) variant really good for readability, and while it will require some time to feel natural as we're more used to the 2-3 letters, I would probably use the 2 letters one for Terminal. This is probably very dependent on font size and DPI, but testing on both a 1920x1080 monitor at 100% and a Surface Book 2 at 200%, the 3 letters ones are slower to read because they end up less well-defined. This could still change with hinting though. I think using the 2 letters variant as the default could provide better readability and discoverability. |
Thanks for the review @PhMajerus! Your experience aligns pretty well with what I suspected might be the case. The graphical variants are fun, but are difficult to parse in real life scenarios. I would be tempted to leave them there, but I think I have to be honest with myself that the likelihood of anyone using them when they're hidden behind OpenType is quite low—even modern coding / terminal environments don't necessarily support stylistic sets, let alone anything older. Between the 2-letter and mixed settings, it makes sense that the 2 letter variant would render more clearly. With proper hinting they'll perform markedly better with clearer differentiation between the letters, whereas the mixed setting will likely only perform similarly, or slightly better. The problem is that there just aren't sufficient pixels to create definition in the three-stacked form—as you said, folks are likely to switch fonts or increase point size to make them out. For similar reasons as the graphical variants, I think I'd skip providing the mixed setting (or wasting a stylistic set slot on them), and just provide the 2 letter abbreviations. I think folks will be able to get used to it pretty quick. @DHowett What do you think? Would you be open to using the 2 letter variants as default? |
Based on @PhMajerus' screenshots above, I would absolutely be open to using the 2-letter variants as a default. I wish I'd given terminal the ability to choose stylistic sets. I like them all. 😄 I'll kick the tires myself, as well. Thanks for putting this together. |
I love the graphical variants |
This is a significant update to Cascadia Code, including a large number of bug fixes as well as updating the font to support the Fira Code v5 ligature set. This update supersedes PR #373.

Closes #262 - ⏎ added
Closes #264 - additional codepoints for control characters added
Closes #281 - `!:` and `!.` added
Closes #290 - `/\` and `\/` added
Closes #301 - `??=` added
Closes #324 - ℞ added
Closes #327 - `<:>` and other variants implemented via the `calt` refactoring
Closes #359 - house added
Closes #371 - Added x-height instruction into ttfautohint to control the height of the lowercase.
Closes #375 - Completely redesigned quote marks for better recognition
Closes #377 - updated hinting to achieve more consistent results
Closes #381 - increased height of thetamod
Closes #382 - reduced the width of the hooklefts
Closes #383 - updated heights on esh, glottalstop, glottalstopreversed
Closes #384 - tweaked hinting a little bit. Maybe it'll help :)
Closes #386 - added remaining soft-dotting
Closes #392 - changed designs of angled quotes (they are now round)
Closes #394 - changed former `~=` symbol to a simpler component-based version. Should be less confusing now for Lua / Matlab users.
Closes #395 - makes the underline thicker based on font weight
Closes #400 - increased size of degree

Closes #219
The full control pictures block has been added (u+2400 to u+2426). For purposes of rendering, the two letter abbreviations have been used instead of the standard three letter abbreviations:
Additionally, ss20 includes the oft-unused graphical representations of these codepoints (for fun!):

Closes #276 (infinite arrows)
Full support for Fira Code's current ligature set (with a few exceptions). Now featuring infinite arrows!!! This involved a full refactoring of the `calt` feature. For those interested, it now uses forward-looking substitutions instead of backward-looking substitutions and progressive substitution to reduce code. This also required some redesigning of the greater / lesser related ligatures. Please note, I have also removed all the obsolete ligatures now covered by the arrows code.

Closes #329
There was a mismatch in the font's postscript naming conventions that was corrected. Should now render all weights in Word. **Note** there is apparently an additional bug in Mac Word's implementation of variable fonts; a fix should be available in an update mid-Feb.

* Not listed – Reworked the hints for the mod and superscript glyphs so that they're bottom-up rather than top-down. This allows for better bottom alignments.

Aside from the above changes, this version also includes many other small updates including spacing, outline quality improvements, and hinting fixes.
Hey @aaronbell and @DHowett, sorry for posting in a closed issue, but I thought you might enjoy this, and it provides even more validation for the choice made if anyone happens to read through this thread. Looking at some documentation on the HP 264x terminals series (from the 1970s), I found out they also used the 2-characters representation for control pictures: BTW, after two years with the 2-characters variant, I'm really happy with the readability, and I keep seeing legacy systems where they made the same choice back in the 1970s and 1980s. |
We talked about control characters before, and how they are interpreted by the console or the terminal instead of printing characters.
When working in a terminal, it is sometimes helpful to visualize these in-band control sequences. Visual Studio Code even does it when showing files, by showing small "ESC", "SUB", ... when the "Render Control Characters" option is enabled.
Unicode had the same idea, and includes a block of Control Pictures (U+2400 to U+2426). These are designed to be able to represent the control characters on a terminal screen:
␀␁␂␃␄␅␆␇␈␉␊␋␌␍␎␏␐␑␒␓␔␕␖␗␘␙␚␛␜␝␞␟␠␡␢␣␤␥␦
(https://en.wikipedia.org/wiki/Control_Pictures)
When available, these can be used by CUI apps to provide a visual representation of these special characters, exactly like Visual Studio Code does.
Adding these 39 glyphs would make it possible for utilities such as hexdump to show them in text representations, which is much more helpful than having 34 of the 256 values show up as generic dots. It would even make it possible for a CUI text editor to provide the "Render Control Characters" option.
Windows Terminal currently falls back to another font to render these, but they are tiny and impractical for use in a terminal.
Below is a sample of a hexdump function showing the contents of cmd.exe with high-ascii and control characters (using the font fallback):
And the Ubuntu hexdump command showing the same file, with dots for high-ascii and control characters (so only 96 values out of 256 have character representations). This one is not using Cascadia, but shows the limitation when these characters are not available.
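For readers who want to try the idea, here is a rough Python sketch of a hexdump whose text column uses Control Pictures instead of dots for control characters. It is not the function used for the screenshot above. The names `picture_for` and `hexdump` are hypothetical, and showing bytes 0x80 and above as dots is an assumption made so that no particular code page has to be chosen.

```python
def picture_for(byte: int) -> str:
    """Return a one-character display form for a byte value (illustrative)."""
    if byte <= 0x1F:
        # C0 controls: U+0000..U+001F map directly to U+2400..U+241F.
        return chr(0x2400 + byte)
    if byte == 0x7F:
        return "\u2421"  # SYMBOL FOR DELETE
    if byte < 0x7F:
        return chr(byte)  # printable ASCII, including space
    return "."  # assumption: high bytes stay as dots, no code page applied


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex column, and a control-picture text column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Pad the hex column so the text column lines up on a short last row.
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        text_part = "".join(picture_for(b) for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(lines)
```

Calling `print(hexdump(open(path, "rb").read()))` on any binary file (with `path` being whatever file you want to inspect) gives a listing in which NUL, ESC, CR, LF and the other control bytes are distinguishable at a glance, provided the terminal font renders the block legibly.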