[css-fonts] Specify what generic font family maps to nastaliq #4397

Closed
hsivonen opened this issue Oct 4, 2019 · 37 comments
Labels
  • Closed as Question Answered: Used when the issue is more of a question than a problem, and it's been answered.
  • css-fonts-4: Current Work
  • i18n-alreq: Arabic language enablement
  • i18n-tracker: Group bringing to attention of Internationalization, or tracked by i18n but not needing response.
  • Needs Testcase (WPT)

Comments

@hsivonen
Member

hsivonen commented Oct 4, 2019

The section Generic font families does not say how to generically request a nastaliq font for the Arabic script. Anecdotally, the distinction of having a specifically nastaliq font is important enough in the Urdu context that Microsoft and Google bundle a nastaliq font with their operating systems for Urdu.

It seems to me that it would be good to explicitly specify which generic keyword means nastaliq, either by specifying that the cursive family means nastaliq for the Arabic script or by minting a new keyword, as was done for fangsong. (I don't know which option is more appropriate, but I'm guessing that using the existing cursive keyword should be sufficient.)

@frivoal frivoal added i18n-alreq Arabic language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. labels Oct 9, 2019
@frivoal
Collaborator

frivoal commented Oct 9, 2019

I wonder whether there's much preexisting use of the cursive value for Arabic script content, and if so, whether authors of pages that use that have been expecting a nastaliq font already, or something else.

Another consideration: when nastaliq text contains bits of text in non-Arabic scripts, what sort of fonts do they typically use? Certainly authors could select and style these separately, but if they don't, it's interesting to know whether a cursive Latin / Chinese / ??? font would generally be a good match or not.

Relatedly, if we end up not using cursive and minting a new keyword, we should also give some thought to what sort of fonts that keyword ought to be mapping to for non-Arabic text.

@tabatkins
Member

@drott @r12a ?

@fantasai
Collaborator

@behdad

@heycam
Contributor

heycam commented Oct 16, 2019

@jfkthame ?

@litherum
Contributor

I discussed this with some of my colleagues internally, and we would prefer a new keyword. Following the model of fangsong sounds like a good path forward.

@svgeesus
Contributor

svgeesus commented Oct 16, 2019

@shervinafshar @ntounsi @behnam your input on a new CSS generic font family nastaliq would be welcome.

Related to w3c/alreq#9

@svgeesus
Contributor

@jfkthame
Contributor

I'm not sure creating more generic family names is the best solution here (I have my doubts about fangsong too, TBH, though I'm less familiar with Chinese writing traditions) for something that is so specific to a particular script and language(s). It's completely unclear what these names ought to mean when applied to any content other than the particular script they're targeting.

ISTM it would be appropriate for a UA to map the serif and/or sans-serif generics to Nastaliq faces in the case where the script is Arabic and the content language is tagged as Urdu, or other languages that are known to prefer this style (e.g. Punjabi, Saraiki, Balochi, Pashto -- the latter two perhaps only when the region is Pakistan, I'm not sure of the conventions in Iran or Afghanistan).

Mapping cursive to a Nastaliq face might be useful for languages like Arabic or Persian where the serif and sans-serif generics typically map to simplified Naskh-like faces, although there are other reasonable cursive mappings that could also be considered, like Ruq'ah or Diwani faces. Authors who want full control of the script style should really be providing appropriate webfonts in most cases.

@tabatkins
Member

Yeah, fangsong was defined as a separate keyword only because it was an "intermediate" style between two existing styles that already had reasonable mappings to the existing keywords (serif and cursive); there simply wasn't a keyword around that we could map it to.

I'm far from an expert, but it looks like nastaliq isn't in the same position, and we could map it to an existing keyword.

@astearns astearns removed the Agenda+ label Oct 16, 2019
@shervinafshar

Although I understand the need for handling Urdu, I think putting the burden of that challenge on this property stretches the definition of the property and makes it needlessly over-specific. Is Nastaliq a "generic font family" of Arabic script fonts? Possibly, but so are Kufi and Naskh with more immediate and widespread use-cases.

Ultimately, it's up to CSS spec editors to decide whether they are comfortable with this expansion of scope for "generic font family".

In case this is planned to be added, I prefer it to clearly be named as arabic-nastaliq; future-proofing for arabic-naskh and arabic-kufi.

@kojiishi
Contributor

kojiishi commented Oct 17, 2019

@jfkthame's comment raised an interesting question for me. So are we positive to accept language/script/writing-system specific generic font families? I would then have 3-5 Japanese keywords that would be nice to add, and I think we should be prepared for requests to add a few tens of such keywords from other scripts too. We might need some guidelines on what to accept and what not to.

As for what such a generic family should do for content in other scripts, we learned in the discussion of the system-ui-* families that aggressive mapping may not be what authors want. I think we can take the same policy here: do not map, and allow authors to define the next fallback font.

@kojiishi
Contributor

If we were to map to cursive, I'd like to have a recommendation from i18n/alreq. I see multiple cursive styles in alreq, and I'm not sure which one would be best. @r12a

@jfkthame
Contributor

> So are we positive to accept language/script/writing-system specific generic font families?

Personally, I am not at all positive about that. It substantially changes the meaning of "generic" font families, in a direction that I think we'd come to regret.

What I am positive about is browsers using script/language/region information to inform their mapping of the existing generic keywords to appropriate available fonts, so that (for example) serif might map to a Naskh face for Arabic-script characters in content where lang=ar, but a Nastaliq one when lang=ur.
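From the author's side, a similar per-language split can already be approximated with the :lang() selector. This is only a minimal sketch, assuming the two Noto families named here are installed or supplied as webfonts; they are illustrative choices, not something this thread prescribes.

```css
/* Arabic-language content: try a Naskh face first, then whatever the
   UA maps the serif generic to. (Family name is an assumption.) */
:lang(ar) {
  font-family: "Noto Naskh Arabic", serif;
}

/* Urdu content: try a Nastaliq face first, then the same generic.
   (Family name is an assumption.) */
:lang(ur) {
  font-family: "Noto Nastaliq Urdu", serif;
}
```

The difference from the proposal above is that here the author names concrete fonts; the suggestion in this comment is that the UA itself would resolve serif differently depending on the content language.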

@jfkthame
Contributor

ISTM that adding nastaliq as a generic font-family would not be much different from adding further names associated with particular styles of Latin typeface, which authors could reasonably want to use in order to get a certain "look" without knowing the exact names of available fonts. Are we prepared for font-family: humanist or antiqua or egyptian or fraktur? And there are probably equivalent (but quite separate) type design traditions in other scripts that would have an equal claim for recognition in CSS. I don't think we should start down this path.

@Crissov
Contributor

Crissov commented Oct 17, 2019

We should never have had <generic-family>: serif | sans-serif | cursive | fantasy | monospace but rather something like:

<generic-family>: <ends> || <stem> || <body> || <eyes> || <form> || <tool> || <mood>
<ends>: serif | sans-serif | slab-serif | egyptienne | swash | flourish | flat
<stem>: antiqua | grotesque | roman | equal | up-stroke | down-stroke | curved | straight | narrow | thick | bulky | outline | hollow | shadow | tall | petite 
<body>: monospace | fixed-width | half-width | full-width | proportional | flexible 
<eyes>: open | closed | filled | small | large | square | round 
<form>: cursive | current | connected | fluid | calligraphic | artistic | rounded | squared | comic | typeset | uncial | broken | blackletter | segmented | geometric | historic | futuristic | fancy | stiff | masculine | feminine | poster | headline | pictographic 
<tool>: nib | pen | pencil | brush | chalk | crayon | stencil | stamp | stylus | typewriter | lcd | chisel | sticker | pixel | finger | foot | paw | body 
<mood>: neutral | angry | soft | sweet | romantic | careful | aggressive | loud | playful | funny | sarcastic | ironic | depressive | happy | wild | tame | serious | mechanic | shocked | frightened | dreamy | tired | explicit | poetic | childish 

… with saner values, of course. I brainstormed these just to illustrate the point.

@litherum
Contributor

litherum commented Oct 17, 2019

@kojiishi

> We might need some guidelines on what to accept and what not to.

How about, for a start:

  • Multiple fonts exist that could be mapped to it
  • The major operating systems each contain at least one font that could be mapped to it
  • Demonstrated user desire for it
  • ??? More here?

@kojiishi
Contributor

@litherum looks great to me, thank you.

How about also requiring some authorities for i18n families? My thinking is like Unicode requires new code points being authorized by country standards. Maybe i18n wg can do this for us? @r12a

@jfkthame
Contributor

> We should never have had <generic-family>: serif | sans-serif | cursive | fantasy | monospace but rather something like: [....]

This basically looks like an attempt to reinvent PANOSE. I don't think that's ever really caught on, has it? (And I'm not convinced it belongs in CSS. For authors/designers who want that degree of control, @font-face is the answer.)

@Crissov
Contributor

Crissov commented Oct 18, 2019

@jfkthame My point was that the current generic font families are not semantically mutually exclusive. Something simple, e.g. [serif | sans-serif] || [cursive | monospace] || fantasy using the same keywords, would have been a good start to actually get useful fallback fonts.

PS: Robert Stevahn: Panose on the Web (W3C,1996), CSS 2.0 @font-face {panose-1: 0 0 0 0 0 0 0 0 0 0 /*<integer>{10}*/;}, CSS Webfonts module 2002.

@kojiishi
Contributor

+cc @himorin

@hsivonen
Member Author

@shervinafshar

> Possibly, but so are Kufi and Naskh with more immediate and widespread use-cases.

When filing this, I assumed it to be non-controversial that CSS sans-serif and serif would map to Kufi and Naskh, respectively. (The spec says this explicitly for Naskh but not for Kufi.)

@jfkthame

> ISTM it would be appropriate for a UA to map the serif and/or sans-serif generics to Nastaliq faces in the case where the script is Arabic and the content language is tagged as Urdu, or other languages that are known to prefer this style (e.g. Punjabi, Saraiki, Balochi, Pashto -- the latter two perhaps only when the region is Pakistan, I'm not sure of the conventions in Iran or Afghanistan).

That may well be better, and the phrasing of this issue as filed is over-constrained by suggesting that a generic font family be the mechanism.

The background of filing this issue was that I heard an anecdote in a Noto PR video (and believed it, considering that Windows 10 treats "Arabic" and "Arabic (Nastaliq variant)" as different font management groups) that some Urdu sites preferred bitmapping their text over having a browser use its usual Arabic font. That seemed like an unfortunate situation, so I checked what CSS says about getting Nastaliq (other than @font-face), and saw it said nothing. (So that's the "see something, file something" level of this issue.)

I am not suggesting standardizing keywords for all Arabic-script font styles or font styles generally. However, Windows 10 treating, in font management UI, "Arabic" and "Arabic (Nastaliq variant)" on the same level of distinction as "Chinese (Simplified)" and "Chinese (Traditional)" suggested that Nastaliq has special status compared to font styles in general, so it seemed worthwhile for CSS to say how to get it generically. I have no personal experience to evaluate this further.

@tabatkins
Member

Yeah, when there are reasonably obvious-to-domain-experts mappings for a given language that might not be obvious to spec editors and implementors (most of whom speak Latin-alphabet languages), I think it's reasonable to talk about that in the spec.

@r12a
Contributor

r12a commented Oct 28, 2019

[Sorry for the delayed response due to catchup after travel.]

I'm also not clear yet what the answer is here, although if i'd intervened earlier i would probably have said a number of things that @jfkthame mentioned already.

The following are some random thoughts that come to mind, without yet an overarching system to bring them together:

  1. Istm that Kufi is a font style with specific uses, and not one you'd normally use to map ordinary text to sans-serif, despite the fact that it’s certainly a simpler kind of font style than naskh.
  2. It may even make more sense to map a traditional naskh font (such as Traditional Arabic) to serif, and a more modern font (such as Tahoma) to sans-serif, as the former is more ornate than the latter, but both are in general use.
  3. However, there are also several other styles for standard Arabic. Btw, Ruq'a style fonts might also be a contender for a sans-serif font, but perhaps only in some circumstances, because the behaviour of the font is significantly different from Naskh (no kashidas used for justification).
  4. If we had a special category for nastaliq, users writing arabic in Nigeria/Mali may equally want to distinguish between ordinary maghribi-naskh styles and the distinctive African Kano style (see the example text for Hausa ajami). It's not clear to me how that would be different from the distinction between naskh and nasta’liq.
  5. Similar font styles exist in other languages, such as Khmer, which tends to distinguish between upright, slanted, and mool styles. The font you fall back to would presumably be a better fit if it matched the font style that would have been used, rather than being something completely different.
  6. I’d have thought that, for Urdu and Kashmiri, where nastaliq is clearly the preferred font style for normal text, it would be better to fall back to another nastaliq font on your system, if there is one, than to fall back to, say, a naskh font. This might be handled fairly straightforwardly for Urdu/Kashmiri languages by setting the fallback per language (as @jfkthame suggests), but what about languages where there isn’t a strong preference one way or the other? For example, Persian. If you had been using a nastaliq font for some Persian poetry which isn’t available on the system, but another nastaliq font is available, it may be better to use that rather than to fall back on, say, Tahoma. Same may be true for African ajami text in the Kano style.

The section in the fonts spec doesn’t seem to do a good job of describing the rules to be applied in order to sort fonts into the various generic styles. Sometimes form is paramount, other times function, or sometimes a slightly unconvincing and westernised attempt to mix the two. I’m inclined to think that there is a fairly high degree of arbitrariness involved in trying to fit non-Latin styles into the existing categories.

I seem to be persuading myself that having a longer and extendable list of font styles for international text would be better than trying to squeeze new font styles into the current pigeon holes. (That said, i don’t think we want to have anything like the long list produced by @Crissov.)

@behdad

behdad commented Oct 28, 2019

> ISTM that adding nastaliq as a generic font-family would not be much different from adding further names associated with particular styles of Latin typeface, which authors could reasonably want to use in order to get a certain "look" without knowing the exact names of available fonts. Are we prepared for font-family: humanist or antiqua or egyptian or fraktur? And there are probably equivalent (but quite separate) type design traditions in other scripts that would have an equal claim for recognition in CSS. I don't think we should start down this path.

Exactly this.

@jfkthame
Contributor

It's arguably rather unfortunate that we have enshrined serif and sans-serif in CSS, given that they're terms pretty strongly associated with Latin-script typographic practice (along with its close cousins such as Cyrillic and Greek), but not really meaningful for many other of the world's writing systems.

Perhaps it would have been better to have terms such as formal, informal, ornate, simplified, archaic (the exact set of values being subject to bikeshedding, naturally) that browsers could map to appropriate typographic styles for each script, without trying to shoehorn scripts with totally unrelated design traditions into terminology specific to Latin typography.

But that ship sailed long ago, I suppose. Unless we're prepared to consider deprecating the existing generics and introducing a new, parallel set of more script-agnostic values?

@tabatkins
Member

> Unless we're prepared to consider deprecating the existing generics and introducing a new, parallel set of more script-agnostic values?

Yeah, that's almost certainly a non-starter. ^_^

@r12a
Contributor

r12a commented Oct 29, 2019

> It's arguably rather unfortunate that we have enshrined serif and sans-serif in CSS, given that they're terms pretty strongly associated with Latin-script typographic practice (along with its close cousins such as Cyrillic and Greek), but not really meaningful for many other of the world's writing systems.
> ...
> But that ship sailed long ago, I suppose. Unless we're prepared to consider deprecating the existing generics and introducing a new, parallel set of more script-agnostic values?

The point i'm making is that these are probably fine to keep for Latin/Cyrillic/Greek/etc, because they represent useful alternate styles related to those scripts/languages. But we should recognise and treat them as representative of only a certain number of scripts/languages, and add the ability to indicate the alternative font styles needed for other scripts/languages.

I'm not sure there's one set of styles (eg. formal, ornate, etc.) that works for all scripts/languages. At some point we'll run into the same problem, just with a different set of labels. For example, how would one classify the Khmer styles, or the African ajami styles mentioned above into the buckets listed 2 comments earlier?

@frivoal
Collaborator

frivoal commented Oct 29, 2019

I think we should revisit this once we decide on #4442, as until then it isn't entirely clear what it actually means for a name to be declared a generic font family. That said, it seems to me that the direction we're going in there is to say that the generic font families aren't actually special in terms of behavior with regards to how they match (or not) and how they fallback. In that case, they are just commonly accepted names for generic concepts, and nothing bad or unusual happens if a browser fails to support some of them (since authors can just supply fallbacks, whether other generic families, or named local fonts, or web fonts).

If that's the case, I think we should start a separate document, outside of css-fonts, maintained as a registry rather than as a spec, where we list a larger set of generic font families than has been accepted so far, without requiring browsers to implement the whole set. I'd expect additions to the list to be mainly driven by i18n needs, and it could list things like fangsong, nastaliq, Ruq'a, or Mool, but could also have things like humanist or fraktur, or system-ui-*, or be an honorable retirement place for fantasy. I don't suggest listing everything we can think of there, as there would be no end to that list, but we could list any generic name actually implemented by a User Agent.

This would:

  • allow UAs to support new generic family names for which they see meaningful demand (without being accused of making proprietary extensions)
  • let UAs that don't agree on the need for that family to keep ignoring it (without being accused of violating the spec)
  • help UAs to coordinate with each other on the naming and meaning of new values when they do agree there's a need
  • give authors a centralized place to discover these values and their meanings.

@Crissov
Contributor

Crissov commented Oct 30, 2019

We could also introduce new properties to css-fonts that would influence the font selection (or configuration) like font-weight, font-variant, font-style and font-stretch already do. font-family would always have top priority, but if it is resolved to a generic font keyword, because all typefaces preceding it are unavailable, it becomes a pseudo shorthand setting one or more other properties. We would need something like …

  • font-serifness for serif and sans-serif,
  • font-width for monospace, which could alternatively become part of either font-stretch or font-variant-width,
  • font-flow for cursive, which could alternatively become part of font-style,
  • font-hand for nastaliq, fangsong, fraktur etc.,
  • font-mood for fantasy.

Some of those would indeed match nicely with one or more of the PANOSE 1 or 2 (Intellifont) digits. Ideally, authors could set font-family: auto and specify font requirements with the other properties to get closely matching fonts for everything everywhere.

This might not seem all that useful for Helvetica/Arial/sans-serif font stacks in a LCR-only setting, but it would enable better automatic fallbacks for unavailable webfonts and for scripts not explicitly supported by the font stack specified in the stylesheet.

PS: The panose-1 descriptor for @font-face was a mistake and its removal was the right thing to do. The PANOSE value of a font, if not taken from the file itself and if used at all, should be calculated from the other descriptors.

@AmeliaBR
Contributor

If there is a decision to support a wider array of "generic" stylistic terms, we should probably start talking about using a functional notation (like style(nastaliq) or similar). Every new generic keyword risks a compat problem with existing content that used the same keyword as an unquoted font-family name.
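To illustrate the compat concern: an unquoted identifier in font-family is read as an ordinary family name unless the UA recognizes it as a generic keyword, whereas a quoted string is always a family name. The following is a small sketch with hypothetical class names, not taken from any real page.

```css
/* Without a generic keyword of this name, this asks for an installed or
   @font-face family literally called "nastaliq". If the bare word became
   a generic keyword, the same declaration would change meaning. */
.verse {
  font-family: nastaliq, serif;
}

/* A quoted string is always treated as a family name, so this keeps its
   current meaning whatever keywords are added later. */
.verse-quoted {
  font-family: "nastaliq", serif;
}
```

A functional notation sidesteps the collision because a function token can never be parsed as a bare family name.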

@svgeesus
Contributor

svgeesus commented Dec 5, 2019

With the recent resolution of issue #4442, "Don't require browsers to always match every generic font family to a concrete font family", it is now clearer that

a) the other (new) generic font families may not map to at least one matched face
b) even if they do, there may be no match for a given codepoint.

@c933103

c933103 commented Dec 16, 2019

Let me repeat my opinion here: I don't think it is a good idea to specify nastaliq or fraktur as a new font family. They are specific ways to write their script, with different regional preferences on their usage, unlike a regular font family, which is supposed to be accessible to everyone who uses the script. They have also been identified as different scripts in the ISO 15924 standard, and thus it is already possible to specify displaying content in these different writing variants through the script subtag in language tagging, or user custom locale selection. Whether those platforms actually support them yet is another question, but support for similar variants has already been implemented on various platforms, including the use of different fonts for the same character with the same Unicode code point in Japanese/Korean/Traditional Chinese/Simplified Chinese, or with Russian/Macedonian Italic Cyrillic glyphs, or with the Mathematics/Greek character beta. It is also possible, as has been done in some tools for Chinese/Korean/Japanese, to assign different fonts based on different language/script combinations, so that the same string of characters could display with different user-selected fonts under different language-script settings, displaying a different writing style.

Another thing to consider is that, when there are enough fonts for e.g. Nastaliq on the market, undoubtedly fontmakers will also start making those fonts in different looks, and the differences among them will more closely resemble the difference between traditional font families in css. If nastaliq is to be specified and used as a font family then it would not be possible to specify those different font variants through font family unless additional font families are to be created for such different combinations.

@r12a
Contributor

r12a commented Feb 26, 2020

> Let me repeat my opinion here: I don't think it is a good idea to specify nastaliq or fraktur as a new font family. They are specific ways to write their script, with different regional preferences on their usage, unlike a regular font family, which is supposed to be accessible to everyone who uses the script. They have also been identified as different scripts in the ISO 15924 standard, and thus it is already possible to specify displaying content in these different writing variants through the script subtag in language tagging, or user custom locale selection.

Well, yes, it's possible, and lacking other alternatives i have myself resorted to that tactic. But it's not ideal. (a) If you want to change the styling of your document(s) from, say, serif to nastaliq you would probably have to change the markup in all the files using the style sheet rather than just adjust the font-family in the style sheet itself. (b) I'm not confident that people writing documents using nastaliq or other styles would appreciate, or even know to, add the script subtag everywhere they use a lang attribute. I certainly don't when writing documents that use the same style throughout for Kashmiri, Hausa, etc. (c) Script subtags don't exist for all the styles that are likely to be needed, such as Mool style in Khmer, or Kano style in African ajami, etc. We'd need to talk with the ISO folks about whether such an approach is something they'd support.
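For reference, the script-subtag tactic looks roughly like this in a style sheet. It is only a sketch: Aran is the ISO 15924 code for the Nastaliq variant of Arabic, while the family name is an illustrative assumption.

```css
/* Matches elements whose language tag is "ur-Aran" or a longer tag that
   starts with it; content tagged as plain lang="ur" is not matched.
   (Family name is an assumption.) */
:lang(ur-Aran) {
  font-family: "Noto Nastaliq Urdu", serif;
}
```

Because the rule keys on the language tag, switching a document between styles means editing the lang attributes in the markup, which is the drawback described in point (a).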

> Another thing to consider is that, when there are enough fonts for e.g. Nastaliq on the market, undoubtedly fontmakers will also start making those fonts in different looks, and the differences among them will more closely resemble the difference between traditional font families in css. If nastaliq is to be specified and used as a font family then it would not be possible to specify those different font variants through font family unless additional font families are to be created for such different combinations.

I'm not sure i agree with you there. These styles we're talking about are generally quite a high level concept, even though the style list may vary from language to language. Serif and sans-serif Latin script fonts have wide variations. Nastaliq already has fonts that vary (eg. Urdu vs Persian). But the point here it seems to me is that the system would choose one font that you have on your platform to fall back to, and if you're an Urdu user it's likely to be an Urdu font, and if you're Persian a Persian one, then at least that font would maintain the appropriate style.

Which brings me to a suggestion: Browsers currently allow you to specify preferred fonts for particular languages. Why not simply build into the preferences for a browser a list of generic styles and allow users to specify which font they'd like to associate them with? This gives much more power to the user, and makes it much less difficult for the implementer to deal with larger lists of styles+fonts.

@c933103

c933103 commented Feb 26, 2020

Thanks for your reply @r12a , after reading about it and reviewing the concept of generic font family in css-fonts module, I have decided to retract my previous comment on this issue.

@svgeesus
Contributor

svgeesus commented Oct 5, 2023

So, with recent resolution that script-specific generic fonts may not match to a locally installed font on some systems, it seems that this issue can be solved by introducing generic(nastaliq) which, if it exists, will match to a locally installed font in Nastaliq style.

@svgeesus
Contributor

svgeesus commented Oct 5, 2023

@AmeliaBR:

> If there is a decision to support a wider array of "generic" stylistic terms, we should probably start talking about using a functional notation (like style(nastaliq) or similar). Every new generic keyword risks a compat problem with existing content that used the same keyword as an unquoted font-family name.

The spec now defines a generic(ident) syntax which will be used for newly-introduced, and especially for script-specific, generics. We now have generic(fangsong) as the first such example.
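As a rough sketch of how an author might use the functional form, with a hypothetical class name: a UA that does not parse generic() treats the whole declaration as invalid and drops it, so a plain declaration placed first acts as a safety net.

```css
.classical-quote {
  /* Used by UAs that do not understand generic(): the second
     declaration is invalid for them and is ignored. */
  font-family: serif;

  /* Used by UAs that support the functional syntax. If no installed
     font matches the fangsong generic, matching continues to serif. */
  font-family: generic(fangsong), serif;
}
```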

@svgeesus
Contributor

svgeesus commented Nov 7, 2023

Drawing on I18n WG Generic font families, I propose to add generic(nastaliq) to resolve this issue. It meets the criteria suggested by @litherum, having multiple fonts, including OS fonts.
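A font stack using the proposed value might look like the sketch below. The named family is an illustrative assumption, and the ordering reflects the earlier resolution that a script-specific generic may match no installed font at all.

```css
:lang(ur) {
  /* Fallback for UAs without generic() support; they drop the
     declaration that follows as invalid. */
  font-family: "Noto Nastaliq Urdu", serif;

  /* Preferred stack: a named Nastaliq face (an assumption), then any
     locally installed Nastaliq font the UA can find, then serif. */
  font-family: "Noto Nastaliq Urdu", generic(nastaliq), serif;
}
```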

@r12a @hsivonen

@svgeesus svgeesus added the Closed as Question Answered Used when the issue is more of a question than a problem, and it's been answered. label Nov 22, 2023