Noto Serif Hentaigana METADATA.pb #8693

Merged
merged 1 commit into from
Dec 6, 2024

Conversation

nathan-williams
Member

Specify primary language for Noto Serif Hentaigana (ja_Hira).


github-actions bot commented Dec 5, 2024

FontBakery report

fontbakery version: 0.13.0a6

Check results

[18] NotoSerifHentaigana[wght].ttf
⚠️ WARN Any CJK font should contain at least a minimal set of 150 CJK characters.
  • ⚠️ WARN

    There are only 2 CJK glyphs when there needs to be at least 150 in order to support the smallest CJK writing system, Kana.
    The following CJK glyphs were found:
    ['uni3099', 'uni309A']
    Please check that these glyphs have the correct unicodes.


    [code: cjk-not-enough-glyphs]
⚠️ WARN Check math signs have the same width.
  • ⚠️ WARN

    The most common width is 559 among a set of 6 math glyphs.
    The following math glyphs have a different width, though:

Width = 310: minus

[code: width-outliers]
⚠️ WARN Combined length of family and style must not exceed 32 characters.
  • ⚠️ WARN

    Name ID 6 'NotoSerifHentaigana-ExtraLight' exceeds 27 characters. This has been found to cause problems with PostScript printers, especially on Mac platforms.


    [code: nameid6-too-long]
⚠️ WARN Check there are no overlapping path segments
  • ⚠️ WARN

    The following glyphs have overlapping path segments:

* u1B0B5 (U+1B0B5): L<<344.0,680.0>--<339.0,700.0>> has the same coordinates as a previous segment.

* he3kana_semivoicedcombkana: L<<344.0,680.0>--<339.0,700.0>> has the same coordinates as a previous segment.

* he3kana_voicedcombkana: L<<344.0,680.0>--<339.0,700.0>> has the same coordinates as a previous segment.

[code: overlapping-path-segments]

⚠️ WARN Validate size, and resolution of article images, and ensure article page has minimum length and includes visual assets.
  • ⚠️ WARN

    Article page is too short!


    [code: length-requirements-not-met]

  • ⚠️ WARN

    Article page lacks visual assets.


    [code: missing-visual-asset]

⚠️ WARN Check for codepoints not covered by METADATA subsets.
  • ⚠️ WARN

    The following codepoints supported by the font are not covered by
    any subsets defined in the font's metadata file, and will never
    be served. You can solve this by either manually adding additional
    subset declarations to METADATA.pb, or by editing the glyphset
    definitions.

  • U+02D8 BREVE: try adding one of: canadian-aboriginal, yi
  • U+02D9 DOT ABOVE: try adding one of: canadian-aboriginal, yi
  • U+02DB OGONEK: try adding one of: canadian-aboriginal, yi
  • U+0302 COMBINING CIRCUMFLEX ACCENT: try adding one of: cherokee, math, coptic, tifinagh
  • U+0306 COMBINING BREVE: try adding one of: old-permic, tifinagh
  • U+0307 COMBINING DOT ABOVE: try adding one of: syriac, tifinagh, canadian-aboriginal, coptic, tai-le, todhri, malayalam, math, old-permic, hebrew, duployan
  • U+030A COMBINING RING ABOVE: try adding one of: syriac, duployan
  • U+030B COMBINING DOUBLE ACUTE ACCENT: try adding one of: cherokee, osage
  • U+030C COMBINING CARON: try adding one of: cherokee, tai-le
  • U+0312 COMBINING TURNED COMMA ABOVE: try adding math
  • 3 more.

Use -F or --full-lists to disable shortening of long lists.

Or you can add the above codepoints to one of the subsets supported by the font: kana-extended, latin, latin-ext, menu

[code: unreachable-subsetting]
⚠️ WARN Shapes languages in all GF glyphsets.
  • ⚠️ WARN

    GF_TransLatin_Arabic glyphset:

| WARN messages | Languages |
| --- | --- |
| Some auxiliary glyphs were missing: Ŀ, ŀ | ca_Latn (Catalan) |
| Some auxiliary glyphs were missing: ſ | de_Latn (German) and fr_Latn (French) |
| Some auxiliary glyphs were missing: Ŧ, ŧ, Ʒ, Ǥ, ǥ, Ǯ, ǯ, ʒ | fi_Latn (Finnish) |
| Some auxiliary glyphs were missing: Ŧ, ŧ | nb_Latn (Norwegian Bokmål) |
| Some auxiliary glyphs were missing: Ĳ, ĳ | nl_Latn (Dutch) |

[code: warning-language-shaping]
⚠️ WARN Ensure dotted circle glyph is present and can attach marks.
  • ⚠️ WARN

    No dotted circle glyph present


    [code: missing-dotted-circle]
⚠️ WARN Ensure soft_dotted characters lose their dot when combined with marks that replace the dot.
  • ⚠️ WARN

    The dot of soft dotted characters used in orthographies must disappear in the following strings: į̀ į́ į̂ į̃ į̄ į̌

The dot of soft dotted characters should disappear in other cases, for example: į̆ į̇ į̈ į̊ į̋ į̒ į̦̀ į̦́ į̦̂ į̦̃ į̦̄ į̦̆ į̦̇ į̦̈ į̦̊ į̦̋ į̦̌ į̦̒ į̧̀ į̧́

Your font fully covers the following languages that require the soft-dotted feature: Lithuanian (Latn, 2,357,094 speakers), Dutch (Latn, 31,709,104 speakers), Northern Tutchone (Latn, 85 speakers), Southern Tutchone (Latn, 65 speakers).

Your font does not cover the following languages that require the soft-dotted feature: Aghem (Latn, 38,843 speakers), Bafut (Latn, 158,146 speakers), Ikwere (Latn, 717,000 speakers), Bete-Bendi (Latn, 100,000 speakers), Yala (Latn, 200,000 speakers), Heiltsuk (Latn, 300 speakers), Dan (Latn, 1,099,244 speakers), Abua (Latn, 25,000 speakers), Western Krahn (Latn, 97,800 speakers), Kpelle, Guinea (Latn, 622,000 speakers), Koonzime (Latn, 40,000 speakers), Longto (Latn, 5,000 speakers), Ebira (Latn, 2,200,000 speakers), Dii (Latn, 71,000 speakers), Ejagham (Latn, 120,000 speakers), Navajo (Latn, 166,319 speakers), Basaa (Latn, 332,940 speakers), Ma’di (Latn, 584,000 speakers), Igbo (Latn, 27,823,640 speakers), Han (Latn, 6 speakers), Keliko (Latn, 63,000 speakers), Kaska (Latn, 125 speakers), Mfumte (Latn, 79,000 speakers), Mango (Latn, 77,000 speakers), Ekpeye (Latn, 226,000 speakers), Ukrainian (Cyrl, 29,273,587 speakers), Avokaya (Latn, 100,000 speakers), Kom (Latn, 360,685 speakers), Teke-Ebo (Latn, 260,000 speakers), Ngbaka (Latn, 1,020,000 speakers), Gulay (Latn, 250,478 speakers), Vute (Latn, 21,000 speakers), Lugbara (Latn, 2,200,000 speakers), Makaa (Latn, 221,000 speakers), South Central Banda (Latn, 244,000 speakers), Nzakara (Latn, 50,000 speakers), Zapotec (Latn, 490,000 speakers), Sar (Latn, 500,000 speakers), Nateni (Latn, 100,000 speakers), Mundani (Latn, 34,000 speakers), Fur (Latn, 1,230,163 speakers), Ijo, Southeast (Latn, 2,471,000 speakers), Cicipu (Latn, 44,000 speakers), Belarusian (Cyrl, 10,064,517 speakers), Southern Kisi (Latn, 360,000 speakers).

[code: soft-dotted]
⚠️ WARN Ensure fonts have ScriptLangTags declared on the 'meta' table.
  • ⚠️ WARN

    This font file does not have a 'meta' table.


    [code: lacks-meta-table]
ℹ️ INFO Font follows the family naming recommendations?
  • ℹ️ INFO

    Font does not follow some family naming recommendations:

| Field | Value | Recommendation |
| --- | --- | --- |
| Family Name | Noto Serif Hentaigana ExtraLight | exceeds max length (31) |

[code: bad-entries]
ℹ️ INFO List all superfamily filepaths
  • ℹ️ INFO

    ofl/notoserifhentaigana


    [code: family-path]
ℹ️ INFO EPAR table present in font?
ℹ️ INFO Show hinting filesize impact.
  • ℹ️ INFO

    Hinting filesize impact:

| | ofl/notoserifhentaigana/NotoSerifHentaigana[wght].ttf |
| --- | --- |
| Dehinted Size | 443.9kb |
| Hinted Size | 444.0kb |
| Increase | 24 bytes |
| Change | 0.0 % |

[code: size-impact]
ℹ️ INFO Font contains all required tables?
  • ℹ️ INFO

    This font contains the following optional tables:

- loca

- prep

- GPOS

- GSUB

- gasp

[code: optional-tables]

ℹ️ INFO METADATA.pb: Validate family.minisite_url field.
  • ℹ️ INFO

    Please consider adding a family.minisite_url entry.


    [code: lacks-minisite-url]
ℹ️ INFO Is the Grid-fitting and Scan-conversion Procedure ('gasp') table set to optimize rendering?
  • ℹ️ INFO

    These are the ppm ranges declared on the gasp table:

PPM <= 65535: flag = 0x0F - Use grid-fitting - Use grayscale rendering - Use gridfitting with ClearType symmetric smoothing - Use smoothing along multiple axes with ClearType®

[code: ranges]
ℹ️ INFO Font has old ttfautohint applied?
  • ℹ️ INFO

    Could not detect which version of ttfautohint was used in this font. It is typically specified as a comment in the font version entries of the 'name' table. Such font version strings are currently: ['Version 1.000']


    [code: version-not-detected]
[1] Family checks
ℹ️ INFO Check axis ordering on the STAT table.
  • ℹ️ INFO

    None of the fonts lack a STAT table.

And these are the most common STAT axis orderings:
('wght', 1)

[code: summary]

Summary

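| 💥 ERROR | ☠ FATAL | 🔥 FAIL | ⚠️ WARN | ⏩ SKIP | ℹ️ INFO | ✅ PASS | 🔎 DEBUG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 10 | 40 | 9 | 184 | 0 |
| 0% | 0% | 0% | 4% | 16% | 4% | 76% | 0% |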

Note: The following loglevels were omitted in this report:

  • SKIP
  • PASS
  • DEBUG

@nathan-williams merged commit 8b230b2 into main on Dec 6, 2024
7 of 8 checks passed
@nathan-williams deleted the noto-serif-hentaigana branch on December 6, 2024 00:05
@emmamarichal changed the title from "Update METADATA.pb" to "Noto Serif Hentaigana METADATA.pb" on Dec 6, 2024
@emmamarichal
Collaborator

emmamarichal commented Dec 13, 2024

@nathan-williams @simoncozens @aaronbell

(all the screenshots come from the dev-sandbox)

I tested Hentaigana: no more Latin, and the new sample text is there: great! 🎉
But I just want to confirm some things with you:

1-

The two other fonts that contain "Hentaigana" in the name have a different sample text and look very different from the Noto version. The Noto string doesn't seem to be supported by the Yuji family. But maybe there are several kinds of Hentaigana? (It's a very naive question, I know nothing about Japanese fonts.)

Screenshot 2024-12-13 at 10 10 00 Screenshot 2024-12-13 at 10 19 47

2-

Noto Serif Hentaigana only appears in the font list when Japanese Hiragana is selected, but it renders as tofu.

Screenshot 2024-12-13 at 10 26 51

@simoncozens
Collaborator

OK, so background: Japanese has three writing systems - (Chinese) kanji, hiragana and katakana. Hiragana and katakana are phonetic (syllabic) alphabets, and their shapes are derived by simplifying a kanji character with the same sound. In modern Japanese, the kanji used to represent each sound is standardised, and so there is a standard way of writing each syllable. However, in Classical Japanese, people simplified different kanji to represent the same sound, and so there were many different ways of writing each syllable. These different, non-standard ways are called "hentaigana" ("kana alternates").

Yuji Hentaigana uses the normal Unicode codepoints for Hiragana, but instead of the modern shapes, the designers chose one "kana alternate" for each codepoint; it's kind of like Montserrat Alternates - a font full of stylistic alternates for normal characters. But which alternate did they choose? That's kind of up to the Yuji designers.

Some users want to make sure they are using a particular alternate form for each kana so that they can encode Classical Japanese manuscripts the way they were written. ("I need to write the sound 'ni' as a simplified 兒, not a simplified 仁".) So Unicode encoded a separate set of codepoints: instead of HIRAGANA LETTER NI, you get HENTAIGANA LETTER NI-1 DERIVED FROM U+4E39, HENTAIGANA LETTER NI-2 DERIVED FROM U+4E8C, etc.
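The difference between the single modern codepoint and the per-variant hentaigana codepoints can be inspected with Python's standard `unicodedata` module. This is a minimal sketch, assuming a Python build whose Unicode database already includes the hentaigana characters; `hentaigana_variants` is a helper name made up for illustration, and the scanned range U+1B002..U+1B11E is an assumption about where the hentaigana letters are encoded.

```python
import unicodedata

# Assumed extent of the hentaigana letters (Kana Supplement / Kana Extended-A).
HENTAIGANA_RANGE = range(0x1B002, 0x1B11F)


def hentaigana_variants(syllable):
    """Return (codepoint, name) pairs named 'HENTAIGANA LETTER <syllable>-...'."""
    prefix = f"HENTAIGANA LETTER {syllable}-"
    found = []
    for cp in HENTAIGANA_RANGE:
        # Fall back to "" if this Python's Unicode data has no name for cp.
        name = unicodedata.name(chr(cp), "")
        if name.startswith(prefix):
            found.append((cp, name))
    return found


# The modern kana has exactly one codepoint.
modern_ni = "\u306b"
print(f"U+{ord(modern_ni):04X}", unicodedata.name(modern_ni))

# Each classical variant gets a codepoint of its own.
for cp, name in hentaigana_variants("NI"):
    print(f"U+{cp:04X}", name)
```

As far as I can tell, the formal character names carry only the syllable and a variant number; the kanji each variant derives from is recorded in the Unicode code charts rather than in the name itself.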

This is why in your question, Noto Serif Hentaigana has a different sample text - it uses completely different Unicode codepoints. Actually the sample text repeats the same sounds but shows off different alternates ("I-1 RO-1 HA-1 NI-1, I-2 RO-2 HA-2 NI-2, ...")

Because Noto Serif Hentaigana is designed to be used with Noto CJK, it does not itself include any HIRAGANA codepoints. It just contains the hentaigana alternates. So actually this PR is wrong :-) and that is why you get tofu when you choose Japanese Hiragana - it doesn't actually support those codepoints at all.
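One way to confirm this kind of claim is to count how many hiragana letters and how many hentaigana letters a font's cmap actually maps. The sketch below is illustrative only: the two ranges are assumptions about where those letters sit in Unicode, `kana_coverage` and `font_codepoints` are hypothetical helper names, and the sample set is made up to resemble the situation in the FontBakery report (only U+3099 and U+309A from the Hiragana block), not read from the real font.

```python
# Assumed ranges: hiragana letters U+3041..U+3096, hentaigana U+1B002..U+1B11E.
HIRAGANA_LETTERS = range(0x3041, 0x3097)
HENTAIGANA_LETTERS = range(0x1B002, 0x1B11F)


def kana_coverage(codepoints):
    """Count how many hiragana and hentaigana letters appear in `codepoints`."""
    cps = set(codepoints)
    return {
        "hiragana": sum(1 for cp in HIRAGANA_LETTERS if cp in cps),
        "hentaigana": sum(1 for cp in HENTAIGANA_LETTERS if cp in cps),
    }


def font_codepoints(path):
    """Read the mapped codepoints from a font file (needs the fontTools package)."""
    from fontTools.ttLib import TTFont  # imported lazily; third-party dependency

    font = TTFont(path)
    return set(font.getBestCmap())


# Illustrative stand-in for a hentaigana-only font: the two combining
# sound marks plus every codepoint in the assumed hentaigana range.
sample = {0x3099, 0x309A} | set(HENTAIGANA_LETTERS)
print(kana_coverage(sample))
```

With a real file, `kana_coverage(font_codepoints("NotoSerifHentaigana[wght].ttf"))` would give the actual counts; a hiragana count of zero would match the tofu seen under the Japanese Hiragana filter.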

@emmamarichal
Collaborator

Thank you very much @simoncozens for all these explanations! I'll save it for later ;)

@chrissimpkins
Collaborator

Deleted my previous comment. It seems this tofu is actually WAI (working as intended). This family does not support the Hiragana encodings, only the Hentaigana extension encodings, if I understand Simon's comment correctly. Is the solution here to simply remove the Hiragana language definition so that the family does not show up with a Hiragana drop-down filter?

@simoncozens
Collaborator

Correct.

Projects
Status: In Sandbox