Add members for localization #1101

Open
wants to merge 25 commits into
base: main
Conversation

christianliebel
Member

@christianliebel christianliebel commented Oct 9, 2023

Closes #1077, closes #1078, closes #1080, closes #1085, closes #1087, closes #1086, closes #1084, closes #1088, closes #676

This change (choose at least one, delete ones that don't apply):

  • Adds new normative requirements

Implementation commitment (delete if not making normative changes):

If change is normative, and it adds or changes a member:

Commit message:

Add members for localization

Person merging, please make sure that commits are squashed with one of the following as a commit message prefix:

  • chore:
  • editorial:
  • BREAKING CHANGE:
  • And use none if it's a normative change

💥 Error: 502 Bad Gateway 💥

PR Preview failed to build. (Last tried on Sep 5, 2024, 12:37 PM UTC).

More

PR Preview relies on a number of web services to run. There seems to be an issue with the following one:

🚨 Spec Generator - Spec Generator is the web service used to build specs that rely on ReSpec.

error code: 502

If you don't have enough information above to solve the error by yourself (or to understand to which web service the error is related to, if any), please file an issue.

@aarongustafson
Collaborator

I'd prefer adding the "_localized" suffix across the board as it makes it explicit what the member's purpose is.

@marcoscaceres
Member

marcoscaceres commented Nov 3, 2023

Yeah, I guess it does make sense to make it "_localized" as this can only be used for that.

@dmurph
Collaborator

dmurph commented Nov 3, 2023

From editors meeting:

  • We want to also localize the icons, but because it already has an s that makes it a little weird
  • Due to that, let's do *_localized for all of these fields.
  • icons_localized won't have any triplets; we'll just reuse the parsing algorithm for icons (same JSON structure). But, like the others, it will be a lang-string dictionary.
  • These can apply as well within the shortcuts item, to be able to localize those.

TPAC discussion here

@marcoscaceres
Member

Instead of defining new members, let's define a *_localized member pattern that is either a "text localizable member" or an "image-resource localizable member" (i.e., for icons or for shortcuts). That way we don't need to define the algorithms over and over again.

@marcoscaceres
Member

We already have defined localizable members so we can already reuse that.

@marcoscaceres
Member

marcoscaceres commented Nov 3, 2023

Here is what a shortcuts member might look like with some _localized members sprinkled in:

{
  "shortcuts": [
    {
      "name": "Play Later",
      "name_localized": {
        "fr": "Écouter plus tard"
      },
      "description": "View the list of podcasts you saved for later",
      "description_localized": {
        "fr": { "lang": "en", "dir": "ltr", "value": "English description because that's part of our brand." }
      },
      "url": "/play-later",
      "icons": [
        {
          "src": "/icons/play-later.svg",
          "type": "image/svg+xml"
        }
      ],
      "icons_localized": {
        "fr": [
          {
            "src": "/icons/fr/play-later.svg",
            "type": "image/svg+xml"
          }
        ]
      } 
    },
    {
      "name": "Subscriptions",
      "description": "View the list of podcasts you listen to",
      "description_localized": {
        "fr": "Consultez la liste des podcasts que vous écoutez."
      },
      "url": "/subscriptions?sort=desc"
    }
  ]
}

Note: updated to include a triple example.
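
To make the string-or-triple shape above concrete, here is a minimal Python sketch of how a consumer might read a `*_localized` text member. The helper names (`normalize_entry`, `localized_text`) are hypothetical, and two behaviors are assumptions rather than anything agreed in this thread: an omitted `lang` defaults to the dictionary key, and a missing locale falls back to the unlocalized member.

```python
# Sample shortcut taken from the example above.
shortcut = {
    "name": "Play Later",
    "name_localized": {"fr": "Écouter plus tard"},
    "description": "View the list of podcasts you saved for later",
    "description_localized": {
        "fr": {
            "lang": "en",
            "dir": "ltr",
            "value": "English description because that's part of our brand.",
        }
    },
}


def normalize_entry(locale_key, entry):
    """Turn a plain string or a {value, lang, dir} object into a full triple."""
    if isinstance(entry, str):
        # Assumption: a bare string is written in the language of its key.
        return {"value": entry, "lang": locale_key, "dir": None}
    return {
        "value": entry["value"],
        # Assumption: an omitted "lang" defaults to the dictionary key.
        "lang": entry.get("lang", locale_key),
        "dir": entry.get("dir"),
    }


def localized_text(item, member, locale):
    """Return the triple for `member` in `locale`, else the unlocalized member."""
    localized = item.get(member + "_localized", {})
    if locale in localized:
        return normalize_entry(locale, localized[locale])
    return {"value": item.get(member), "lang": None, "dir": None}
```

With this sketch, `localized_text(shortcut, "description", "fr")` yields the English-tagged triple, while `localized_text(shortcut, "name", "de")` falls back to "Play Later".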

@marcoscaceres
Member

Question is if we should allow localizing various URLs... that gets a bit messy in places, but could be doable.

@christianliebel
Member Author

@marcoscaceres @dmurph Thanks. Let's also localize urls. I'll resume working on this soon.

@mgiuca
Collaborator

mgiuca commented Nov 6, 2023

LTTP... thanks for working on this Christian!

Let's also localize urls. I'll resume working on this soon.

Wait, why are we localizing URLs? That seems undesirable to me. It means that shortcuts can go to different places depending on the language, and we need to update functionality that may be cached (that isn't just string/icon data). This may add non-trivial implementation complexity (it would require a deeper analysis to understand to what degree).

Would the use case for this be that you can have URLs with different ?lang= query parameters to match the user's language? I think I'd prefer just letting the website use the Accept-Language HTTP feature for this.
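
For readers unfamiliar with the Accept-Language approach mentioned here, the following is a rough Python sketch of server-side negotiation. It is a simplified illustration, not a full HTTP implementation; the function names and the rule of matching on the primary subtag are assumptions made for this example.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return language ranges from the header, highest quality first."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        ranges.append((tag.strip().lower(), quality))
    # sorted() is stable, so equal weights keep their header order.
    ranges = sorted(ranges, key=lambda item: item[1], reverse=True)
    return [tag for tag, quality in ranges if quality > 0]


def pick_language(header: str, supported: List[str], default: str) -> str:
    """Pick a supported language: exact match first, then primary subtag."""
    by_lower = {lang.lower(): lang for lang in supported}
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]
        if primary in by_lower:
            return by_lower[primary]
    return default
```

A server could call `pick_language(request_header, ["en", "fr"], "en")` and serve the matching content from a single, unlocalized URL.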

@christianliebel
Member Author

Wait, why are we localizing URLs? That seems undesirable to me.

Maybe we can also loop in @aphillips to see whether it makes sense to localize URLs.

@aphillips
Contributor

Icons, graphics, or remote content (help pages, for example) are sometimes varied by locale or by region (with locale serving as a poor proxy for region). This might be done, for example, because the icon contains some text or because a graphic shows a culturally-linked image (personal images, national costume, post box shapes, etc. etc.) that the user wishes to localize. Or it might be because functionality or defaults differ (sorting based on pronunciation instead of name for Chinese, for example)

We don't know why the user might want to localize the icon or shortcut (or whatever).

I agree that this can be abused and there might be reasons not to allow some fields to be localized, although I'd probably think about health warnings first?

@aarongustafson
Collaborator

I think I'd prefer just letting the website use the Accept-Language HTTP feature for this.

Probably the most elegant, but it’s not within the realm of possibility for a lot of orgs and site types (thinking static sites, for example).

Would the use case for this be that you can have URLs with different ?lang= query parameters to match the user's language?

See also MDN style links where the language code is embedded in the URL path.

Icons, graphics, or remote content (help pages, for example) are sometimes varied by locale or by region (with locale serving as a poor proxy for region). This might be done, for example, because the icon contains some text or because a graphic shows a culturally-linked image (personal images, national costume, post box shapes, etc. etc.) that the user wishes to localize. Or it might be because functionality or defaults differ (sorting based on pronunciation instead of name for Chinese, for example)

We don't know why the user might want to localize the icon or shortcut (or whatever).

Agree on all of these.

@benfrancis
Member

benfrancis commented Nov 14, 2023

Just one data point, but as a past precedent the similar Web of Things (WoT) Thing Description specification landed on titles and descriptions members of Thing (each of type MultiLanguage) in addition to title and description for this use case. This is the case for both Thing Description 1.0 (W3C Recommendation) and Thing Description 1.1 (W3C Proposed Recommendation).

I personally don't like that solution because I prefer using HTTP content negotiation with an Accept-Language header as per the suggestion in the current Web Application Manifest Working Draft, rather than creating an extremely verbose manifest with (theoretically) up to thousands of different languages. However, as has been pointed out it's not always possible to use content negotiation (e.g. on static site hosting like GitHub Pages). The Thing Description specification therefore offers both as alternative approaches.

If consistency between W3C specifications is considered important, then names, short_names and descriptions would make sense. That doesn't work for icons, but that member is already different because it's an array of ImageResources that the user agent can select from. Language could potentially just be another criterion for selecting an image.

I note that in HTML the <link> element has a hreflang attribute, so presumably <link rel="icon" href="/icons/fr/play-later.svg" hreflang="fr"> is valid (though likely currently ignored by user agents). For the (slightly unusual) case of localising app icons, an equivalent might be to add a lang member to ImageResource.

Example:

{
  "lang": "en",
  "dir": "ltr",
  "name": "Super Racer 3000",
  "names": {
    "fr-FR": "Super Coureur 3000",
    "es-ES": "Súper Corredor 3000"
  },
  "short_name": "Racer3K",
  "short_names": {
    "fr-FR": "Coureur3K",
    "es-ES": "Corredor3K"
  },
  "icons": [
    {
      "src": "icon.png",
      "sizes": "64x64",
      "type": "image/png"
    },
    {
      "src": "icon-fr.png",
      "sizes": "64x64",
      "type": "image/png",
      "lang": "fr-FR"
    },
    {
      "src": "icon-es.png",
      "sizes": "64x64",
      "type": "image/png",
      "lang": "es-ES"
    }
  ],
  "scope": "/",
  "id": "superracer",
  "start_url": "/start.html",
  "display": "fullscreen",
  "orientation": "landscape",
  "theme_color": "aliceblue",
  "background_color": "red"
}

Note that ImageResource also has a label member which could then also be localised this way, if a localised accessible description of the icon is needed!
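
As a sketch of what "language as another selection criterion" could mean in practice, the Python below filters the icons from the example by a `lang` member. Note that `lang` on ImageResource is only the proposal made in this comment, and the fallback order (exact tag, then primary subtag, then untagged icons) is an assumption for illustration.

```python
# Icons taken from the example manifest above.
icons = [
    {"src": "icon.png", "sizes": "64x64", "type": "image/png"},
    {"src": "icon-fr.png", "sizes": "64x64", "type": "image/png", "lang": "fr-FR"},
    {"src": "icon-es.png", "sizes": "64x64", "type": "image/png", "lang": "es-ES"},
]


def select_icons(candidates, user_lang):
    """Narrow the icon list by language before size/type selection runs."""
    wanted = user_lang.lower()
    exact = [i for i in candidates if i.get("lang", "").lower() == wanted]
    if exact:
        return exact
    # Fall back to icons sharing the primary language subtag (e.g. "es").
    primary = wanted.split("-")[0]
    related = [
        i for i in candidates
        if i.get("lang", "").lower().split("-")[0] == primary
    ]
    if related:
        return related
    # Otherwise use the icons that carry no language at all.
    return [i for i in candidates if "lang" not in i]
```

The result would then be handed to the user agent's usual size and type selection, so language only narrows the candidate set.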

One question: How does dir interact with the localised members? Can it safely be derived from language? The Thing Description specification has a lot to say on that topic, which I can't say I fully understand, but they use the Strings on the Web: Language and Direction Metadata W3C Note for guidance.

Hope this helps.

@mgiuca
Collaborator

mgiuca commented Nov 16, 2023

Perhaps we should clarify what "localize URLs" means.

Are we talking about:

  • Localizing all fields that are URLs? (e.g. start_url, scope, potential future ones like the home scope of tabbed apps).
  • Localizing all fields called "url", like the ones in icons and shortcuts?
  • Something else?

I think we generally agree that we need to localize icons, but we're doing that at the icons level, not the url within icons (i.e. icons_localized with a local version of each icon dict, not icons with the icon dict including a url_localized. Allowing both of these creates two ways to do it which isn't ideal.)

Maybe it makes sense for URLs like shortcuts to be able to change based on language, but really the point of this initiative (I thought) was to be able to localize your app's metadata that gets displayed at the OS level, like name and icon, not to solve the problem of localizing all the content in the app. (If you can't configure your server to serve content based on headers, you can still make your service worker return localized content based on Accept-Language.) An app that relies purely on the manifest URLs to display content in the user's language is likely to be quite brittle.

I think we could run into major headaches if we allow scope to be localized. And by extension, start_url. (e.g. scope must be a superset of start_url - what happens if that's true in some languages but not others?)
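
The scope concern can be illustrated with a small Python sketch. The members `scope_localized` and `start_url_localized` are purely hypothetical here (this comment argues against adding them), and `within_scope` is a simplified same-origin plus path-prefix check rather than the spec's exact algorithm.

```python
from urllib.parse import urljoin, urlsplit


def within_scope(url, scope):
    """Simplified check: same origin and the URL path starts with the scope path."""
    u, s = urlsplit(url), urlsplit(scope)
    return (u.scheme, u.netloc) == (s.scheme, s.netloc) and u.path.startswith(s.path)


def locales_out_of_scope(manifest_url, manifest):
    """List locales whose effective start URL falls outside their effective scope."""
    scopes = manifest.get("scope_localized", {})      # hypothetical member
    starts = manifest.get("start_url_localized", {})  # hypothetical member
    problems = []
    for locale in sorted(set(scopes) | set(starts)):
        scope = urljoin(manifest_url, scopes.get(locale, manifest["scope"]))
        start = urljoin(manifest_url, starts.get(locale, manifest["start_url"]))
        if not within_scope(start, scope):
            problems.append(locale)
    return problems


example = {
    "scope": "/app/",
    "start_url": "/app/index.html",
    "scope_localized": {"fr": "/fr/app/"},
    "start_url_localized": {"fr": "/fr/app/index.html", "de": "/de/app/index.html"},
}
```

Here "fr" is consistent, but "de" localizes only the start URL, so it ends up outside the default scope. Every such combination would need defined behavior.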

I prefer if we just start with name, short_name, description and icons and go from there as needed.

How does dir interact with the localised members? Can it safely be derived from language?

This was discussed at TPAC (search for "dir") - unfortunately when I asked this question, the answer was not recorded ("?"). From memory, @aphillips pointed out that we may not know the language's direction because languages are not set in stone. I think the overwhelmingly common case will be a known language, which means we should be able to derive dir from lang in all practical cases (and probably default to ltr if we don't recognize the language - the overwhelming majority of languages are LTR). We should have the dir member for being explicit, but I think in 99.9% of cases the site should not need to specify dir as it can be derived from language.

@marcoscaceres
Member

Supportive of what @mgiuca said above... let's start small and go from there (and definitely let's not have multiple ways of doing the same thing, especially with URLs). And yes, let's keep the localizable members restricted to a small set (including image objects, not members within those objects).

@aphillips
Contributor

@mgiuca noted:

From memory, @aphillips pointed out that we may not know the language's direction because languages are not set in stone. I think the overwhelmingly common case will be a known language, which means we should be able to derive dir from lang in all practical cases (and probably default to ltr if we don't recognize the language - the overwhelming majority of languages are LTR). We should have the dir member for being explicit, but I think in 99.9% of cases the site should not need to specify dir as it can be derived from language.

Your specification should not derive direction from language unless there is no other alternative. This is not because languages are mutable.

You may permit an item that lacks separate direction metadata to attempt to use the language to estimate the direction or to act as a hint, but this should not be the default way of doing it. In fact, I18N recommends making the direction auto when the dir is not present at the item or document-default level instead of using directional estimation based on language. We explicitly recommend using auto instead of ltr as the default, since an unlabeled string that starts with a strongly RTL character is probably trying to tell you something 😉.
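
The precedence described above (item-level dir, then document default, then auto) can be sketched in Python using first-strong detection from the standard `unicodedata` module. The function names are hypothetical, and falling back to "ltr" when a string has no strong character at all is an assumption of this sketch.

```python
import unicodedata


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None  # no strong character found (digits, punctuation, empty)


def resolve_direction(item_dir, default_dir, text):
    """Prefer explicit metadata; otherwise treat the string as dir=auto."""
    for candidate in (item_dir, default_dir):
        if candidate in ("ltr", "rtl"):
            return candidate
        if candidate == "auto":
            break  # an explicit "auto" skips any lower-priority default
    # Assumption: fall back to "ltr" when nothing strong is present.
    return first_strong_direction(text) or "ltr"
```

Language is deliberately not consulted here; a language-based estimate would only be a last-resort hint.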

We have extensive guidance in https://www.w3.org/TR/string-meta/ and we're working on an update to our guidance about manifests here (I hope to land this PR on Thursday) which you may find useful here.

@dmurph dmurph mentioned this pull request May 2, 2024
@dmurph
Collaborator

dmurph commented May 2, 2024

Manifest Working session notes:

This seems to be the concluded format (copied from Marcos's comment above) with triple examples:

{
  ...
  "dir": "ltr",
  "lang": "en-x-marcos",
  ...
  "shortcuts": [
    {
      "name": "Play Later",
      "name_localized": {
        "fr": "Écouter plus tard"
      },
      "description": "View the list of podcasts you saved for later",
      "description_localized": {
         "en":  { "value": "My App, hey!", "dir": "ltr", "lang": "fr"},
         "en-GB": { "value": "My App, eh wut?", "dir": "ltr"},
         "fr": "string",
         "ar": { "value": "...", "dir": "rtl" }
      },
      "url": "/play-later",
      "icons": [
        {
          "src": "/icons/play-later.svg",
          "type": "image/svg+xml"
        }
      ],
      "icons_localized": {
        "fr": [
          {
            "src": "/icons/fr/play-later.svg",
            "type": "image/svg+xml"
          }
        ]
      } 
    },
    {
      "name": "Subscriptions",
      "description": "View the list of podcasts you listen to",
      "description_localized": {
        "fr": "Consultez la liste des podcasts que vous écoutez."
      },
      "url": "/subscriptions?sort=desc"
    }
  ],
   ...
}

TPAC notes from this are here
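
The format above does not say how a user's language is matched against the dictionary keys. One plausible option, shown here only as an assumption, is the progressive-truncation Lookup scheme from RFC 4647. The Python sketch below uses a hypothetical helper name, `lookup_locale`.

```python
def lookup_locale(localized, user_lang):
    """Find the best key for user_lang by truncating subtags from the right."""
    keys = {key.lower(): key for key in localized}
    subtags = user_lang.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in keys:
            return keys[candidate]
        subtags.pop()
        # RFC 4647 Lookup also drops a trailing single-character subtag,
        # such as the private-use "x" in "en-x-marcos".
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None  # caller falls back to the unlocalized member


description_localized = {
    "en": {"value": "My App, hey!", "dir": "ltr", "lang": "fr"},
    "en-GB": {"value": "My App, eh wut?", "dir": "ltr"},
    "fr": "string",
    "ar": {"value": "...", "dir": "rtl"},
}
```

With these keys, "en-US" resolves to "en", "en-GB" matches exactly, and "de" finds nothing, so the plain `description` would be used.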

@tomayac
Contributor

tomayac commented May 3, 2024

I suppose the stray "lang": "fr" in your comment before isn't intended:

"en":  { "value": "My App, hey!", "dir": "ltr", "lang": "fr"},

I could edit your comment directly, but wanted to make sure it's indeed a copy/paste first.

@dmurph
Collaborator

dmurph commented Jun 6, 2024

I suppose the stray "lang": "fr" in your comment before isn't intended:

"en":  { "value": "My App, hey!", "dir": "ltr", "lang": "fr"},

I could edit your comment directly, but wanted to make sure it's indeed a copy/paste first.

This actually is intended - the example might not be great, but we need the ability for a string to be displayed in a different language A when showing for language B. For example, if a product name or logo etc was a character in a different language, the dev can make sure it renders in the desired language for the user's chosen display language.

@marcoscaceres
Member

@christianliebel can you check #676 ... and update the description of this issue as closing that bug, as well as any other bugs this will close?

christianliebel added a commit to christianliebel/manifest-incubations that referenced this pull request Jun 8, 2024
@calidion

calidion commented Jun 25, 2024

hello, everyone

I would suggest that localization be a native feature for all strings.

Hence we don't need extra fields for localization.

just to enhance the parser to parse new strings.

My suggestion is that strings can be divided into two types:

  1. the normal primitive strings defined by different charsets
 "Hello world!"
or
 "你好世界!"
  2. the enhanced strings, which include i18n features and are in JSON format, like this:
"lang": "en",   // fallback language if the browser finds no matching locale string
"name":  {
  "en":  "Web App",    // fallback for all en-* browsers or for general en-* users
  "en-US":  "Web App",   // specific string for localized en
  "zh": "网站应用", // fallback for all zh-* browsers or for general zh-* users
  "zh-CN": "网站应用", // specific string for localized zh
  ...
}

index.html Outdated
Comment on lines 1339 to 1341
from the user's set language. For example, this helps ensure that
an application name is correctly pronounced by assistive
technology, even if it is in a foreign language.
Member

Suggested change
from the user's set language. For example, this helps ensure that
an application name is correctly pronounced by assistive
technology, even if it is in a foreign language.
from the user's set language. This helps ensure that
an application name is correctly pronounced by assistive
technology, even if it is in a language foreign to the user.

Member

The intent here is good, but I think we need to rewrite this. Let's discuss in our next call. In particular, it would be great to rewrite this so "correctly pronounced" is at the start of the note. We should also provide an actual example.

Member Author

I added an example and changed the text a bit.

Member Author

@marcoscaceres Here's the new version.

@@ -2298,7 +2556,9 @@ <h3>
</h3>
<p>
The <dfn>application's name</dfn> is derived from either the
[=manifest/name=] member or [=manifest/short_name=] member.
[=manifest/name=] member or [=manifest/short_name=] member. The user
Member

This needs more clarification as to which one wins.

Collaborator

In Chromium we use them depending on UX needs. It's not implemented, but I would like to follow similar rules for extensions - short_name is truncated to 12 characters and name is truncated to 75.

This sets expectations for devs, and allows the user agent to show an app name where there might not be much space.
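
A minimal Python sketch of the truncation rule suggested here, using the 12 and 75 character limits from this comment. The helper name is hypothetical, and counting Unicode code points (rather than grapheme clusters or rendered width) is a simplifying assumption that a real user agent may not share.

```python
SHORT_NAME_LIMIT = 12  # limit suggested above for short_name
NAME_LIMIT = 75        # limit suggested above for name


def clamp_names(manifest):
    """Return name and short_name truncated to the suggested limits."""
    clamped = {}
    for member, limit in (("name", NAME_LIMIT), ("short_name", SHORT_NAME_LIMIT)):
        value = manifest.get(member)
        if value is not None:
            # Slicing counts code points, which can split combined characters.
            clamped[member] = value[:limit]
    return clamped
```

The same limits would presumably apply to each value in `name_localized` and `short_name_localized` as well, though this thread does not say so explicitly.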

Member Author

Could we discuss that in a separate issue? It used to be like this and is not directly related to l10n.

Member

How they are treated by the OS is implementation specific. I don't think we need to say anything here.

christianliebel and others added 2 commits July 11, 2024 07:29
Co-authored-by: Marcos Cáceres <marcos@marcosc.com>
Co-authored-by: Marcos Cáceres <marcos@marcosc.com>
@calidion

I don't think putting all languages into one file is a good idea.
And I have never seen backward compatibility work on the web.
Almost no newly created web pages can be loaded in old browsers.
Backward compatibility is meaningless in most cases.

@calidion

Appending _localized is definitely not a good choice for localizing a web application or a website.
There should be a general localization schema for both web applications and websites that makes it easy to introduce multi-language support
and reduces the current burden carried by most backend web servers and frontend libraries/frameworks.

@christianliebel
Member Author

There should be a general localization schema for both a web application and/or a web site that can easily introduce multi-language support.

@calidion While a general localization schema for web applications and websites may be beneficial, this PR is solely focused on adding localization capabilities specifically to the Web Application Manifest rather than solving broader multi-language support for web applications or websites. I think the WICG Proposals repo would be the right spot to propose and discuss a more general solution.

@calidion

@christianliebel
Thanks for the information.

Still, I hope this feature can be put on hold for a while until broad agreement is reached.

Collaborator

@dmurph dmurph left a comment

Still LGTM, one suggestion to change the example.

</p>
<aside class="example" title="Localizing the application name">
<pre class="json">
{
Collaborator

To make it obvious why one might want a different 'lang' for an entry here, perhaps we should use a company that would have opinions about how its name is pronounced in other languages.

L'Occitane is a good example: on Wikipedia you can see that it sometimes uses the French version, while other times it has a translated name. For cases like English and German, screen readers should read the name with a French pronunciation.

Suggested change
{
{
"lang": "fr",
"dir": "ltr",
"name": "L'Occitane",
"name_localized": {
"en": { "value": "L'Occitane", "lang": "fr" },
"de": { "value": "L'Occitane", "lang": "fr" },
"zh": "歐舒丹",
"en-GB": { "value": "L'Occitane en Provence", "lang": "fr" },
"fr": "L'Occitane",
"ar": { "value": "لوکسیتان", "dir": "rtl" }
}
}

Member Author

If it's okay to use real-world brands/examples, that's fine with me. I also have the "Just Eat" example here, where the brand even differs between de-DE and de-CH: #1101 (comment)

@marcoscaceres WDYT?

Member

Sounds good, but don't use a real company name... they might not approve.

Member

The example below seems fine to me... @dmurph?

Member

@marcoscaceres marcoscaceres left a comment

This looks ok to me.

Add trimming for any text strings.
