Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Context + hash based message id #1360

Closed
timofei-iatsenko opened this issue Jan 23, 2023 · 6 comments · Fixed by #1440 or lingui/swc-plugin#19
Closed

RFC: Context + hash based message id #1360

timofei-iatsenko opened this issue Jan 23, 2023 · 6 comments · Fixed by #1440 or lingui/swc-plugin#19
Assignees
Milestone

Comments

@timofei-iatsenko
Copy link
Collaborator

Current lingui implementation has a flaw: you have your original message in the bundle at least 2 times + 1 translation.
For a line "Hello world" you would have one in the source code as id in i18n call:

const message = i18n._('Hello world!');

Another one as key in the message catalog and then a translation itself:

export const messages = {
 'Hello world': 'Cześć swiat!'
}

Strings could be very long, not just a couple words, so this may bring more kb to the bundle.

If you use context for the string it would be even more kilobytes because for each msg which should have a context you would have context explicitly written in the bundle + in the msg catalog.

const message = t({
   context: 'My context',
   message: 'Hello world!'
})

export const messages = {
  'My context':  {
     'Hello world': 'Cześć swiat!'
  }
}

A much better option would be generating a "stable" ID based on the msg + context as hash with fixed length.

So internal structure would look like

{
 [<hash>]: 'Cześć swiat!'
}

And lookups in runtime would be much simpler.

Hash would be calculated at build time by macros. So macros instead of

const message = t({
   context: 'My context',
   message: `Hello`
})

// ↓ ↓ ↓ ↓ ↓ ↓

import { i18n } from "@lingui/core"
const message = i18n._(/*i18n*/{
   context: 'My context',
   id: `Hello`
})

Would generate:

import { i18n } from "@lingui/core"
const message = i18n._(/*i18n*/{
   id: "<hash(message + context)>",
   message: `Hello`, // only for development purpose, would be dropped in production
})

The context feature would affect ONLY messageid generation, and thus would only be supported when using macro. For the non-macro users using context has no sense because they provide a messageid manually.

Default message and context would be stripped from runtime code and replaced to the ID generated in compile time.

@timofei-iatsenko timofei-iatsenko mentioned this issue Jan 23, 2023
8 tasks
@andrii-bodnar
Copy link
Contributor

@thekip if I understand correctly, these hashes will appear in the localization files like PO or JSON as keys, right?

@timofei-iatsenko
Copy link
Collaborator Author

In po files everything would be the same, because po files support context natively. For json formats it is correct.

That's exactly as it works in other most popular libraries, such as https://formatjs.io/docs/getting-started/message-extraction

@timofei-iatsenko
Copy link
Collaborator Author

Interesting fact, Tomáš Ehrlich, original author of the library, mentioned this issue and solution proposed in this RFC in his talk at React Conf 2018, but looks like this was never implemented.

@tricoder42
Copy link
Contributor

You're right, I never get to implementing this idea. It seems possible:

  1. extract messages from source code and translate
  2. compile message catalog - this will "minify" message Ids
  3. build code for production — this time, macros use minified ids

At the time I was even thinking about using js variables instead of object keys. That way the minification is handled by bundler plus it might even remove unused messages with dead code elimination, but I guess that too far stretched.

@timofei-iatsenko
Copy link
Collaborator Author

I think generating a hash based i with fixed (short) length would be more than enough. All libraries which involve compilation into process uses the same approach (formatjs, angular's localize)

@JSteunou
Copy link
Contributor

JSteunou commented Feb 6, 2023

Yes double yes, I'm so excited by this feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment