Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: AOT compilation with icu-to-json (experiment) #705

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

amannn
Copy link
Owner

@amannn amannn commented Dec 6, 2023

Fixes #962

(WIP)

This is an experiment to use icu-to-json for AOT compilation of messages on the server side / at build time and to reduce the bundle size in Client Components.

Current progress: For now, both compilation as well as evaluation happens in the same module (and therefore on the client side). This is mostly intended to evaluate initial compatibility. Later, we could move the compilation step to the server.

Questions:

  • How to support t.raw or t.markup? We need to decide if a message needs to be parsed before it's used. Do we need to analyze the usages? Or could we revert the parsing when t.raw is called?

TODO

Copy link

vercel bot commented Dec 6, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
next-intl-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 11, 2023 9:04pm
next-intl-example-app-router ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 11, 2023 9:04pm

@@ -1,9 +1,7 @@
import {render, screen} from '@testing-library/react';
import {parseISO} from 'date-fns';
// eslint-disable-next-line import/no-named-as-default -- False positive
Copy link
Owner Author

@amannn amannn Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Currently skipped tests:

  • Error handling (will be moved to a different place)
  • Nested rich text
image

@@ -412,7 +412,8 @@ describe('t.rich', () => {
);
});

it('handles nested rich text', () => {
// TODO: icu-to-json doesn't seem to handle this currently
it.skip('handles nested rich text', () => {
const {container} = renderRichTextMessage(
'This is <bold><italic>very</italic> important</bold>',
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be addressed on the icu-to-json side I think.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this "should" work 😄

you might have to define a tag and/or a baseTag formatter

| string
| {
type: number; // TODO: Unused, is this necessary?
tokens: Array<unknown>; // TODO: Unused, is this necessary?
Copy link
Owner Author

@amannn amannn Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these two properties be removed from the parsed result? Or should they be used by the consumer?

Copy link

@jantimon jantimon Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean - Formatters are called here with (value, locale, formatOptions)

https://github.com/jantimon/icu-to-json/blob/83f24322bfd4ab66402e70d9727a650720501925/src/runtime.ts#L150-L152

the format options are passed through from MessageFormat

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean e.g. these cases:

// [["price",4,"numberFmt",{"type":0,"tokens":[{"stem":"currency","options":["EUR"]}],"parsedOptions":{"style":"currency","currency":"EUR"}}]]
'{price, number, ::currency/EUR}'

// [["value",4,"numberFmt",{"type":0,"tokens":[{"stem":".#","options":[]}],"parsedOptions":{"maximumFractionDigits":1}}]]
'{value, number, ::.#}'

Is tokens necessary here? In regard to minimizing the size of the AST, could this be simplified to something like this?

[["value",4,"numberFmt",{"maximumFractionDigits":1}]]

Similarly, date skeletons seem to be a bit verbose:

// [["now",4,"date",{"type":1,"pattern":"yyyyMdHm","parsedOptions":{"year":"numeric","month":"numeric","day":"numeric","hourCycle":"h23","hour":"numeric","minute":"numeric"}}]]
'{now, date, ::yyyyMdHm}'

Could the parsedOptions be unwrapped and replace the object they're placed within?

type MessageFormat = Omit<
ReturnType<typeof compile>,
// TODO: Do we need the args?
'args'
Copy link
Owner Author

@amannn amannn Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be removed from the parsed result? See also https://github.com/amannn/next-intl/pull/705/files#r1418720864

Copy link

@jantimon jantimon Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use compileToJson instead it's the same as compile just without args

args are helpful to generate types or to optimize which formatters should be included to a bundle.
e.g. if date is used you might want to add a date formatter

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks!

@amannn amannn changed the title experiment: Try out icu-to-json feat: Try out icu-to-json (experiment) Dec 7, 2023
const evaluated = evaluateAst(
messageFormat.json,
locale,
allValues,
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I pass the values from the user or args from the parsed result here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the args are not needed for formatters however the args can tell you during compile time which formatters need to be available to run the code.

in your case there is a very special formatter wich you should add for full react support

it allows processing the children before they are passed to the given tag function:

{
   tag: (children: Array<string | ReactNode>, locale: string) => 
         children.map((value, i) => typeof value === "string"
            ? value 
            : <Fragment key={`f-${i}`}>{value}</Fragment>
}

e.g. for a translation like <foo>Hello <b>{name}</b></foo>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also the baseTag formatter which is a fallback which kicks in if the end-user did not define a tag.

by default the tag is converted to a string but you might also wrap in a fragment or even allow some tags to be used as html:

// for a compiled message "<b>Hello {name}</b>" 
const formatters = {
  baseTag: (Tag, chilren) => {
      // allowlist:
      if (["b", "strong", "p"].include(Tag)) { 
        return <Tag>{children}</Tag>
      }
      return <>{children}</>;
  }
};
run(message, lang, { name: "Joe" }, formatters)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried updating your example here to use nested rich text: https://codesandbox.io/p/sandbox/icu-to-json-demo-forked-2jzngr?file=%2Fsrc%2Ficu.tsx%3A19%2C1-20%2C1

Somehow the inner part is missing. Am I doing something wrong?

// to be able to format to parts.
prepareTranslationValues({...defaultTranslationValues, ...values})
const allValues = {...defaultTranslationValues, ...values};
// TODO: The return type seems to be a bit off, not sure if
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two different ways to run the precompiled icu messages:

  1. run
  2. evaluateAst

the idea is that run will always return a string and evaluateAst will always return an array.

const t = (key, args) => run(messages[key], lang, args);

icu-to-json is framework independent so it would require you to wrap a react Fragment around the result:

t.rich = (key, args) => <>{evaluateAst(messages[key], lang, args)}</>;

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that helps with the fragment—thanks!

I was wondering about this case:

Screenshot 2023-12-11 at 21 24 31

It seems like the args determine the return value. However, the function b has already been called at this point (having returned a React.JSX.Element) and date should be turned into a string.

Therefore the returned value should have the type (string | React.JSX.Element)[], right?

Or am I missing something?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ahead-of-time compilation of messages
2 participants