Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement <fbt:list>. #18

Merged
merged 1 commit into from
Dec 23, 2024
Merged

Implement <fbt:list>. #18

merged 1 commit into from
Dec 23, 2024

Conversation

cpojer
Copy link
Collaborator

@cpojer cpojer commented Dec 23, 2024

Summary

intlList is an incredibly useful function to concatenate a list of items using locale specific conjunctions and delimiters, for example to create a string like "Available Locations: Tokyo, London and San Francisco" that can be translated to any language. It unfortunately never worked in open source through fbt. In Athena Crisis I used a typed "fork", see intlList.

Why did it not work previously?

The intlList file was compiled using the fbt babel plugin but was not actually included in the compiled dist/index.js file of the package. It could be included from source (without any types), however since the module was compiled during build time and shipped like that to consumers of the package, there was no way to extract the translatable scripts in downstream projects that were using fbt. intlList could therefore be used, but not translated. There are various issues related to intlList in the now archived repo.

This was likely not prioritized/noticed by the past Facebook team that used to maintain fbt because Facebook uses a monorepo for all their code, including fbt, and the collector runs on the source version of fbt, including intlList. The reason why intlList didn't work in fbtee before this Pull Request was because of the above reasons including that the new build pipeline didn't apply the fbtee babel plugin on the fbtee runtime code.

What does this PR do?

  • Fixes the above by compiling list.tsx using the fbtee babel plugin.
  • Automatically adds the default translation strings from the fbtee runtime package into downstream projects so they can be translated.
  • Exports a list function and a List React component, and introduces a new fbt:list construct.
  • Add the correct types.

Also in this PR (sorry)

  • Various improvements to error messages.
  • Small code cleanups.

Usage

Standalone (outside of <fbt>) with React:

<List
  conjunction="or"
  items={[
    <fbt desc="Item in a list." key="photo">
      photo
    </fbt>,
    <fbt desc="Item in a list." key="photo">
      link
    </fbt>,
    <fbt desc="Item in a list." key="video">
      video
    </fbt>,
  ]}
/>

Usage with <fbt>:

<fbt desc="Lists">
  Available Locations:{' '}
  <fbt:list items={['Tokyo', 'London', 'Vienna']} name="locations" />.
</fbt>
/>

Why add fbt:list?

It would have been enough to just fix the above mentioned issues without adding fbt:list, but intlList is actually cumbersome to use with fbt because each call has to be wrapped using <fbt:param name="…">, for example:

<fbt desc="List of units">
  <fbt:param name="units">
    {intlList(
      units.map((unit, index) => (
        <RawUnitName color={color} key={index} unit={unit} />
      )),
      Conjunctions.AND,
      Delimiters.COMMA,
    )}
  </fbt:param>{' '}
  units
</fbt>

See Athena Crisis' SkillDescription.tsx.

With the new <fbt:list> construct we can simplify the above example like this:

<fbt desc="List of units">
  <fbt:list
    items={units.map((unit, index) => (
      <RawUnitName color={color} key={index} unit={unit} />
    ))}
    name="units"
  />{' '}
  units
</fbt>

The new version is 6 lines instead of 9 lines of code, does not require passing the most common combination of conjunctions and delimiters, and is overall just a lot less heavy.

(cc @yungsters – it might be worth porting this back to Facebook's fbt)

Build Changes

The babel-plugin-fbtee collection script received a new --include-default-strings option which is by default true and copies the translatable strings from list.tsx into the user's project.

This PR changes how the fbtee package is compiled:

  1. It applies the fbtee transform on the source and copies it from src to lib-tmp with TypeScript in-tact.
  2. A few minor fixes are applied on lib-tmp to make tsup work correctly.
  3. Run tsup just like before.
  4. The collection script is run to extract the "default-strings" from the fbtee package, which will be shipped to npm at /Strings.json so babel-plugin-fbtee can copy them into the user project. The collection script is invoked with --include-default-strings=false because it is literally for extracting those default-strings.

@cpojer cpojer mentioned this pull request Dec 23, 2024
15 tasks
import { test } from '@jest/globals';
import { CollectFbtOutput } from '../../../../packages/babel-plugin-fbtee/src/bin/collect.tsx';

test('fbtee strings are included in the collected strings of the example project', () => {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test verifies that the strings from fbtee are correctly included in the example project via the babel-plugin-fbtee collection script.

Comment on lines +130 to +132
if (optionName === 'key') {
return optionName;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows usage of key on fbt, like if you are using fbt in an array.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we start tracking changes / new features in a CHANGELOG file to facilitate assembling release notes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, yeah.

@@ -1,4 +1,4 @@
import { describe, expect, it, jest } from '@jest/globals';
import { describe, expect, it } from '@jest/globals';
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just cleaned up this file.

Comment on lines +124 to +134
export function List({
conjunction,
delimiter,
items,
}: {
conjunction?: Conjunction;
delimiter?: Delimiter;
items: Array<string | React.ReactElement | null | undefined>;
}) {
return list(items, conjunction, delimiter);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably this module could be changed to only export List and remove list. However, I want to keep a distinction between mostly strings (list) and React component (List) usage so that the migration to real React components will be easier in the future.

Comment on lines +5 to +8
// Ensure the local version of `fbs` is used instead of auto-importing `fbtee`.
// eslint-disable-next-line no-unused-expressions, @typescript-eslint/no-unused-expressions
fbs;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was necessary because this module uses <fbs>, which was also not compiled previously. However, it is currently not actually part of the distributed files anyway.

(We may remove this module depending on what we decide to do with Number formatting)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you shed more light on the problem you're trying to avoid here? Not very familiar with the new packaging system yet.

From my understanding, importing fbs.tsx will end up bringing fbt.tsx and pretty much the full fbtee module anyway since the babel-plugin-fbtee-auto-import transform would auto-import fbtee by default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have an auto import plugin which will import fbtee, which always resolves to lib/index.js (the build output). That means we will end up with two copies of the library in tests since it will load the source and then auto-import the build, therefore breaking everything. This ensures that the local version is imported, and the import is not stripped since the binding is "used".

Comment on lines 1 to 12
#! /usr/bin/env node --experimental-strip-types --no-warnings
import { readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

const fileName = join(import.meta.dirname, '../Strings.json');
const strings = JSON.parse(readFileSync(fileName, 'utf8'));

for (const key in strings.phrases) {
strings.phrases[key].filepath = 'node_modules/fbtee/Strings.json';
}

writeFileSync(fileName, JSON.stringify(strings, null, 2), 'utf8');
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strings are exported with their source filepath which doesn't exist in a downstream project. This replaces the file paths with the actual location node_modules/fbtee/lib/index.js.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add this comment to the source code plz :-)

"build": "pnpm build:babel && pnpm build:prepend && pnpm build:fbtee-strings && tsup lib-tmp/index.tsx -d lib --target=node22 --format=esm --clean --no-splitting --dts",
"build:babel": "babel --delete-dir-on-start --copy-files --config-file ./babel-build.config.js --out-dir=lib-tmp --extensions=.tsx --keep-file-extension --ignore='src/**/__tests__/*.tsx' src",
"build:fbtee-strings": "pnpm fbtee manifest && pnpm fbtee collect --pretty --include-default-strings=false --manifest < .src_manifest.json > Strings.json $(find src -type f \\! -path '*/__tests__/*' \\! -path '*/__mocks__/*') && ./scripts/rewrite-filepaths.ts",
"build:prepend": "node -e \"const file = './lib-tmp/list.tsx'; fs.writeFileSync(file, '// @ts-nocheck\\n' + fs.readFileSync(file, 'utf8'));\""
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After we compile list.tsx using babel-plugin-fbtee, the types are not entirely correct because of how we handle React components. Since we know the source types are correct, we can just disable TypeScript in that one file so that the dts generator keeps working.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what types aren't correct without this workaround?

For future reference, this kind of comment should probably be embedded into the source code itself - even in JSON files.

It'd be great if you could add it like follow:

"//build:prepend": "comments...",
"build:prepend": "...",

"build": "tsup src/index.tsx -d lib --target=node22 --format=esm --clean --no-splitting --dts"
"build": "pnpm build:babel && pnpm build:prepend && pnpm build:fbtee-strings && tsup lib-tmp/index.tsx -d lib --target=node22 --format=esm --clean --no-splitting --dts",
"build:babel": "babel --delete-dir-on-start --copy-files --config-file ./babel-build.config.js --out-dir=lib-tmp --extensions=.tsx --keep-file-extension --ignore='src/**/__tests__/*.tsx' src",
"build:fbtee-strings": "pnpm fbtee manifest && pnpm fbtee collect --pretty --include-default-strings=false --manifest < .src_manifest.json > Strings.json $(find src -type f \\! -path '*/__tests__/*' \\! -path '*/__mocks__/*') && ./scripts/rewrite-filepaths.ts",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what generates the default strings that will be included in downstream projects.

[childIndex: number]: number;
};

type ChildToParentMap = Map<number, number>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was one of the cleanups that stops converting numbers to strings and back for the collection code.

Comment on lines -83 to -97
const args = {
COMMON_STRINGS: 'fbt-common-path',
CUSTOM_COLLECTOR: 'custom-collector',
GEN_FBT_NODES: 'gen-fbt-nodes',
GEN_OUTER_TOKEN_NAME: 'gen-outer-token-name',
HASH: 'hash-module',
HELP: 'h',
MANIFEST: 'manifest',
OPTIONS: 'options',
PACKAGER: 'packager',
PLUGINS: 'plugins',
PRESETS: 'presets',
PRETTY: 'pretty',
TRANSFORM: 'transform',
} as const;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stuff is not needed since yargs is now type-safe with TypeScript.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this on my local and it doesn't seem that TS detects potential typos; which is what the previous implementation was aiming to avoid.

E.g.

diff --git a/packages/babel-plugin-fbtee/src/bin/collect.tsx b/packages/babel-plugin-fbtee/src/bin/collect.tsx
index 735d5ab1..bac57a0a 100644
--- a/packages/babel-plugin-fbtee/src/bin/collect.tsx
+++ b/packages/babel-plugin-fbtee/src/bin/collect.tsx
@@ -112,8 +112,8 @@ const argv = y
       'This is a map from {[text]: [description]}.',
   )
   .boolean('pretty')
-  .default('pretty', false)
-  .describe('pretty', 'Pretty-print the JSON output')
+  .default('pretty_', false)
+  .describe('___pretty', 'Pretty-print the JSON output')
   .boolean('gen-outer-token-name')
   .default('gen-outer-token-name', false)
   .describe(

Running checks again:

pnpm run clean; pnpm run build:all; pnpm run tsc:check

Outputs:

[...]
> @nkzw/fbtee-internal@0.1.4 tsc:check /workspaces/fbtee
> tsc
// No error?

Am I missing something?


PS: in future PRs, it'd be great to focus only on code changes related to the PR goal for easier reviewing.
Code refactors like this would then be discussed in separate PRs. :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yargs allows you to provide a definition and then it returns a typed object, ie. args.pretty will be typed as "boolean".

You are right, for the description etc. though. Might make sense to snapshot the --help command.

@@ -43,6 +51,7 @@ function testTranslateNewPhrases(options: Options) {
options,
);
expect(result).toMatchSnapshot();
expect(console.error).toHaveBeenCalled();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the translation is missing, this was logging to the terminal. It's now caught by Jest and verified that it is actually reported as missing.

@@ -35,8 +35,8 @@ import {
import type { BindingName, FbtOptionConfig } from '../FbtConstants.tsx';
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly cleanup in this file.

@cpojer cpojer force-pushed the fbt-list branch 4 times, most recently from 6c0e39e to 88df75f Compare December 23, 2024 05:06
Copy link
Collaborator

@alexandernanberg alexandernanberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

My only other input is if we should use the native Intl.ListFormat API. Global support is at ~95%.

export type ChildParentMappings = {
[childPhraseIndex: number]: ParentPhraseIndex;
};
export type ChildParentMappings = Map<number, ParentPhraseIndex>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maps serialize to an empty object, e.g. compare the .source_strings.json in the example app using this branch vs main

Main

 "childParentMappings": {
  "14": 13,
  "23": 22,
  "24": 23
 },

This branch

 "childParentMappings": {},

I assume this is unintended, but not really sure what childParentMappings is used for in the source strings payload

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️ You are right – and there are no tests for this behavior. I'll fix it and add a test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason to switch to a Map here?

I see what's happening. The internal data structure was a number-indexed JS object, and it gets serialized in JSON as a string-indexed object, hence a bit of TS gymnastic when you wanted to merge the collect.tsx output later in this PR.

@alexandernanberg

not really sure what childParentMappings is used for in the source strings payload

It's documented there but feel free to let me know if you need more info:

* Mapping of child phrase index to their parent phrase index.
* This allows us to determine which phrases (from the `phrases` field) are "top-level" strings,
* and which other phrases are its children.
* Since JSX elements can be nested, child phrases can also contain other children too.
*
* Given an fbt callsite like:
*
* <fbt desc="...">
* Welcome <b>to the <i>jungle</i></b>
* </fbt>
*
* The phrases will be:
*
* Index 0: phrase for "Welcome {=to the jungle}"
* Index 1: phrase for "to the {=jungle}"
* Index 2: phrase for "jungle"
*
* Consequently, `childParentMappings` maps from childIndex to parentIndex:
*
* ```
* "childParentMappings": {
* 1: 0,
* 2: 1,
* }
* ```
*
* The phrase at index 0 is absent from `childParentMappings`'s keys, so it's a top-level string.
* The phrase at index 1 has a parent at index 0.
* The phrase at index 2 has a parent at index 1; so it's a grand-child.

See also:

fbtee/docs/autoparam.md

Lines 87 to 92 in c8bd6b0

Furthermore, we provide a mapping `{<childIndex>: <parentIndex>}` in
the collection output `childParentMappings`. At Meta, we use
these to display all relevant inner and outer strings when translating
any given piece of text. We recommend you do the same in whatever
translation framework you use. Context is crucial for accurate
translations.

In a nutshell, it's the way to let users determine the hierarchy of fbt strings so that they can figure out what's the outer or inner strings from their fbt callsites.

For example, for the string:

<fbt desc="auto-wrap example">
  Go on an
  <a href="#">
    <span>awesome</span> vacation
  </a>
</fbt>

We'll have the phrases:

  • 'Go on an {=awesome vacation}'
  • '{=awesome} vacation'
  • 'awesome'

Assuming that users want to show these strings together to provide better UI context (in a translation GUI for example), they'll need the childParentMappings to determine how strings are related to each-other.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the thorough explanation!

@cpojer
Copy link
Collaborator Author

cpojer commented Dec 23, 2024

My only other input is if we should use the native Intl.ListFormat API.

Yeah, possibly. I haven't checked, and we can swap out the implementation if we want to, but otherwise everything else in this PR is still relevant.

@cpojer
Copy link
Collaborator Author

cpojer commented Dec 23, 2024

Changed childParentMappings so that it internally uses a Map, but then serializes into an object when it is retrieved.

@cpojer cpojer merged commit 48fa4ae into main Dec 23, 2024
1 check passed
@kayhadrin
Copy link
Collaborator

@alexandernanberg

My only other input is if we should use the native Intl.ListFormat API. Global support is at ~95%.

So here's my take about using browser-based i18n APIs (mainly shared at Meta during my tenure).

Yes, Intl APIs exist, it's great in theory but we were quickly disillusioned once we looked into the actual language support in the wild. The Intl API quality measured over several locales, over different browser engines & versions is very disparate. For example, you could easily see that Chrome supported Intl.foo in locale A, but FF or Safari didn't, etc... And even if a locale is supported by most browsers, they still rely on external i18n libraries that themselves are not always up to standards or up to date...

So we eventually chose to use the Unicode Common Locale Data Repository ("CLDR") as source of truth + our own layer of customizations so that each time we found small mistakes/inaccuracies, we could hotfix our interpretation of the CLDR.
CLDR gives us information on how to apply all sorts of variations: number formatting, conjugation, pluralization, gender, currency, time, date, etc...
You can see an example of our CLDR config files for number conjugation rules: IntlCLDRNumberType01.tsx

This allows us to provide a consistent i18n experience for all fbt-ee users regardless of what browser they have.
It'd be great if we could continue doing so but CLDR is a massive source of info and integrating it on our own would be a pretty big task. :-o

To my knowledge, there are some implementations of it in other languages like C++ so maybe there are ways to piggyback another OSS project for this.

Or else, yeah, defer to the browser Intl API as we we go... and hope for the best...

Copy link
Collaborator

@kayhadrin kayhadrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpojer thanks for making this extensive PR! 🎄
Sorry it took me a good while to review it as there were so many changes!

I think this looks good overall but in summary I'd say this:

  1. Let's try to encourage users to use non-string, non-number text. By using TS, we can push them to use fbtee types instead of plain string or number. Especially in parts we control better like the value of <fbt:param> or items from <fbt:list>.
    Also, I suggest using examples of <fbt:list> that show best practices, hence using fbt strings as item arguments. E.g.
const items = [
  <span key="item 1">
    <fbt desc="item_1">Tokyo</fbt>
  </span>,
  <b key="item 2">
    <fbt desc="item_2">London</fbt>
  </b>,
  <i key="item 3">
    <fbt desc="item_3">Vienna</fbt>
  </i>,
];
expect(
  <fbt desc="Lists">
    Available Locations:{' '}
    <fbt:list
      items={items}
      name="locations"
    />.
  </fbt>,
).toMatchSnapshot();
  1. The parent-child mapping merging algorithm in collect.tsx seems off to me and may need more testing but I appreciate your efforts to embed default fbtee strings in the general string collection process.

packages/fbtee/src/list.tsx Show resolved Hide resolved
packages/fbtee/src/list.tsx Show resolved Hide resolved
export type Delimiter = 'bullet' | 'comma' | 'semicolon';

export default function list(
items: ReadonlyArray<string | ReactElement | null | undefined>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my POV, I think we should enforce that list arguments are typed as FbtWithoutString or ReactElement only.
Otherwise, users may still show untranslated string text with this function.

Could we type this as follow?

  items: ReadonlyArray<FbtWithoutString | ReactElement | null | undefined>,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes… the only problem is that TypeScript doesn't have opaque types 🫠

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back then the Fbt type had to include the string type because of all the legacy code, but as a standalone library, fbtee could afford to remove it.
Which would effectively mean we can finally consider Fbt a properly translated string after all.
Would that be a good alternative?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove string as the result of fbt(), how would you turn it into a string then that TypeScript would be happy about?

In general, I appreciate any cleanups and tightening of the types/API. Ideally in the end we'll have something like:

  • fbt() terminates into a string that can be inserted anywhere strings are allowed (including React).
  • <fbt> is a real React component and cannot be used as a string, like for example via alert(<fbt>).

btw. I did add a kind of "opaque" TranslatedString type that works for me in Athena Crisis: https://github.com/nkzw-tech/fbtee/blob/main/packages/fbtee/src/Types.d.ts#L191

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the result from fbt() or is not meant to be a true string replacement, because String prototype methods aren't i18n-safe. E.g. .substring(), .toUpperCase(), etc... don't handle unicode characters properly from an i18n POV. (Multi byte characters can get trimmed incorrectly, capitalization only works in the basic alphabet). In general, everytime some code takes an fbt result and tries to edit or concatenate it to another string is considered code smell.

Because Fbt was invented in the context of Facebook's massive legacy code base, we attached some stringish methods to keep things going... But it doesn't have to always be like that.
We could finally make Fbt a more traditional JS class instance with only the .toString() method - although that method should be mainly used by an UI library like React instead of userland code.

Internally, "Fbt result" (IIRC FbtResultBase) was the "terminating" type and was added
to the ReactNode type definition.


fbt() terminates into a string that can be inserted anywhere strings are allowed (including React).

I agree with your overall vision. To talk in terms of types, I hope we could make the Fbt type become a pure string (which is currently known as Fbs), and ReactFbt would be the JSX-version of it that may contain nested react elements.

Comment on lines +5 to +8
// Ensure the local version of `fbs` is used instead of auto-importing `fbtee`.
// eslint-disable-next-line no-unused-expressions, @typescript-eslint/no-unused-expressions
fbs;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you shed more light on the problem you're trying to avoid here? Not very familiar with the new packaging system yet.

From my understanding, importing fbs.tsx will end up bringing fbt.tsx and pretty much the full fbtee module anyway since the babel-plugin-fbtee-auto-import transform would auto-import fbtee by default.

Comment on lines -83 to -97
const args = {
COMMON_STRINGS: 'fbt-common-path',
CUSTOM_COLLECTOR: 'custom-collector',
GEN_FBT_NODES: 'gen-fbt-nodes',
GEN_OUTER_TOKEN_NAME: 'gen-outer-token-name',
HASH: 'hash-module',
HELP: 'h',
MANIFEST: 'manifest',
OPTIONS: 'options',
PACKAGER: 'packager',
PLUGINS: 'plugins',
PRESETS: 'presets',
PRETTY: 'pretty',
TRANSFORM: 'transform',
} as const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this on my local and it doesn't seem that TS detects potential typos; which is what the previous implementation was aiming to avoid.

E.g.

diff --git a/packages/babel-plugin-fbtee/src/bin/collect.tsx b/packages/babel-plugin-fbtee/src/bin/collect.tsx
index 735d5ab1..bac57a0a 100644
--- a/packages/babel-plugin-fbtee/src/bin/collect.tsx
+++ b/packages/babel-plugin-fbtee/src/bin/collect.tsx
@@ -112,8 +112,8 @@ const argv = y
       'This is a map from {[text]: [description]}.',
   )
   .boolean('pretty')
-  .default('pretty', false)
-  .describe('pretty', 'Pretty-print the JSON output')
+  .default('pretty_', false)
+  .describe('___pretty', 'Pretty-print the JSON output')
   .boolean('gen-outer-token-name')
   .default('gen-outer-token-name', false)
   .describe(

Running checks again:

pnpm run clean; pnpm run build:all; pnpm run tsc:check

Outputs:

[...]
> @nkzw/fbtee-internal@0.1.4 tsc:check /workspaces/fbtee
> tsc
// No error?

Am I missing something?


PS: in future PRs, it'd be great to focus only on code changes related to the PR goal for easier reviewing.
Code refactors like this would then be discussed in separate PRs. :-)

Comment on lines +130 to +132
if (optionName === 'key') {
return optionName;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we start tracking changes / new features in a CHANGELOG file to facilitate assembling release notes?

Comment on lines +660 to +661
child.type === 'list' ||
(child.type === 'param' &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for thought: using instanceof was done voluntarily in order to have hard dependencies to the fbt construct's class (e.g. FbtParamNode or FbtListNode).
With this new code, the code dependency is loosened a little bit. We can still track things using the "list" string enum but it's less convenient and less visible from an IDE perspective IMHO.

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's a good point. I generally try to avoid instanceof as much as possible in JavaScript because you might end up with a copy of the same class and spend hours debugging it. Of course, that shouldn't happen these days or with this project, but it's something that can trip you up and TypeScript can refine the types this way too.

export type ChildParentMappings = {
[childPhraseIndex: number]: ParentPhraseIndex;
};
export type ChildParentMappings = Map<number, ParentPhraseIndex>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason to switch to a Map here?

I see what's happening. The internal data structure was a number-indexed JS object, and it gets serialized in JSON as a string-indexed object, hence a bit of TS gymnastic when you wanted to merge the collect.tsx output later in this PR.

@alexandernanberg

not really sure what childParentMappings is used for in the source strings payload

It's documented there but feel free to let me know if you need more info:

* Mapping of child phrase index to their parent phrase index.
* This allows us to determine which phrases (from the `phrases` field) are "top-level" strings,
* and which other phrases are its children.
* Since JSX elements can be nested, child phrases can also contain other children too.
*
* Given an fbt callsite like:
*
* <fbt desc="...">
* Welcome <b>to the <i>jungle</i></b>
* </fbt>
*
* The phrases will be:
*
* Index 0: phrase for "Welcome {=to the jungle}"
* Index 1: phrase for "to the {=jungle}"
* Index 2: phrase for "jungle"
*
* Consequently, `childParentMappings` maps from childIndex to parentIndex:
*
* ```
* "childParentMappings": {
* 1: 0,
* 2: 1,
* }
* ```
*
* The phrase at index 0 is absent from `childParentMappings`'s keys, so it's a top-level string.
* The phrase at index 1 has a parent at index 0.
* The phrase at index 2 has a parent at index 1; so it's a grand-child.

See also:

fbtee/docs/autoparam.md

Lines 87 to 92 in c8bd6b0

Furthermore, we provide a mapping `{<childIndex>: <parentIndex>}` in
the collection output `childParentMappings`. At Meta, we use
these to display all relevant inner and outer strings when translating
any given piece of text. We recommend you do the same in whatever
translation framework you use. Context is crucial for accurate
translations.

In a nutshell, it's the way to let users determine the hierarchy of fbt strings so that they can figure out what's the outer or inner strings from their fbt callsites.

For example, for the string:

<fbt desc="auto-wrap example">
  Go on an
  <a href="#">
    <span>awesome</span> vacation
  </a>
</fbt>

We'll have the phrases:

  • 'Go on an {=awesome vacation}'
  • '{=awesome} vacation'
  • 'awesome'

Assuming that users want to show these strings together to provide better UI context (in a translation GUI for example), they'll need the childParentMappings to determine how strings are related to each-other.

Comment on lines +225 to +228
output.childParentMappings = {
...output.childParentMappings,
...json.childParentMappings,
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not sure this code will safely merge the child-parent mapping of the fbtee library with the user's source files, because the mapping is based on the collected string indexes.
Presumably, the fbtee/Strings.json strings were extracted based on the example's react files and they'll start with index 0, which would collide with the collected strings from the user's source code.

If we want to include the default fbt source string, we should provide a list of fbt strings from the fbtee library (and that's probably packages/fbtee/**.tsx (excluding tests), and concatenate it to the list of source files provided by the user (similar situation if the user provides source code via STDIN).

Relevant source code:

if (!argv._.length) {
// No files given, read stdin as the sole input.
const stream = process.stdin;
let source = '';
stream.setEncoding('utf8');
stream.on('data', (chunk) => {
source += chunk;
});
stream.on('end', async () => {
await processSource(collector, source);
await writeOutput(collector);
});
} else {
const sources: Array<[string, string]> = [];
for (const file of argv._) {
sources.push([String(file), readFileSync(file, 'utf8')]);
}
collector.collectFromFiles(sources);
await writeOutput(collector);
}

Alternatively, we could evaluate the prospect of concatenating disparate "bundles" of collected strings from this collect.tsx script. It would involve shifting indexes in the subparts of the CollectFbtOutput type to ensure that the resulting data is coherent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good news then that the fbtee package doesn't have any mappings 😅 The mapping might be worth improving on in some form… 🤔

Comment on lines 1 to 12
#! /usr/bin/env node --experimental-strip-types --no-warnings
import { readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

const fileName = join(import.meta.dirname, '../Strings.json');
const strings = JSON.parse(readFileSync(fileName, 'utf8'));

for (const key in strings.phrases) {
strings.phrases[key].filepath = 'node_modules/fbtee/Strings.json';
}

writeFileSync(fileName, JSON.stringify(strings, null, 2), 'utf8');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add this comment to the source code plz :-)

@cpojer
Copy link
Collaborator Author

cpojer commented Dec 26, 2024

Thanks for the thoughtful review. I'll make a bunch of fixes like you requested.

PS: in future PRs, it'd be great to focus only on code changes related to the PR goal for easier reviewing.
Code refactors like this would then be discussed in separate PRs. :-)

Haha, yes… 🫠

@cpojer cpojer deleted the fbt-list branch December 27, 2024 01:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants