
Releases: cometkim/unicode-segmenter

unicode-segmenter@0.11.3

07 Dec 18:45
4a9be75

Patch Changes

  • a5f486f: Fix bloat in the NPM package.

    package.tgz was bloated mostly by CommonJS interop code and sourcemaps.

    However, sourcemaps aren't necessary here because the package ships its sources as is,
    and the CommonJS output shouldn't differ much from them.

    This is now fixed by using a simpler transpilation for the CommonJS entries and removing the sourcemap files.
    Inaccessible entries were also removed.

    As a result, the total unpacked package size is down from 250 KB to 135 KB.

    Note: Node.js v22 will stabilize require(ESM), which will allow CommonJS projects to use this package without having to maintain separate entries. I'm very excited about that, and looking forward to it becoming more "common". The first major release may consider ending support for CommonJS entries and TypeScript's "Node" resolution.

unicode-segmenter@0.11.2

29 Nov 17:09
28c3475

Patch Changes

  • 94ed937: Improved performance and bundle size a bit.

    It seems that using TypedArray isn't helpful,
    and dereferencing many prototypes may cause deoptimization.

    A plain Array is good enough as long as it stays packed.

  • de71269: Update Intl type definition

unicode-segmenter@0.11.1

24 Nov 03:23
f5bf190
Compare
Choose a tag to compare

Patch Changes

  • 9d688d8: grapheme: Rename countGrapheme() to countGraphemes(). The existing name remains as a deprecated alias.
  • be49399: grapheme: Add splitGraphemes() utility
  • 5e86659: grapheme: add more detail to API JSDoc
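
To make concrete what counting and splitting graphemes means, here is a minimal sketch. It does not use unicode-segmenter itself: it relies on the built-in Intl.Segmenter as a stand-in, and the names countGraphemesSketch and splitGraphemesSketch are hypothetical, chosen only to echo the utilities named above. The library's own signatures and return types may differ.

```typescript
// Stand-in sketch: uses the runtime's Intl.Segmenter, NOT unicode-segmenter.
// The `any` cast avoids depending on a particular TypeScript `lib` setting.
const SegmenterCtor = (Intl as any).Segmenter;

// Hypothetical helper: returns each user-perceived character as its own string.
function splitGraphemesSketch(text: string): string[] {
  const segmenter = new SegmenterCtor(undefined, { granularity: "grapheme" });
  const out: string[] = [];
  for (const item of segmenter.segment(text)) {
    out.push(item.segment);
  }
  return out;
}

// Hypothetical helper: counts user-perceived characters, not UTF-16 code units.
function countGraphemesSketch(text: string): number {
  return splitGraphemesSketch(text).length;
}
```

The point of such utilities is that "e" followed by a combining accent (U+0301) is two code units but one grapheme, so `text.length` overcounts.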

unicode-segmenter@0.11.0

02 Nov 21:20
8d8cd4f
Compare
Choose a tag to compare

Minor Changes

  • ffb41fb: Code size is significantly reduced; the minified JS is now about half its previous size.

    There are also some performance improvements.
    Not by much, but reducing the size without giving up performance is a huge win.

    • Compressed the Unicode data further using Base36.

    • Changed the internal representation to TypedArray to improve its access pattern.

    • Shrank the grapheme lookup table size.
      This does not impact performance except in some edge cases such as Hindi and Demonic, but it does reduce the bundle size.

  • 9e0feca: Update to Unicode® 16.0.0
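
As an illustration of the Base36 compression idea mentioned in ffb41fb, the sketch below stores sorted code point ranges as base-36 deltas. The encoding scheme and the helper names (encodeRanges, decodeRanges) are assumptions made for this example; the library's actual data format is not described in these notes and is likely different.

```typescript
// Hypothetical format for illustration only, NOT the library's data layout.
// A range is an inclusive [start, end] pair of code points; input must be sorted.
type CodePointRange = [number, number];

// Encode each range as (gap from previous end, range length), both in base 36.
function encodeRanges(ranges: CodePointRange[]): string {
  const parts: string[] = [];
  let prevEnd = 0;
  for (const [start, end] of ranges) {
    parts.push((start - prevEnd).toString(36), (end - start).toString(36));
    prevEnd = end;
  }
  return parts.join(",");
}

// Reverse the encoding by accumulating the deltas again.
function decodeRanges(data: string): CodePointRange[] {
  const ranges: CodePointRange[] = [];
  if (data === "") return ranges;
  const tokens = data.split(",");
  let prevEnd = 0;
  for (let i = 0; i + 1 < tokens.length; i += 2) {
    const start = prevEnd + parseInt(tokens[i], 36);
    const end = start + parseInt(tokens[i + 1], 36);
    ranges.push([start, end]);
    prevEnd = end;
  }
  return ranges;
}
```

Deltas keep the numbers small, and base 36 packs each number into fewer characters than decimal, which is why this style of encoding tends to shrink embedded data tables.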

unicode-segmenter@0.10.1

02 Sep 18:07
66e7f83
Compare
Choose a tag to compare

Patch Changes

  • 3665cf7: Fix Hindi text segmentation

unicode-segmenter@0.10.0

01 Sep 03:56
c1e6464
Compare
Choose a tag to compare

Minor Changes

  • 73f5e6b: Significantly reduced bundle size by compressing the data table. The grapheme segmentation library now takes only 6.6kB (gzip) or 4.4kB (brotli)!

Patch Changes

  • b045320: Fix isSMP, and add more plane utils (isSIP, isTIP, isSSP)
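
For readers unfamiliar with the plane names, here is a sketch of how such checks can be written. The function names mirror the ones in the note above, but the bodies are an assumption for illustration and not the library's source. The plane numbers themselves come from the Unicode standard: SMP is plane 1, SIP is plane 2, TIP is plane 3, and SSP is plane 14.

```typescript
// Illustrative reimplementation; the library's real code may differ.
// Each Unicode plane spans 0x10000 code points, so the plane index
// is the code point shifted right by 16 bits.
function planeOf(cp: number): number {
  return cp >>> 16;
}

// Supplementary Multilingual Plane (U+10000..U+1FFFF)
function isSMP(cp: number): boolean {
  return planeOf(cp) === 1;
}

// Supplementary Ideographic Plane (U+20000..U+2FFFF)
function isSIP(cp: number): boolean {
  return planeOf(cp) === 2;
}

// Tertiary Ideographic Plane (U+30000..U+3FFFF)
function isTIP(cp: number): boolean {
  return planeOf(cp) === 3;
}

// Supplementary Special-purpose Plane (U+E0000..U+EFFFF)
function isSSP(cp: number): boolean {
  return planeOf(cp) === 14;
}
```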

unicode-segmenter@0.9.2

05 Jul 05:54
03d1051
Compare
Choose a tag to compare

Patch Changes

  • 447b484: Fix the polyfill so that it does not override an existing implementation, and so that it is assigned as a non-enumerable property
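
The two properties described in this fix can be sketched with Object.defineProperty. The helper name installIfMissing and the generic target object are hypothetical; this is not the library's polyfill code, only one plausible shape for it.

```typescript
// Hypothetical helper: installs `impl` under `name` only when nothing is
// there yet, and marks the new property as non-enumerable.
function installIfMissing(
  target: Record<string, unknown>,
  name: string,
  impl: unknown,
): boolean {
  // Respect an existing (for example, native) implementation.
  if (typeof target[name] !== "undefined") {
    return false;
  }
  Object.defineProperty(target, name, {
    value: impl,
    enumerable: false, // hidden from Object.keys() and for...in
    writable: true,
    configurable: true,
  });
  return true;
}
```

Built-in members of global namespaces are typically writable, configurable, and non-enumerable, which is why the sketch uses those descriptor values rather than a plain assignment.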

unicode-segmenter@0.9.1

14 Jun 02:26
6d02503
Compare
Choose a tag to compare

Patch Changes

  • 04fe2fc: Fix sourcemap reference error

    • Include missing sourcemap files for transformed cjs entries
    • Remove unnecessary transforms for esm entries and remove source map reference

unicode-segmenter@0.9.0

13 Jun 19:29
56b3b74

Minor Changes

  • 657e31a: semi-breaking: removed _cat from grapheme cluster segments because it was useless

    Instead, added _catBegin and _catEnd, the categories at the beginning and end of each segment, which may be useful for inferring which boundary rules were applied.

unicode-segmenter@0.8.0

12 Jun 17:02
2e84f7f

Minor Changes

  • f5ec709: Deprecated isEmoji(cp) in favor of isExtendedPictographic(cp).

    The behavior is the same, but the old name was easily confused with the \p{Emoji} Unicode property.

    (Note: \p{Emoji} is not useful in actual use cases, see)
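
The difference between the two Unicode properties can be checked with plain regular expressions, independently of this library. The sketch below uses a hypothetical classify helper. It relies on the fact that \p{Emoji} also covers ASCII digits, "#", and "*", which is part of why it is rarely what callers want.

```typescript
// Built with the RegExp constructor so the snippet does not depend on the
// compile target supporting Unicode property escapes in regex literals.
const emojiProperty = new RegExp("^\\p{Emoji}$", "u");
const extendedPictographicProperty = new RegExp("^\\p{Extended_Pictographic}$", "u");

// Hypothetical helper: reports which properties a single character has.
function classify(ch: string): { emoji: boolean; extendedPictographic: boolean } {
  return {
    emoji: emojiProperty.test(ch),
    extendedPictographic: extendedPictographicProperty.test(ch),
  };
}

// The digit "1" has the Emoji property but is not Extended_Pictographic.
console.log(classify("1"));
// A face emoji such as U+1F600 has both properties.
console.log(classify("\u{1F600}"));
```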

Patch Changes

  • 5bf4d29: Fix the TypeScript definition for GraphemeCategory enum