Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intl.DateTimeFormat.prototype.format outputs a string with a unicode <space> instead of an ASCII <space> #45938

Closed
dunnza opened this issue Dec 21, 2022 · 6 comments

Comments

@dunnza
Copy link

dunnza commented Dec 21, 2022

Version

v19.1.0

Platform

Darwin 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64 x86_64

Subsystem

No response

What steps will reproduce the bug?

const formatter = new Intl.DateTimeFormat('en-US', { hour: 'numeric', minute: '2-digit', hour12: true, timezone: 'America/Chicago' });
const output = formatter.format(new Date(2022, 11, 21, 8, 0, 0));
console.log(output); // '8:00 AM'

console.log(/ /.test(output)); // false
console.log(/[^\u0000-\u00ff]/.test(output)); // true

const replaced = output.replace(' ', '');
console.log(replaced); // '8:00 AM'

How often does it reproduce? Is there a required condition?

Every time as far as I can tell

What is the expected behavior?

const formatter = new Intl.DateTimeFormat('en-US', { hour: 'numeric', minute: '2-digit', hour12: true, timezone: 'America/Chicago' });
const output = formatter.format(new Date(2022, 11, 21, 8, 0, 0));
console.log(output); // '8:00 AM'

console.log(/ /.test(output)); // true

const expected = output.replace(' ', '');
console.log(expected); // '8:00AM'

What do you see instead?

const formatter = new Intl.DateTimeFormat('en-US', { hour: 'numeric', minute: '2-digit', hour12: true, timezone: 'America/Chicago' });
const output = formatter.format(new Date(2022, 11, 21, 8, 0, 0));
console.log(output); // '8:00 AM'

console.log(/ /.test(output)); // false

const observed = output.replace(' ', '');
console.log(observed); // '8:00 AM'

Additional information

No response

@addaleax
Copy link
Member

Related ticket/comment: #45171 (comment)

It's not obvious to me what's wrong with the newer behavior, though.

@bnoordhuis
Copy link
Member

bnoordhuis commented Dec 22, 2022

This was fixed in v19.2.0, V8's date parser now recognizes U+202F.

edit: I assume the / /.test(output) was for expository reasons only because that was never a good assumption with localized dates.

@dunnza
Copy link
Author

dunnza commented Dec 22, 2022

I'm testing it on v19.2.0 and I'm still getting the issue.

edit: I assume the / /.test(output) was for expository reasons only because that was never a good assumption with localized dates.

Maybe, but it's not intuitive why it doesn't work? Given other examples:

// in v19.2.0 using `output` from the original example ("8:00 AM")
/ /.test(output); // false
/\u202F/.test(output); // true
/\u0041/.test(output); // true (the letter 'A')
/A/.test(output); // true

I guess I would have thought /A/.test(output) would fail just like the test for space?

One weird thing about this behavior is that it deviates from how the browser does it? The most up-to-date Chrome, Firefox, and Safari all allow output.replace(' ', '') to work as (I) expected. I guess I don't know what version of v8 my Chrome is running but I would think they wouldn't change that for backwards compatibility reasons?

My real-world use case is that in our app we styleize the times as 8:00a instead of 8:00 AM. And to do this we replace the space. This works fine in the browser but my tests break when I run them with v19.

@addaleax
Copy link
Member

addaleax commented Dec 22, 2022

I guess I would have thought /A/.test(output) would fail just like the test for space?

Why?

I guess I don't know what version of v8 my Chrome is running but I would think they wouldn't change that for backwards compatibility reasons?

The question is less your V8 version and more the ICU version that’s being shipped. Chrome has also upgrade to ICU 72.1 now so I’d expect at some point they’ll adapt the same behavior.

And to do this we replace the space.

As Ben says, programmatically relying on localized date formats is tricky, and you may not want to be using Intl.DateTimeFormat in the first place if you really just want to re-format the output. In any case, using \s or \s+ inside the regexp should solve this issue for you and just be more robust in general.

@targos
Copy link
Member

targos commented Dec 22, 2022

Chrome Canary (at least) updated to ICU 72.1 and shows the same behavior as Node.js 19

@dunnza
Copy link
Author

dunnza commented Dec 22, 2022

I guess I would have thought /A/.test(output) would fail just like the test for space?

Why?

I'm quite out of my depth here. 😅 I guess I assumed that the characters I type into a string were ASCII and then maybe mapped to their unicode counterparts or something? But that wouldn't make sense since I can paste unicode characters into strings...

For me it was just frustrating that there are multiple space characters and the one that I type in with my space bar is not the same that is output from Intl.DateTimeFormat when there's really no way to see it just by looking at console logs and things like that. 😁

Thanks for walking me through this, I'll update our code to rip out whitespace rather than just a space!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants