Skip to content

Latest commit

 

History

History
324 lines (171 loc) · 23.2 KB

notes-2022-03-17.md

File metadata and controls

324 lines (171 loc) · 23.2 KB

2022-03-17 ECMA-402 Meeting

Logistics

Attendees

  • Shane Carr - Google i18n (SFC), Co-Moderator
  • Romulo Cintra - Igalia (RCA), MessageFormat Working Group Liaison
  • Eemeli Aro - Mozilla (EAO)
  • Daniel Minor - Mozilla (DLM)
  • Myles C. Maxfield - Apple (MCM)
  • Yusuke Suzuki - Apple (YSZ)
  • Richard Gibson - OpenJS Foundation (RGN)
  • Ujjwal Sharma - Igalia (USA), Co-Moderator
  • Frank Yung-Fong Tang - Google i18n, V8 (FYT)
  • Leo Balter - Salesforce (LEO)
  • Younies Mahmoud - Google i18n (YMD)
  • Zibi Braniecki - Mozilla (ZB)

Standing items

Importance of Our Work

SFC: Reminder that the work we do helps bring knowledge to underserved communities all around the world, which is especially crucial in times of turbulence.

Status Updates

Editor's Update

USA: We merged Linearize reading #650; still working to cut 2022 edition

MessageFormat Working Group

RCA : We’ve been meeting with CLDR-TC working on the three existing proposals, there is progress scoping and resolving differences between proposals

EAO: The MessageFormat 2.0 spec may not include a message bundle spec. Which means we would need to think about what this means for the ECMA proposal. Do we add parsing for a single resource, or do we create the bundle spec elsewhere?

SFC : Those are really good questions , when CLDR-TC discussion it’s over it’s a good moment to have those issues discussed

EAO: Mentioning this now because it’s likely to happen but it hasn’t happened yet.

SFC: Yes, thanks about that.

Proposal Status Changes

https://github.com/tc39/ecma402/wiki/Proposal-and-PR-Progress-Tracking

SFC: We have been making progress on MDN and Test262, also thanks to Romulo. We don’t have polyfill coverage anymore because Long isn’t working actively on it anymore. This isn’t great but I’m hoping that polyfills will come along and the tests and MDN would help drive polyfill production as well.

Some of these PRs look old and this needs another pass.

RCA: A few things were worked on more recently, will update.

USA: The Stage 3 proposals have been implemented in JSC. Is that correct ?

YSZ : Yes I’ll update the Table with to reflect the actual status of Intl.Locale Info , NumberFormat v3 and Intl Enumaration

SFC: That’s great to hear, thanks for the update.

Pull Requests

SFC: I will wait for Frank to join before discussing his pull request.

USA: If we have time we can perhaps discuss #653. It’s not normative but it’s potentially precedent setting and we could discuss it at more length. That said, there’s not enough consensus around it.

Editorial: Rename [[SegmenterGranularity]] to [[Granularity]]

#653

USA: I think we don’t have enough consensus to merge this

SFC: The position of the editors seems good to me.

Proposals and Discussion Topics

Web compatibility: Allow the string "true" for "useGrouping"?

tc39/proposal-intl-numberformat-v3#74

RCA: This is an issue that NumberFormat accepts the string but NumberFormat v3 doesn’t since GetOption was changed to GetBooleanOption. The bigger issue is that this isn’t compatible with useGrouping. There’s a few ways to deal with this, but we should still fix this somehow.

SFC: We should definitely address this if it’s a web compatibility issue. What does true correspond to?

RCA: It corresponds to “always”.

SFC: Great, we should use that then. I’m glad Andre reported it and I’m disappointed that I didn’t notice it sooner.

MCM: I think “true” is scarier than “false” since previously “false” had the same behavior as boolean true.

SFC: What’s exactly wrong here? Is it an exception or an invalid value?

YSZ: I think this is an exception.

SFC: An approach could be that all values are classified as truthy or falsy, which would recreate the same behavior as before. Either way, Web Compat is the biggest priority here. Perhaps we could generalize this somehow.

RCA: What do you mean by “generalizing”?

SFC: Previously the value was coerced to boolean so anything could appear there. The proposal changed that to only allow actual booleans and a small set of strings. This is a web compat concern and I’m glad it was brought up. We should probably continue to allow everything we allowed before and only change behavior for those special strings.

YSZ: We are supportive to change the behavior to preserve the most of compat cases by casting values to boolean

SFC: Yes, I agree with that behavior, we should do that.

Conclusion

We should preserve most of compat cases by looking for the exact string enum values and then casting everything else to a boolean.

Should InitializeNumberFormat throw a TypeError when "roundingIncrement" isn't supported? #68

tc39/proposal-intl-numberformat-v3#68

RCA: This one’s shorter. It’s just required to get some alignment with the others. The errors we throw aren’t consistent. In NFv3, we are throwing a RangeError but in dateStyle/timeStyle, we throw a TypeError. Should we be consistent with the latter or keep things as-is.

USA: We should just throw whichever error makes more sense in whichever case, right?

SFC: I don’t have an opinion here but we should be consistent.

USA: Personally, I feel that this is fine? These are different conditions and the two errors seem to make sense.

SFC: I think it’s a common problem, perhaps we need a better shared understanding of what corresponds to a RangeError and TypeError. I agree with Andre that this is the closest case in the spec when two options are conflicting with each other and we throw an error. If DTF throws a TypeError I think it makes sense to throw a RangeError in NFv3. Is that our conclusion? But USA you disagree with that, right?

USA: I think according to my personal mental model, the type of error we use here should help the programmer debug the throw. Perhaps we need a third kind of error, a ValueError in ECMAScript?

RCA: I feel that RangeError is the more appropriate type of error here, for instance we use it which the value is out of bounds like a max value being lower than a min value.

RGN: I think RangeError is mostly for single-value cases where the data is out of bounds or not in a finite list of acceptable values. I haven’t checked other uses of ECMA-402 or 262, but it seems like TypeError is more of a generic fallback. But specifically, a bug relating to interaction between multiple options does not seem like a RangeError.

SFC: Based on what Richard said, I agree with his position. A RangeError is more specific than a TypeError and so we should use a TypeError for both the latter lines. The first line should remain a RangeError. However, for the third line, an argument could be made that it is a RangeError since it basically imposes a range of 1.

RCA: I still think L24 is some kind of RangeError because of that.

SFC: We could keep L24 as a RangeError, we did this currently because ICU doesn’t support anything else here. So should we proceed with RangeError, TypeError and RangeError?

RGN: That should work.

USA: I agree. Should we record this for future proposals?

SFC: I’m glad we have this recorded in the notes, but perhaps this is something that should be added to the style guide?

USA: Sure, I’ll do so.

Conclusion

Change line 23 (the line in Andre's post) to TypeError.

USA to consider adding this to the ECMA-402 style guide.

Cutting the ES2022 release

LEO: Sorry this is coming at such short notice, but we need to make the cut for ECMA 402 2022 edition. We need to bring this cut to the next TC39 plenary for a vote and send it to the next ECMA GA in June. What I need from this group as I briefly talk to Richard (I need the other editors to also +1 this) is agreement that we agree with the current live version. If we need to let something in, we should talk about it right now. If anyone has requests, we could address that but otherwise we’ll make a cut of the spec as it is and present it to TC39 in plenary. I will make a branch to the spec and we’ll have to print the PDF which is the most annoying part of editorship but hopefully, we have some support from ECMA for this. In an ideal scenario, I’d have finished this by now, so apologies, but let’s cut the live version right away.

SFC: I think #647 is just pending approval from this group so we could wait until the end of the meeting to see if we can get that in? Apart from that, I’m happy with the rest.

USA: Apologies for not reviewing the other editorial PRs by RGN, but I think the most important ones were Intl.Segmenter and “linearize reading”. With those two done, I’m confident to move ahead.

RGN: I agree, the “linearize reading” one was the most important one but the other ones are fine. We can get them in if we can, but it’s not necessary.

LEO: I would generally try to print the spec on all three major browsers and see which one does it best, but yeah, I’ll wait for those last two PRs and then we could discuss offline and start the process of cutting the release.

Normative: Disallow '_' for calendar , referring to UTS35

#647

SFC: We discussed this PR in our meeting two months ago, YSZ had an opinion and now we have a pull request, can we merge this? YSZ, can you confirm your views on this PR?

YSZ: This looks good to me. We should make the parsing consistent with Unicode locale IDs.

SFC: What about including this in ES2022?

RGN: No objection.

USA: I think it's good.

EAO: If it’s in, it’s in!

Conclusion

Consensus to merge.

Consensus to include it in the ES 2022 cut.

Units Conversion Proposal

https://github.com/younies/unit-conversion-proposal

Presenter: YMD

Slides

YMD: Presents slides

YMD: In order to make Intl locale-aware units conversion

ZB: ???

SFC: The end goal is that we want to build our way to unit preferences, locale-aware unit conversion. ZB’s comment says that it’s not a Intl feature, which is understandable. Is this something that should be exposed in 262 or 402 or not exposed at all? I think this meets the bar for Stage 1, and we can start asking more questions within Stage 1, but this is definitely an area that should be explored.

ZB via chat: +1 to Shane, but with recognition that at this point I'd be quite strongly against this advancing to Stage 2 without substantial justification/rationale/changes.

FYT: I have a question. I don’t understand in what sense is it a proposal to 402 instead of 262? When I convert from meters to inches, it works irrespectively of the users’ locale.

YMD: The locale-aware unit conversion is related to 402, because you have the locale and the unit. For complex unit conversion, it could go into 262 or 402; we can discuss that before we advance to Stage 2. It's a valid question.

FYT: This is the part that I don’t quite understand. I read the explainer on the repo, where you convert values between units where locale never comes into the picture. All the examples have the same story, except the last part. Is the last part where it comes into play?

YMD: Yes, that’s the locale-aware unit conversion. For a US locale, we’d need to convert lengths to feet and inches.

FYT: I still don't understand. You want to convert from meter to what?

YMD: For example, you want to convert from meter to "person-height” in en-US".

FYT: So you want to find out the unit of the person-height. So anything unrelated to that part is locale-neutral. So you only need one tiny part in Intl, which is to tell me the unit for person-height. The rest has nothing to do with locale. These are two different things.

YMD: The idea is a question from Zibi… to have locale-aware conversion, we need to have all these other components. We could split this proposal into two, one for 262 and the locale-specific parts could be kept in 402.

EAO: I’m very much with FYT on this one. This seems like three different APIs cobbled together rather than one comprehensive API. One is the locale-aware conversion, the second is the mathematical conversion and the third is a number formatting essentially. I don’t think this makes sense as one single proposal. We could split it into three and then see which parts are relevant to us.

MCM: I want to discuss overall feedback on the proposal, apart from the venue. For the web platform as a whole, this proposal makes sense; it seems valuable to people. Putting it all in Intl makes sense, but putting parts of it here and parts of it elsewhere also makes sense.

SFC: My comment was specifically regarding what FYT and EAO said. Getting a unit is not the full operation. The unit varies based on the size of the value. Road units in en-US for example is yards for short distances and miles for longer distances. Sometimes there could be 3 or more units in the list. If we were to split this API up, we’d need to decide what the output would look like and what the boundaries would be. It’s not as simple as a sum of parts, but it’s all one full comprehensive API.

YMD: It's reasonable to split the single/complex unit conversion and locale-aware unit converter. I don't know if locale-aware conversion is going to be an issue with the unit preferences issue.

FYT: I understand what SFC is talking about, it’s not just a single unit, it’s a system of multiple units. My idea is that the API can have a similar divide as the 262/402 divide. In cases where we don’t need locale information, we should not keep it in 402, even if it’s locale-neutral. For example, if you’re converting from meters and feet, you could simply write tests for it, but if you throw locales into the mix, things get way more complicated.

USA: First of all, I wanted to mention that EAO mentioned, which is that we're forgetting a third component: a unit formatting API. One of the parts is to pick the unit for the region. The second part is to convert from my unit to that unit. The third part is how to display that on the screen. So that's an argument against bifurcating this. So I could use Intl to get the unit, 262 to convert, and then back to Intl to format. A simple comprehensive API may be simpler. That said, if people insist on breaking this API up, I think those three parts in those venues would make sense.

YMD: You mentioned number formatting. We have a separate proposal to improve unit formatting.

EAO: I’ve been thinking about this, and the third part in particular, the Number/Unit formatting is particularly important. Is there really another unit system than the imperial system that would pose similar issues?

YMD: There are multiple systems across the world, not just the imperial system. As a matter of fact, the US and UK imperial systems also diverge somewhat.

USA: Another thing is, how high-level do we want this API to be? Do we want an API that does unit conversion, or do we want an API that takes values in meters and localizes it, as a complete black box?

YMD: We need a lower level of abstraction for many reasons,

SFC: YMD already made such a black-boxed proposal previously, which is still in Stage 1; however, we didn’t manage to make progress on it due to the lack of user preferences. Some TC39 delegates said that the unit preferences proposal did not have sufficient value on its own without user preferences; there were concerns regarding how high quality the results would be. Some folks for example would prefer to have specific precision when dealing with the conversions. Another example would be the BMI calculator that YMD talked about in the slides. Therefore, I think there’s a strong rationale behind the proposal as it is.

MCM: I wanted to push back on the idea that user preferences are required for these related proposals. These proposals could be made in a way that user preferences could be added later. But it's easy to imagine that these APIs use a locale string without user preferences and then extended in the future.

USA: To add to MCM, I think the user preferences proposal, where you get the data, is a property of the host. There's not yet a mechanism on the web to get high-quality user preferences. JS is used in other environments where you may have access to that data. For example, the GNOME desktop. Apart from that, I wanted to mention that when I mention a black-box approach, it doesn't rule out things like custom number formatting. You can still do that by calling formatToParts and displaying the final string result yourself. I do understand why a lower-level API may be preferred.

EAO: For a use case pointing away from a black box implementation, I’d been thinking about how I’d use this when plotting values on a graph. When doing that, I believe I’d need access to the number value instead of a formatted string.

FYT: MCM mentioned user preferences. I agree with what he said; we want to support user preferences, but we want the programmer to be able to provide the user preferences. The user preference from the OS may not be the best quality. So the preference should be an explicit input. It should not be implicit. For example, the preferences may need to be synced across devices.

SFC: I don’t think anyone says that user preferences should be implicit. I think the issue in plenary was that we need to have a story about how to handle the user preferences, in a cohesive way, because unless we have a story around how units are formatted according to the users’ preferences, people who use numbering systems that diverge from their region would end up in a disadvantaged position.

FYT: When we have this output, is the output of this API an object, an array or a string? It currently says 3.37 inches, it makes little sense to me. Is it an array?

YMD: It is an array.

FYT: So it is an array of values? I see.

YMD: Can we go for Stage 1 and resolve these issues during that stage?

SFC: +1

FYT: +1

MCM: Sounds like this fits with Stage 1

Conclusion

Approved for Stage 1

Intl.Segmenter with URLs, email addresses, and acronyms

#656

SFC: (Presents issue.) I agree with RGN that we don’t want to have an infinite number of toggles on Segmenter. That said, a fairly good solution of the problem would be adding another level of granularity. There’s already two levels we see here, one used by ICU and the other in V8 and we could add the ICU case as type “token”. RGN’s comment was that this is something we need in Unicode and we can later upstream this in 402. That said, as TC39 delegates, we should be able to make the proposal on the 402 side first to motivate the work on the Unicode side.

MCM: We were thinking that it should start in Unicode and trickle down; this group could make a resolution to say that we want this, but we feel that Unicode should change first.

MCM: I'm curious about the changes that V8 made. What's happening?

FYT: I can explain what’s going on. Before Intl.Segmenter was implemented, Chromium already bundled ICU data. 5-6 years ago, in Blink, they used BreakIterator and encountered this problem. Since it was not desirable for them, they introduced a patch to this effect and then when we got data from the same source. The other issue that SFC raised is that the use case they cared about should also be addressed.

MCM: It sounds like that rather than the V8 engineers adding a new kind of iterator that's similar to word but slightly different, they just changed how word segmentation works. That means that whoever decided to make that change meant that the old mode wasn't useful. So if that's the case, maybe they should upstream their changes.

USA: To respond to that, I feel from what I've heard is that the situation is not too objective. There is one group of people, Blink engineers, who feel that the behavior in Chromium is the more useful ones, and then there are application developers who prefer the ICU behavior.

FYT: Why are there two different views? You think about CLDR. All those things were originally considered rules from a linguist point of view. How should people break human-readable text in a novel or newspaper article? Now we have URLs and email addresses that didn't exist 30 years ago. Traditional linguists don't have rules for that. These things didn't exist. But now it exists in the modern world. We should think about pushing back to CLDR and facing a new reality.

SFC: Regarding why did V8 engineers make this change, a big part of this was because Segmenter was never exposed to the web platform before. What clients feel now while building websites was not considered previously. In my understanding, V8 engineers made this change to do line breaking.

FYT: This is only for word breaking.

MCM: Word breaking is used for a lot of reasons, for instance, double clicking a word selects it.

SFC: The use case by the clients is for building a spell checker. On the other hand, the V8 engineers probably did this in order to implement something like cursor positioning or word selection, not keeping the other use-case for more linguistically correct word breaking in mind, but now it’s more relevant since it’s required in order to build applications like spell checkers.

RGN: The ICU behavior selects domain names as words. The deviation is away that and into breaking them apart. It's only visible in Chrome because other implementations don't tailor ICU in this way. So I'm not sure what you're arguing to upstream.

FYT: I'm not arguing which one is better. In one usage, they prefer to break it down, and in others, they don't.

RGN: What is the "want to break it down" world?

FYT: I'd need to track down why the change was originally made.

RGN: When you have a non-integer number with separators, ICU would want that to be without word boundaries, but I think this example does the opposite here.

MCM: On this topic, in Apple frameworks, we have a whole system for solving this prohbnlem. It's a framework called DatDetectors, and its job is to find structured items in text like phone numbers and URLs. So I feel this is a larger problem than just email addresses. It also detects flight numbers, etc. Recognizing regular numbers is also a complicated task. So I feel that this problem may be out of scope for this particular API. If you want an API that is smarter about structured data, you should use a differn toslution.

SFC: The main bit we want to solve is web compatibility. I know this is a problem since Chrome works in a different way than Firefox and WebKit.

MCM: Another solution we haven’t discussed so far is that Chrome can roll back its change and restore web compatibility.

USA: ???

SFC: As next steps, I think we should do some forensics and find out why exactly the chromium change was made, and then go to ICU/Unicode folks and present the use-cases to them. Depending on the outcome of those conversations, we should come back to this group.

Conclusion

  • FYT to investigate the V8 change
  • Come back to this group if necessary