-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add spec_url data for JavaScript features #2983
Conversation
1e76365
to
4833492
Compare
Thanks for doing this work! It might take a while to get back to you and to agree how to exactly add this to the schema or the repo, but this is an excellent start. Some initial thoughts:
|
I don’t think the spec name is needed and don’t feel strongly at all about it being included. I had thought it could be useful mostly just for the purpose of making the JSON source more human-readable — so that somebody reading or editing the JSON source can quickly see what spec is being referenced. But on balance I guess that’s not worth having the extra data in there, given it’s not something for applications to consume, but instead just for people. So for now I’ll regenerate the patch without the "name" part. If we end up deciding later that it’s useful to have the "name" part in there, I could always easily change the code back later to pull the part back in again.
I strongly agree with that. So I’m glad somebody else does too :)
I agree that it’s bad for people to be reading old or snapshotted specs. So let’s figure out how to not migrate those old spec links from MDN into BCD. But the immediate problem I have is that so far my code that generates the changes in this patch just simplemindedly pulls in every link it finds in a Specifications table in an MDN article. I don’t know of a simple way for the code to programatically discern which spec link from a Specifications table is to a current spec, and which spec link is to and old or snapshotted spec. So, short of going in and removing all the links to old/snapshotted specs from all the MDN articles that have them, the only way I can see to handle it is for me to hardcode a list of old/snapshotted spec into my Python script in this patch, and to make it just drop those old ones on the floor. That’s totally do-able and the level of effort for it isn’t high — at least in most cases, I suspect I’m familiar enough with most specs to know which are the old/outdated ones — so I’m happy to add that to the patch (in the next few days when I can make time to do it).
The same reasoning that the TC39 representatives and other spec folks have for not wanting to reference old specs: If a feature is deprecated, we don’t want to be pointing readers to the whatever place it’s specified — or that it used to be specified. Because its outdated information. In the case of the HTML spec and other WHATWG specs, if a feature really is deprecated, then the spec text for it has actually already been removed from the spec — so the only thing available to link to in that case would be an old/snapshotted previous version of the spec. In fact, in a number of MDN articles I’ve edited that are for now-deprecated features, I removed the HTML spec link (because it’s now 404, since the fragment ID it linked to is no longer in the spec). So the only spec links left in the Specifications table in that case are to any other old/outdated spec versions (e.g., to some link to the W3C copy/fork of HTML — some W3C HTML 5.x version). And IMHO in that case, those links to other outdated specs/copies should rightly be removed from the Specifications table too, which I have done in some cases but not in all (part of the reason being that IMHO, MDN rightly shouldn’t be linking to those W3C 5.x copies/forks of the real HTML spec to begin with — if I had my way, none of those links would be in MDN articles to begin with…) |
Awesome. so then I would expect this to be like this: "__compat": {
"mdn_url": "https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Symbol",
"spec_url": "https://tc39.github.io/ecma262/#sec-symbol-objects",
"support": {
"chrome": {
"version_added": "38"
},
Great! So, I think instead of having a PR for all page trees at the same time, it would be nice to split up this work a little bit so it becomes more manageable and reviewable. For instance, for JS I got the word that we should only point to https://tc39.github.io/ecma262/ in most if not all cases. So, we could start with the One schema question that I think we still need to answer, is whether it is really possible to come up with a single spec for all features, or if there are cases where we actually want more than one spec to be listed. If that's the case. then we would rather have
I see, this makes sense in such cases. Not sure it's true for all deprecated features, as our definition for deprecated seems to be more like "not recommended" as you've found out in the other issue. I'm leaving #1531 here so this PR and the original issue are linked. |
I think there are cases where a feature legitimately has multiple (non-old/snapshotted/obsolete) specs associated with it. So I do think we’re going to need end up allowing for an array value. A common case where I think a feature can have multiple spec URLs is when the feature is an interface for which the base interface (IDL) is defined in one spec (e.g., in the HTML spec) but there are one or more other specs that have a partial interface/IDL definition for that interface. See https://developer.mozilla.org/en-US/docs/Web/API/Performance#Specifications for example. But that’s just one example and there are a lot of other interfaces like that — as well as other kinds of cases where a feature might legitimately have multiple spec URLs associated with it. I guess another alternative is that we could allow the field value to be either a single string or an array. But if we did it that way, then what would we name the field?
That makes me really happy to hear. I hope we can end up using that as a precedent for other cases (e.g., for HTML features, only pointing to https://html.spec.whatwg.org/).
Yep, sounds good — I can update the patch, and after that I can update the title and description of this PR (so we don’t need to close it and reopen a new one). That said, if we start by limiting this to just the
Right. But even in the case where something’s just "not recommended" rather than deprecated formally (as far as the normative spec language) — even in that case IMHO we still don’t want to be pointing readers to the spec about it. I think for the most part we want readers to act going forward as if the feature doesn’t exist — and not, by providing readers with easy access to the spec for it, contribute to further proliferation of the feature continuing to be used in web content (and risk having readers do things like citing the existence of the spec link in MDN to justify their continued use of the feature in spite of it not being recommended). |
Well true, but maybe you could also go differently about this. The Performance API is structured like this and has sub features:
Now, on https://developer.mozilla.org/en-US/docs/Web/API/Performance#Specifications, we probably want to list all of the specs that do extend the Performance interface, but we can get this information from the sub features, right? So, say, we generate spec tables from this repo in the future, and so the interface spec table could walk the whole interface and list sub features' specs automatically. From the data point of view, the individual features have one If you agree that the above case could also be dealt with by just |
Right, but as far as making that determination programatically, then I don’t think that given as input the set of specs in the Specifications that’s in the https://developer.mozilla.org/en-US/docs/Web/API/Performance#Specifications article, there’s any way to programatically determine which of those specs is the one that doesn’t define a partial interface. So the only way I could imagine it could be done is to
Those steps would fail to produce a single URL in the end for any cases of a subfeature that lacks an mdn_url of its own (which there are real cases I know of) or that lack an MDN article that actually exists at whatever the specified mdn_url is (which there are also cases I know of). But doing it that way might get us most of the way there, with only some (relatively) minimal manual editing needed to make the output have only spec URL per feature. So I can make some time to update the Python script in this patch to try to do that. Otherwise, identifying the one spec among the set of specs in a Specifications that should be the one to associate with the interface itself (rather than with one of its subfeatures) would require a human to do some relatively extensive manual editing of the output of the current Python script in this patch. Which would certainly be doable — but it seems like there’s going to be a significant level of effort needed to do that N different times for the N different cases where an MDN article has multiple specs in its Specifications table. It will take a significant amount of time. That’s not to say I don’t think it should be done — I agree that rightly each interface should probably have only one spec URL associated with it. But I’m just not volunteering to be the one who does all the manual editing that would be required to do it — because I’m not sure I can make the time to do it, nor do I even have a good idea at all at this point about how much time it would take. It seems like the work/time required would be equivalent to what’d be needed to review all the current MDN articles that have multiple specs in the Specifications tables, and to reduce them one-by-one to having just a single spec each.
Sure. The question is how well that could be done programatically.
Sure
I agree completely on all counts. I’m just not sure yet if/how well I can programatically generate the data with those characteristics without doing (a lot of) manual editing of the output. But I’ll give it a try and see how close I can get.
Yes, agreed. But going in and doing that investigation is something else that’s going to take additional time, so I’m not sure yet how soon I might be able to get around to doing it. I still have a vague sense that we’re eventually going to find that there really are legitimate cases where we’ll find we need to have multiple spec URLs for a feature — but I’ll be very happy if I end up being proven wrong |
Thanks @sideshowbarker, I don't want to load all the work on you here, it's just that we should come to some agreement on how we go about this. I'm happy to help with this project, as I believe it will make this dataset a lot more powerful (after this you have a mapping between compat-data, specs and mdn docs, which I think doesn't exist at all on the internet and could be powerful). So, to reduce the workload we could split this up to work in chunks and you don't have to do all the parts here. I think we should give |
Yes — I think what we’ll eventually have from this is something that’ll end up being useful in ways we’re not even anticipating yet
OK, yeah agreed — that definitely gives a manageable scope to start out from
Sounds good to me! So for now I’ll update the the patch in this PR soon to reduce it to just the JS features, with a single spec URL for each |
One case I’ve found so far: https://developer.mozilla.org/en-US/docs/Web/API/AudioNode/connect That essentially links to two different signatures of the Anyway, it occurs to me that a general case where it seems we need |
Thanks, interesting case! Sometimes in bcd we nest a level deeper and add sub features, like it is done here for "Promise-based syntax" https://developer.mozilla.org/en-US/docs/Web/API/BaseAudioContext/decodeAudioData#Browser_compatibility Now, a sub feature like "Promise-based syntax" can again come with an mdn_url or a spec_url if needed. I wonder if that would solve this case as we likely want to surface both signatures in BCD anyway? |
I think in cases where that’s really necessary, it’s a useful solution. But for this case it’s not clear to me it’s really necessary. It seems like it could just be adding additional complexity without clear benefit. IMHO for this case, simply having multiple spec URLs would hit the sweet spot as far as providing what’s needed without introducing more complexity than needed.
It’s not clear to me that for BCD (as opposed to the actual prose of the docs) it’s necessary to surface compat data for all possible signatures in cases where there are multiple signatures. The prose content of the docs themselves can provide the information about the different signatures and their parameters. I really wonder if there are actually any cases where method has multiple signatures and a browser supports one particular signature of the method by not any others. If there aren’t any actual cases of that, then it seems like it would be overkill to bake in additional complexity to handle such cases. |
@Elchi3 looking at just the JavaScript features, it seems to me there is another type of case where it appears necessary to allow multiple spec URLs. So I’d like to get your decision about how to handle it. See https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/toLocaleString The Specifications table in that article has links to two current specs:
The text of the ECMA 402 spec section there says this:
…which on the face of it would seem to suggest we should then just drop the ECMA 262 spec link. However, reading the text at https://tc39.github.io/ecma262/#sec-array.prototype.tolocalestring, I see:
OK so far, because that’s restating what the ECMA 402 spec also says: ECMA 402 supersedes this. But the next sentence is this:
That seems to be an important statement. It makes this part of ECMA 262 non-redundant with 402. Further, it says:
So, given there are in fact implementations that do not include ECMA-402 support, then it seems necessary to continue to cite both spec URLs — because the information in ECMA-262 is necessary for anyone wanting to know what the required behavior is for any implementation that doesn’t support ECMA-402. (And of course the information in ECMA-402 is necessary for any anyone wanting to know what the required behavior is for any implementation that does support ECMA-402.) So it seems to me that for this case, there’s no single spec URL that we can choose to link in BCD. And in addition to
|
Thanks for your continued work on this, @sideshowbarker. This is another really interesting case and maybe there it would make sense for toLocaleString to have both specs in the data. If I would be forced on a single url per feature, I would probably have the data like this: (the main feature has the ecma262 link, and the two subfeatures have the ecma402 link) {
"javascript": {
"builtins": {
"Array": {
"__compat": {
"mdn_url": "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/toLocaleString",
"spec_url": "https://tc39.github.io/ecma262/#sec-array.prototype.tolocalestring",
"support": {
"chrome": {
"version_added": "38"
},
"locales": {
"__compat": {
"description": "Optional <code>locales</code> parameter",
"spec_url": "https://tc39.github.io/ecma402/#sup-array.prototype.tolocalestring",
"support": {
"chrome": {
"version_added": null
},
}
},
"options": {
"__compat": {
"description": "Optional <code>options</code> parameter",
"spec_url": "https://tc39.github.io/ecma402/#sup-array.prototype.tolocalestring",
"support": {
"chrome": {
"version_added": null
},
}
}, However, I can see that it would also make sense to list both for the main feature. So lets assume we would introduce {
"javascript": {
"builtins": {
"Array": {
"__compat": {
"mdn_url": "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/toLocaleString",
"spec_urls": [
"https://tc39.github.io/ecma262/#sec-array.prototype.tolocalestring",
"https://tc39.github.io/ecma402/#sup-array.prototype.tolocalestring"
],
"support": {
"chrome": {
"version_added": "38"
},
"locales": {
"__compat": {
"description": "Optional <code>locales</code> parameter",
"spec_urls": [
"https://tc39.github.io/ecma402/#sup-array.prototype.tolocalestring",
],
"support": {
"chrome": {
"version_added": null
},
}
},
"options": {
"__compat": {
"description": "Optional <code>options</code> parameter",
"spec_urls": [
"https://tc39.github.io/ecma402/#sup-array.prototype.tolocalestring",
],
"support": {
"chrome": {
"version_added": null
},
}
}, So, I still believe multiple specs would be an edge case, so maybe we should also allow I still kind of wish we could have |
I think an array makes sense. I don’t think we need an object structure.
I think it doesn’t matter. But if we decide it does matter later, I guess we could adopt a convention where the first URL is whatever’s seems to be the primary spec reference for the feature. But as far as programmatically seeding the BCD data from what’s in MDN now, the order will just end up being whatever the order is currently in the Specifications table it’s pulled from.
Yes, agreed. It will be.
Yes. I can update the JSON schema to make it allow both a string or array for the value.
Yes, I think
Summary I get from our discussion is that it seems to make the most sense to have a I’ll go ahead and generate a patch with it that way — just for the JavaScript features for now (as we’d discussed here previously) — and then we can take a look at the result of that. |
Sounds good to me. Lets try this out. Thank you again for all your work! |
4833492
to
9aa1dd2
Compare
9aa1dd2
to
840e878
Compare
OK, (force) pushed 840e878 to update this PR branch, so the patch here is now restricted to just updating the JavaScript features — with the
Thanks for your guidance and feedback on making the patch what it should be |
840e878
to
fbeec5b
Compare
Just now force pushed fbeec5b as an update, to get pulls some URLs from MDN for a few cases, after I updated the spec URLs in the corresponding MDN articles. Note that if you’re interested in doublechecking yourself the behavior of the Python script I made to generate the patch, you can run it like this:
|
ec534d6
to
eca786d
Compare
I did some reviewing today, will attempt to do more soon. One thing that I like to see changed is that the |
eca786d
to
73ba6d5
Compare
Ah yeah, agreed — that’s where I should have thought to put it to begin with Anyway, I’ve made it so now — thanks! |
73ba6d5
to
11740bd
Compare
Per mdn#2983 (comment) review comments. This change should be squashed before this is merged to master.
OK, updated the script to no longer drop those. See 6b35508
After running the updated script, all those do now have spec_url
OK, updated the script to go one level deeper (5 levels).
After running the updated script, all those do now have spec_url
Yeah, I think that case (and any others like it that we might come across) would need to be done with a manual edit to the BCD source. That object doesn’t have a Although it seems like that could be addressed by adding an On that note, 6b35508 also includes another change which is: It ensures that So if we want those subfeatures to have their own spec URLs, it’s another case where we’d need to make manual edits to the BCD source to add them — because unless they’re each actually separate articles in MDN (rather than separate list items in the same article), then there’s no way in MDN to associate separate Specifications information with each of them. |
@Elchi3 Is this waiting on some action from me…? |
No, it is waiting on me. I'm trying to get back to this, but I'm swamped with work. Thanks for your patience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this!
- I have one minor schema change
- I guess for finally committing this, we don't want to commit the add-specs.py file?
I'm not yet 100% sure I understand the interdiff: 6b35508
It does remove some things that look strange to me. Like the removals in javascript/operators/logical.json. Or am I not reading the interdiff and total diff right?
To speed up review could you create something like this gist https://gist.github.com/Elchi3/7bd702e7b89692afa709de76903b1b7f but also with the bcd "query strings".
So, like this:
javascript.builtins.array.pop: https://tc39.github.io/ecma262/#sec-array.prototype.pop
javascript.builtins.array.push: https://tc39.github.io/ecma262/#sec-array.prototype.push
javascript.builtins.array.push.something: <no-spec-url>
javascript.builtins.something: <no-spec-url>
That way, we could quickly see which item gets which spec_url and also when we provide no spec-url.
Thanks again, this is so cool, and I will try to get it merged before I go into holidays later in December.
Those are due to the fact I updated the script to ignore any mdn_url that has a fragment ID, for the reason explained in #2983 (comment):
But I see that javascript/operators/logical.json doesn’t actually have a single parent feature; instead it has just those three subfeatures, all of which are for documented in the same MDN article but just in separate sections. That’s… atypical. It seems to me to expose some problems with how the information is structured in MDN and how the data is structured in BCD — and suggests that maybe the MDN article should be split into separate articles, and/or the BCD data should be structured different.y Otherwise I’m not sure how to address that case. But if you prefer, I can just remove the logic from the script that causes any mdn_url to be ignored if it contains a fragment ID — but then that’s going to cause the general problem described in #2983 (comment). |
Yup — will get that generated some time this weeken |
Created at https://github.com/w3c/browser-compat-data/wiki/JavaScript-spec_url-list |
7b6ea8c
to
e7094ab
Compare
Per mdn#2983 (comment) review comments. This change should be squashed before this is merged to master.
Yeah so in 878552d I’ve gone ahead and removed it |
This change adds spec URLs in the `*.json` sources for all JavaScript features that have an `mdn_url` for an MDN article with a **Specification(s)** table — with the exception that no spec data is added for any cases where a URL found in an MDN **Specification(s)** table has no fragment-ID part. The new field that holds the spec-URL data is named `spec_url`, and its value is either a single URL (in the case where the feature is only associated with a single specification references), or any array of URLs (in the case where the feature is associated with multiple specification references). The change also updates `schemas/compat-data.schema.json` to allow the additional spec data and to specify the structure it must conform to.
e7094ab
to
604dbe9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this initial landing. Some things might need manual adding, but that's fine I think.
Happy New Year to you @sideshowbarker and thanks for your patience with this. 👍
@Elchi3 🎉 thanks much! |
This change adds spec URLs in the
*.json
sources for all JavaScript features that have anmdn_url
for an MDN article with a Specification(s) table — modulo the following exception:The new field that holds the spec-URL data is named
spec_url
, and its value is either a single URL (in the case where the feature is only associated with a single specification references), or any array of URLs (in the case where the feature is associated with multiple specification references).The change also:
Includes an
add-specs.py
script that can be used to (re)generate all the spec data and update all the*.json
sources. (The script works by scraping MDN Specification(s) tables.)Updates
schemas/compat-data.schema.json
to allow the additional spec data and to specify the structure it must conform to.