JSON cannot be a subset of JavaScript #1681

Closed
erights opened this issue Aug 29, 2019 · 26 comments · Fixed by #2257

Comments

@erights

erights commented Aug 29, 2019

In reforming the JavaScript treatment of \u2028 and \u2029, we frequently described the rationale as ensuring that JSON is a proper subset of JavaScript. Indeed, this brings it much closer. But we cannot repair the remaining discrepancy:

[{"__proto__": []}]

As JavaScript, this makes a singleton array whose element inherits from an empty array. As JSON, this makes a singleton array whose element inherits from Object.prototype and has an own data property named __proto__ whose value is an empty array.
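
A minimal sketch of the discrepancy described above, evaluating the same text once as JavaScript (through an indirect `eval`) and once with `JSON.parse`. The variable names are illustrative only.

```javascript
// The same source text, read two ways.
const text = '[{"__proto__": []}]';

// As JavaScript: the quoted __proto__ key sets the element's [[Prototype]].
const evalElement = (0, eval)(text)[0];
// As JSON: __proto__ is an ordinary own data property.
const jsonElement = JSON.parse(text)[0];

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

console.log(Array.isArray(Object.getPrototypeOf(evalElement)));       // true
console.log(hasOwn(evalElement, "__proto__"));                        // false

console.log(Object.getPrototypeOf(jsonElement) === Object.prototype); // true
console.log(hasOwn(jsonElement, "__proto__"));                        // true
```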

Because of this, JSON will forever only approximate a subset of JavaScript.

@erights
Author

erights commented Aug 29, 2019

FWIW, I put an array on the outside to avoid the issue that JSON approximates a subset of the JavaScript expression grammar, which is distinct from any JavaScript start production.

@devsnek
Member

devsnek commented Aug 29, 2019

I feel like the "subset" part is about syntax, not runtime behaviour.

@erights
Author

erights commented Aug 29, 2019

I feel like the "subset" part is about syntax, not runtime behaviour.

If language X's syntax is a subset of language Y's syntax, but X and Y associate these with, say, completely different meanings, I would say only "X's syntax is a subset of language Y's syntax."

A language consists of a lot more than syntax.

@erights
Author

erights commented Aug 29, 2019

This misunderstanding --- which I shared --- caused https://github.com/Agoric/SES/issues/147

We should avoid over-simplistic statements that lead people to write buggy programs.

@bakkot
Contributor

bakkot commented Aug 29, 2019

I wonder how often code actually uses "__proto__": rather than __proto__:.

Out of historical interest, is there a reason this syntax was not restricted to sloppy mode? At least for the quoted case?

@erights
Author

erights commented Aug 29, 2019

I don't recall ever considering making a strict / sloppy distinction here.

Attn @allenwb Do you remember?

@allenwb
Member

allenwb commented Aug 29, 2019

Out of historical interest, is there a reason this syntax was not restricted to sloppy mode?

Because TC39 decided that __proto__ in an object literal was the de facto standard for declaratively setting its [[Prototype]] and there is no point in ever inventing some other syntax. This applies to both strict and sloppy mode.

We potentially could have excluded recognizing the string literal "__proto__" as special. But I don't think anybody brought it up at the time as a JSON subset issue. We did exclude computed property names.
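
A small sketch of which spellings are treated specially in an object literal: the unquoted and string-literal forms set the prototype, while a computed property name does not. `base` and `describe` are hypothetical names chosen for the illustration.

```javascript
// A hypothetical prototype object used only for this illustration.
const base = { describe() { return "from base"; } };

const unquoted = { __proto__: base };     // sets [[Prototype]]
const quoted = { "__proto__": base };     // also sets [[Prototype]]
const computed = { ["__proto__"]: base }; // creates an own property instead

console.log(Object.getPrototypeOf(unquoted) === base);              // true
console.log(Object.getPrototypeOf(quoted) === base);                // true
console.log(Object.getPrototypeOf(computed) === Object.prototype);  // true
console.log(Object.keys(computed));                                 // [ '__proto__' ]
```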

@ljharb
Member

ljharb commented Aug 29, 2019

Is there any chance it would be web compatible to exclude the string literal form now, even if just in strict mode?

@devsnek
Member

devsnek commented Aug 29, 2019

given that the consistent quotes eslint rule exists, i'd assume it will break at least a few sites.

@claudepache
Contributor

FTR, the incorrect assumption that JSON is a subset of JS has led to an implementation bug in at least one JS engine: https://bugzilla.mozilla.org/show_bug.cgi?id=1337564

@erights
Author

erights commented Aug 30, 2019

given that the consistent quotes eslint rule exists, i'd assume it will break at least a few sites.

Does the consistent quotes rule complain about

{__proto__: ...}

?

@kumavis

kumavis commented Aug 30, 2019

@erights I believe this is the rule; the page includes examples: https://eslint.org/docs/rules/quote-props#consistent

@gibson042
Contributor

proposal-json-superset was careful to describe the post-change language as a syntactic superset of JSON for precisely this reason. The semantic discrepancy was explicitly mentioned and survives in the definition of JSON.parse as "extended PropertyDefinitionEvaluation semantics defined in B.3.1 must not be used".

What would make that more clear?

@claudepache
Contributor

What would make that more clear?

The problem does not come from what is said, but from what is not said.

There is a note in the spec (precisely at the end of this section) stating that “valid JSON text” is a subset of ES PrimaryExpression syntax. It should be added that nevertheless, such a text, when interpreted by JSON.parse() or when interpreted as ES PrimaryExpression, does not always produce the same value.

@allenwb
Member

allenwb commented Aug 31, 2019

Let me take my crack at explaining this:

JSON is a syntax for data interchange. See the title of its standard. That syntax is indeed (now) a small subset of ES PrimaryExpression.

JSON has no intrinsic semantics. Various semantics can be imposed upon the JSON syntax by readers and writers of a valid "JSON text".

JSON.parse is one such reader that defines a specific semantics. That semantics is just one of an open set of possible JSON semantics. The JSON.parse semantics interprets a JSON text as a description of a data structure composed of JavaScript objects and primitive values. This semantics is similar to, but in various ways quite different from (don't forget the reviver), the semantics applied by the JavaScript eval function (or a JavaScript compiler/runtime) to a textually identical ES PrimaryExpression.
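
To illustrate the reviver point, here is a hedged sketch in which the same text yields different values through `JSON.parse` with a reviver and through `eval`. The `opened` key and the date-reviving rule are assumptions made up for this example.

```javascript
const dateText = '{"opened": "2019-08-29T00:00:00.000Z"}';

// JSON.parse with a reviver: the "opened" string is turned into a Date.
const revived = JSON.parse(dateText, (key, value) =>
  key === "opened" ? new Date(value) : value
);

// Evaluated as a parenthesized JavaScript expression: it stays a string.
const evaluated = (0, eval)("(" + dateText + ")");

console.log(revived.opened instanceof Date); // true
console.log(typeof evaluated.opened);        // string
```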

JavaScript programs are not required to use JSON.parse and JSON.stringify to process JSON texts. They can always supply their own readers/writers that impose a different semantics upon the JSON syntax. That semantics might be a minor variation of the JSON.parse / JSON.stringify semantics or it might be completely different. Similarly, future editions of ECMAScript standards might choose to specify additional JSON processors that impose a different semantics.
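
As one possible illustration of a reader with a different semantics, the sketch below maps JSON objects to `Map` instances, so that "__proto__" is just another key. `readAsMaps` is a hypothetical helper, and building it on top of the JSON.parse reviver is only a convenience for the example; a fully independent reader would parse the text itself.

```javascript
// Hypothetical reader: JSON objects become Maps, arrays and primitives stay as-is.
function readAsMaps(text) {
  return JSON.parse(text, (key, value) => {
    const isPlainObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    // Object.entries sees "__proto__" because JSON.parse made it an own property.
    return isPlainObject ? new Map(Object.entries(value)) : value;
  });
}

const asMaps = readAsMaps('{"__proto__": [], "nested": {"x": 1}}');

console.log(asMaps instanceof Map);            // true
console.log(asMaps.has("__proto__"));          // true
console.log(asMaps.get("nested").get("x"));    // 1
```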

JSON is not "a subset of JavaScript". Its semantics is not (uniquely) defined by JSON.parse and JSON.stringify. JSON is a data interchange syntax, and meaningful use requires mutual agreement between a producer and consumer on a particular semantics.

@erights
Author

erights commented Aug 31, 2019

@allenwb, while everything you say is true, we still have the hazard that I and others fell into, by remembering something simpler that is almost true. I still think we need the clarifying text that @claudepache proposes.

@allenwb
Member

allenwb commented Aug 31, 2019

@claudepache It should probably also be noted that the use of ES parse and evaluation semantics in steps 3 and 4 of JSON.parse is just a hack that I came up with to avoid having to write the specification of a complete JSON reader semantics. It was convenient to piggy-back on the evaluation mechanisms that were already in the spec.

JSON.parse could have been specified more verbosely in a completely different manner that would not have required any mention of Annex B.3.1.

@erights If you want to clarify it in the spec I suggest incorporating the text I just wrote above as an informative note. I agree that experience suggests that people have to be continually whacked on the head to educate/remind them that JSON.parse/JSON.stringify is not the JSON semantics.

@bergus

bergus commented Sep 5, 2019

@allenwb

TC39 decided that __proto__ in an object literal was the de facto standard for declaratively setting its [[Prototype]] and there is no point in ever inventing some other syntax.

Could you share a link to the meeting notes containing that decision, if you can find it, please?
FWIW, I can still remember the prototype operator that would have allowed setting the [[Prototype]] of any other literal, including functions, arrays or regular expressions, not just objects :-)

@littledan
Member

It sounds like we're looking at a communication problem, rather than any possible change to syntax or semantics of JavaScript or JSON. I wonder if we could work with devrel and educator-type folks to avoid these sorts of misunderstandings. cc @bkardell @mathiasbynens

@mathiasbynens
Member

The nuance that JSON is only a syntactic subset of ECMAScript is important. As @gibson042 says, we took special care to mention this explicitly in the proposal's README and in V8's developer-facing documentation.

We could still add the note that @claudepache mentioned in #1681 (comment) to the spec.

@ra1u

ra1u commented Nov 18, 2019

How can we get a parser into the standard that conforms to the JSON spec, in the sense of being able to represent any JSON object?

Is there a reason to fix the standard instead of the implementation?

@claudepache
Contributor

@ra1u

Is there a reason to fix the standard instead of the implementation?

The primary (and maybe only) reason is backward compatibility. Implementations are not willing to remove the special handling of the __proto__ pseudo-key in object literals, because otherwise they’ll break the web.

@ljharb
Member

ljharb commented Jan 3, 2020

Seems like to close this, we need a PR to add the note mentioned here: #1681 (comment)

@AFatNiBBa

JSON.stringify could just wrap the "__proto__" in square brackets:

const a = { ["__proto__"]: 1 };

This will set an actual own __proto__ key on the object.

@ra1u

ra1u commented Oct 23, 2021

Because of this, JSON will forever only approximate a subset of JavaScript.

Other gotcha:

JSON allows any finite decimal number to be represented, but JS does not, because the JS standard represents numbers with bounded precision. A large set of numbers that is representable in JSON is therefore not representable in JavaScript through the standard JSON parser.
I don't see a better way to support JSON numbers than writing one's own valid JSON parser, as the current hooks do not allow for such manipulation.
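
A short sketch of the precision loss described here, using three JSON number texts that are valid JSON but do not survive `JSON.parse` unchanged. The particular inputs are chosen only for illustration.

```javascript
// 2^53 + 1 is valid JSON, but it is not representable as a Number.
const beyondSafeInteger = JSON.parse("9007199254740993");
console.log(beyondSafeInteger === 9007199254740992); // true

// Extra decimal digits are silently rounded to the nearest double.
const extraDigits = JSON.parse("0.1000000000000000000001");
console.log(extraDigits === 0.1); // true

// A finite JSON number outside the double range becomes Infinity.
const tooLarge = JSON.parse("1e400");
console.log(tooLarge); // Infinity
```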

@bakkot
Contributor

bakkot commented Oct 24, 2021

JSON.stringify could just wrap the "__proto__" in square brackets:

That isn't legal JSON, so no, it can't.
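
A minimal check of this point: the bracketed form is accepted as a JavaScript object literal but rejected by `JSON.parse`, while an own "__proto__" property already round-trips through JSON.stringify and JSON.parse. The variable names are illustrative.

```javascript
const computedKeySource = '{ ["__proto__"]: 1 }';

// Valid as a (parenthesized) JavaScript expression: creates an own property.
const asJavaScript = (0, eval)("(" + computedKeySource + ")");

// Not valid JSON: JSON.parse throws a SyntaxError.
let jsonError = null;
try {
  JSON.parse(computedKeySource);
} catch (error) {
  jsonError = error;
}
console.log(jsonError !== null && jsonError.name === "SyntaxError"); // true

// An own "__proto__" property is serialized with an ordinary quoted key.
const serialized = JSON.stringify(asJavaScript);
console.log(serialized); // {"__proto__":1}

const roundTripped = JSON.parse(serialized);
console.log(Object.keys(roundTripped)); // [ '__proto__' ]
```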

I don't see a better way to support JSON numbers than writing one's own valid JSON parser, as the current hooks do not allow for such manipulation.

You may be interested in this proposal, which addresses precisely that. (Though it will still be true that there are numbers representable in JSON but not as either Numbers or BigInts in JavaScript.)
