-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly format normalized path names #113
Conversation
The RFC defines how to [format normalized path names]. Notably, it defines the escape character as `ESC normal-escapable`, but doesn't define `ESC` in that section. It is, however, defined for [name selector syntax] as `%x5C; \ backslash`. I'm therefore reasonably sure that the current behavior of serde_json_path when formatting normalized paths is incorrect, as it results in no escaping of `'` at all. For example, using [the sandbox] to select `$.*` from `{"O'Reilly": true}` returns ```json [ { "loc": "$['O'Reilly']", "node": true } ] ``` I worked this out while building the location feature into [Go jsonpath] (many thanks for the inspiration in your crate!). This [playground link] shows what I believe to be the correct escaping (doubled for inclusion in JSON): ``` json [ { "node": true, "path": "$['O\\'Reilly']" } ] ``` So expand `Display for PathElement` to properly format characters strictly according to the spec. Iterate over each character in a name and escape exactly the characters it defines to be escaped. [format normalized path names]: https://www.rfc-editor.org/rfc/rfc9535#section-2.7 [name selector syntax]: https://www.rfc-editor.org/rfc/rfc9535#name-syntax-3 [the sandbox]: https://serdejsonpath.live [Go jsonpath]: https://github.com/theory/jsonpath [playground link]: https://theory.github.io/jsonpath/?p=%2524.*&j=%257B%2522O%27Reilly%2522%253A%2520true%257D&o=1
d603cee
to
c56a4e5
Compare
Great find @theory and thanks for opening the issue! Another important thing that this made me realize is that I don't think the compliance test suite is providing a set of tests for validating how a library forms normalized paths. I will see if I can find time to sort this out, but for now added some labels to 🎣 if anyone wants to try and fix it. |
This PR should fix it. |
Oh 🤦 I thought this was an issue, and not a PR. I will take a look... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @theory - really appreciate the PR 🎉
The RFC defines how to format normalized path names. Notably, it defines the escape character as
ESC normal-escapable
, but doesn't defineESC
in that section. It is, however, defined for name selector syntax as%x5C; \ backslash
. I'm therefore reasonably sure that the current behavior of serde_json_path when formatting normalized paths is incorrect, as it results in no escaping of'
at all.For example, using the sandbox to select
$.*
from{"O'Reilly": true}
returnsI worked this out while building the location feature into Go jsonpath (many thanks for the inspiration in your crate!). This playground link shows what I believe to be the correct escaping (doubled for inclusion in JSON):
So expand
Display for PathElement
to properly format characters strictly according to the spec. Iterate over each character in a name and escape exactly the characters it defines to be escaped.