-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add reserved namespace bindings. #545
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, some minor notes for maintainability. Also, please add an entry to Changelog.md
and probably one test to unit tests with a whole non-UTF-8 XML to ensure, that our expectations really works
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the GitHub App Integration for your organization. Read more. @@ Coverage Diff @@
## master #545 +/- ##
==========================================
+ Coverage 64.22% 64.44% +0.21%
==========================================
Files 36 36
Lines 17083 17284 +201
==========================================
+ Hits 10972 11138 +166
- Misses 6111 6146 +35
Flags with carried forward coverage won't be shown. Click here to find out more.
|
088f041
to
62c6215
Compare
Okay, so I have declared the predefined namespaces in what I believe is a better way and added some of the docs you asked for. Is this heading in the right direction? By my calculation, the following still needs to be done:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For override tests: check the behavior of other mature XML libraries, for example, Java. Then write the tests in tests/namespaces.rs
. For example, check this XML document:
<root xml:attribute="ok">
<xml:element/>
<override xmlns:xml="overrided non-overradable prefix" xml:attribute="oh!">
<xml:element/>
</override>
<!-- remove standard prefix? -->
<reset xmlns:xml="" xml:attribute="eh?">
<xml:element/>
</reset>
</root>
How other libraries will parse that document? If it will be parsed, what the QName
s will be returned?
It also would be excellent if you add test to tests/encoding.rs
with some non-UTF-8 document and check that we correctly process xml:
and xmlns:
attributes:
- check that we detect and use non-UTF-8 encoding by querying
.decoder()
- check that we element and attribute with undeclared prefix resolved correctly (well, we can even made
XML_NAMESPACE
andXMLNS_NAMESPACE
our public API -- as aNamespace
constants) - at the same time check, that other undeclared prefix will resolved as unknown
Just so we know, the std versions of the oncecell stuff that I used in this PR is being stabilized as we speak. Part of the work landed in 1.70. The lazy initialized types have not yet landed in stable. However, that may happen in one of the next couple releases. My plan is to update this PR when that lands. |
@wt Even once it's merged, I don't know that we want to immediately bump the minimum supported Rust version up to the latest one. If that is the only thing holding this PR up, if it's ready and @Mingun likes it, then I would say just merge it now with You can leave a comment alongside, or perhaps file an issue, as a reminder to do so. |
@wt Do you intend to finish this work? I'm going to draft it for now. |
Can we just use the once_cell crate via this then? This crate will provide backward compatibility until this is stabilized, and it has been around for while. |
Yup that's exactly what I'm suggesting. Use the |
62c6215
to
2e32b59
Compare
Also this should have a changelog entry |
Looks like I'm fine with bumping the MSRV up to 1.60 (released April 2022). Most linux distros will have something newer than that but still a few versions old which is the main reason to not bump all the way to the current release. The MSRV bump should get a changelog entry also. |
Okay, so I hammered out a test xml parser with Xerces. It does in fact map to the well known namespaces to their appropriate URIs the same way this change does. See this for more. This doc: <document
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:noNamespaceSchemaLocation='document.xsd'
xml:id='fjdlksfjds'>
</document> produces the following parsed attrs for the document tag:
Also, the failing tests appear to not be caused by my change. Do you agree with this? I am also trying to figure out how to address your comment about checking the encoding. Can you help me figure out what it is that you want there? |
45a9e57
to
65982bc
Compare
FWIW, Xerces also does not allow overriding the xml namespace to another URI or to no URI. I guess this is the behavior we would want in quick-xml? And I added a changelog. Here's a possible task list to finish this change:
Does that cover everything? |
bc413c6
to
c530f9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just use the once_cell crate via this then? This crate will provide backward compatibility until this is stabilized, and it has been around for while.
There is no need to use hashmap and, as a result, once_cell
, or Lazy
, at all. Just two entries can be just hardcoded in place of use via match
, or you can use &[(&str, &str)]
constant and use linear search -- this will be faster that any map on such sizes and will not require lazy construction and additional dependencies.
And I still prefer to test each well-known prefix individually (one test for prefix, you can write a small macro if you don't like to duplicate code).
Those failures appeared since serde v1.0.181 due to serde-rs/serde#2505:
I've create #630 for investigation. |
The MSRV test is weird. Why is is failing for 1.52 when the min stated version is 1.60 in this change? |
It's hardcoded https://github.com/tafia/quick-xml/blob/master/.github/workflows/rust.yml#L17 But I suppose it's not necessary to change if you go w/ the linear search approach. With only a small number of strings to check, @Mingun is correct that it's not really going to be worth the Hashmap |
I'll rewrite without the HashMap stuff and change the tests to be now explicit. Do y'all agree that the semantics around using the xml and xmlns prefixes should be enforced as they are in Java and Xerces? |
d0d94eb
to
7855788
Compare
5f170cd
to
5d4244f
Compare
rebased |
8ed0c73
to
b03ef36
Compare
Fixed the lint issues for the compilation. I ran the individual tests, but no |
e775af5
to
b2f1be3
Compare
src/errors.rs
Outdated
#[derive(Clone, Debug)] | ||
pub struct ReservedNamespacePrefixError { | ||
pub prefix: String, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason to make separate structs instead of directly embedding the strings in the enum variants? I don't think we use this pattern in many, if any, places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've collapsed them into a single struct. I think improving the errors should probably be left for another change. I hope this addresses your concern.
6a7a543
to
1c7cf98
Compare
Okay, I think the latest version is ready. I've added tests for adding namespace contexts to the resolver. I added these specific tests:
I believe that this is ready to a review. @Mingun would you mind taking a look? I would love to get this landed soon. |
1c7cf98
to
9887b68
Compare
What needs to be done to close out this change? |
Don't worry, I don't forgot about this. I was on vacation so tracked issues very lazily. I'll look during this week. |
Thanks for working on this PR and investigating all stuff about handling this in other libraries! I'm not finishing my review yet. Instead of giving feedback in the form of comments, I decided that it would be easier to implement my vision myself. Changes I have broken down into several commits so you can see the reason for them. I structurized tests and added missing parts (what became apparent after structuring). I also simplified checks and now, I think, it is more easily to understand the code. I plan to finish review next week. |
I'm glad you were able to find some improvement. I'm not picky as long as the functionality is there. When do you think this will land in the main branch? |
I think this week |
a76d401
to
2643ebf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think, it is done. I'll merge it next few days if there will be no objections
No need to store it outside of resolver, because its lifetime is the same as resolver lifetime
This adds xml and xmlns namespace bindings. These are defined at https://www.w3.org/TR/xml-names11/#xmlReserved.
2643ebf
to
c05fda1
Compare
This adds xml and xmlns namespace bindings. These are defined at https://www.w3.org/TR/xml-names11/#xmlReserved.
This is a a start of the work needed for #464