-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Config
struct that holds parser configuration and implement #513
#677
Conversation
Old tests for that option that comes from xml-rs are removed
Old tests for that option that comes from xml-rs are removed
Old tests for that option that comes from xml-rs are removed failures (ignored): trim_text_end::true_ Failure is ignored for now because it is hard to fix it in current implementation of a parser, but a new implementation coming soon, where that will be easy
failures: check_end_names::true_::mismatched_tags
Fixed: check_end_names::true_::mismatched_tags failures: dashes_in_comments
`dashes_in_comment` test repeat the `check_comments::true_` tests, so removed
… with access via `.config_mut()` accessor
text_trim_start and text_trim_end have retained their logical order
Codecov Report
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. @@ Coverage Diff @@
## master #677 +/- ##
==========================================
+ Coverage 65.05% 65.16% +0.11%
==========================================
Files 38 38
Lines 17837 17851 +14
==========================================
+ Hits 11604 11633 +29
+ Misses 6233 6218 -15
Flags with carried forward coverage won't be shown. Click here to find out more.
|
|
||
/// Self-closed elements should be reported as one `Empty` event | ||
#[test] | ||
fn false_() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps "enabled" and "disabled" would be better names, to avoid keyword clashes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose those words because if we will have other options that are not booleans, tests would be named as a config value
); | ||
assert_eq!( | ||
reader.read_event().unwrap(), | ||
Event::Comment(BytesText::new(" comment \t\r\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any reason to support trimming the text values of "comments"? I cannot immediately think of a reason to do that, but perhaps one exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not, at least nobody request such feature. If such a request appears, we can add a separate option.
Generally speaking, I would delete the current trim options as they simply do not work correctly for text alternating with CDATA / comments / processing instructions, but I suppose that would break many users. I was thinking about renaming current Event
into RawEvent
and DeEvent
to Event
and give users stream of Event
s. The RawEvent
then would be a low-level event which usually not needed by most users. That is very raw thoughts currently, so I decided to not do revolutional changes for now.
tests/reader-config.rs
Outdated
#[test] | ||
fn check_end_names_false() { | ||
let mut reader = Reader::from_str("<root></root \t\r\n>"); | ||
reader.trim_markup_names_in_closing_tags(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is it the reason that we only have this for closing tags, attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, that this is optimization option. Usually end tags does not contain spaces before >
, so if we will assume that the name ends immediately before the >
, we could save some time. Such optimisation make sense only for the closing tags -- for opening tags we in any case should check if it has attributes and find the actual end of tag name.
@@ -11,153 +11,189 @@ use crate::reader::state::ReaderState; | |||
|
|||
use memchr; | |||
|
|||
macro_rules! configure_methods { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate being able to ditch these macros.
fuzz/fuzz_targets/fuzz_target_1.rs
Outdated
let config = reader.config_mut(); | ||
config.expand_empty_elements = true; | ||
config.trim_text(true); | ||
config.trim_text_end = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I slightly dislike the aesthetics here and inability to chain (because of converting methods to members), but ultimately it's an incredibly minor detail
This PR changes a way how reader is configured. Instead of having builder methods in
Reader
/NsReader
now they provide aconfig()
/config_mut()
methods:config()
method introduces the ability to read parser configuration, that was not be possible until now;config_mut()
method allow to set parser options. You can even to set all options at once or store them (theConfig
struct implementsSerialize
/Deserialize
whenserde-types
feature is active)To ensure, that the behavior was not changed, this PR also introduces a new integration test-suite which tests all possible parser options. That allowed me to find a bug in
trim_text_end
option, but it is hard to fix it today. Because I plan to rewrite parser (that task is actually mostly done) I justignore
that test for now.As a nice bonus, after implementing new tests it become obvious how to implement #513, which was done.
Closes #513