[bug] xinclude error message is shadowed by missing DTD warning message. #2562
Comments
@jarl-dk Thanks for opening this issue. I'll take a deeper look shortly.
OK, digging in, it turns out that the libxml2 function
Note that there's little to distinguish the "warnings" from the "error" that was the cause of the failure, so the code chooses the last error under the assumption that a fatal error would be the final recorded error. In this case, that assumption is busted. So the question really becomes: if we raise one and only one exception, which error do we choose? We could have some more complicated logic to look for an error with the word "ERROR" in it, but I don't like depending on unstructured data for this. I'm going to think about it a bit.
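To make the "which error do we choose?" question concrete, here is a minimal sketch in plain Ruby. It is not Nokogiri's code: `RecordedError`, `pick_last`, and `pick_most_severe` are hypothetical names, the message strings are placeholders, and the numeric levels are assumed to follow libxml2's `xmlErrorLevel` ordering (warning = 1, error = 2, fatal = 3). It contrasts the "last recorded error wins" heuristic with a selection based on a structured severity field rather than on matching the word "ERROR" in the text.

```ruby
# Hypothetical stand-in for one recorded libxml2 diagnostic (not a Nokogiri class).
RecordedError = Struct.new(:level, :message)

# Assumed severity values, following libxml2's xmlErrorLevel ordering.
LEVEL_WARNING = 1
LEVEL_ERROR   = 2
LEVEL_FATAL   = 3

# The heuristic described above: assume the failure is the final recorded entry.
def pick_last(recorded)
  recorded.last
end

# An alternative: choose by the structured severity field instead of position.
def pick_most_severe(recorded)
  recorded.max_by(&:level)
end

# Placeholder diagnostics in the order described in this issue:
# a warning, then the real error, then a second warning.
recorded = [
  RecordedError.new(LEVEL_WARNING, "text of warning message"),
  RecordedError.new(LEVEL_ERROR,   "text of error message"),
  RecordedError.new(LEVEL_WARNING, "text of second warning message"),
]

puts pick_last(recorded).message         # the trailing warning shadows the error
puts pick_most_severe(recorded).message  # the actual error
```

Picking by severity still reports only one diagnostic, so it would not show the user the surrounding warnings.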
I think perhaps this is a sign that we should be using the
As a user, what would you expect Nokogiri to do in this case? If we raised an exception that included the text of all errors and warnings, would that be helpful and/or sufficient? That would be straightforward to implement and gives the full context to the user ...
In places (like document parsing) where it's possible for libxml2 to emit multiple warnings and errors, aggregate these libxml2 errors into a single exception so users can see all the problems.

Previously, we were grabbing the most recent "error" which might just be a warning and not the fatal error preventing parsing from completing. This was misleading and hid the source of the real problem.

Now, if there are multiple errors, they're aggregated into a single Nokogiri::XML::SyntaxError with a message like:

Multiple errors encountered:
- WARNING: text of warning message
- ERROR: text of error message
- WARNING: text of second warning message

Note that I've also renamed some internal C functions to try to incrementally get consistent with naming.

Closes #2562
See #3257 for what I think is an adequate cosmetic improvement in how errors are handled in these cases.
**What problem is this PR intended to solve?**

In places (like document parsing) where it's possible for libxml2 to emit multiple warnings and errors, aggregate these libxml2 errors into a single exception so users can see all the problems.

Previously, we were grabbing the most recent "error" which might just be a warning and not the fatal error preventing parsing from completing. This was misleading and hid the source of the real problem.

Now, if there are multiple errors, they're aggregated into a single Nokogiri::XML::SyntaxError with a message like:

Multiple errors encountered:
WARNING: text of warning message
ERROR: text of error message
WARNING: text of second warning message

Note that I've also renamed some internal C functions to try to incrementally get consistent with naming.

Closes #2562

**Have you included adequate test coverage?**

Yes.

**Does this change affect the behavior of either the C or the Java implementations?**

It's a cosmetic improvement on the behavior of the CRuby XML and HTML4 parser. I have not made this improvement to the JRuby implementation, but it should be easy enough to add if someone wishes to do the work.
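The aggregated message format described in the PR can be sketched in plain Ruby as follows. This is an illustration only, not the C implementation in Nokogiri: `LoggedDiagnostic`, `LEVEL_LABELS`, and `aggregate_message` are hypothetical names, the numeric levels are assumed to follow libxml2's `xmlErrorLevel` ordering, and returning a lone diagnostic's message unchanged is an assumption about the single-error case.

```ruby
# Hypothetical stand-in for one recorded libxml2 diagnostic (not a Nokogiri class).
LoggedDiagnostic = Struct.new(:level, :message)

# Assumed mapping from libxml2-style severity values to labels.
LEVEL_LABELS = { 1 => "WARNING", 2 => "ERROR", 3 => "FATAL" }.freeze

# Build one exception message from every recorded diagnostic, in recorded order.
def aggregate_message(diagnostics)
  return nil if diagnostics.empty?
  # Assumption: a single diagnostic keeps its own message unchanged.
  return diagnostics.first.message if diagnostics.length == 1

  lines = diagnostics.map do |d|
    "#{LEVEL_LABELS.fetch(d.level, "UNKNOWN")}: #{d.message}"
  end
  (["Multiple errors encountered:"] + lines).join("\n")
end

diagnostics = [
  LoggedDiagnostic.new(1, "text of warning message"),
  LoggedDiagnostic.new(2, "text of error message"),
  LoggedDiagnostic.new(1, "text of second warning message"),
]

puts aggregate_message(diagnostics)
```

Keeping the diagnostics in recorded order preserves the sequence in which libxml2 reported them, so a reader can see that the error occurred between the two warnings.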
High level description
When using `xinclude` on a non-existing file followed by an `xinclude` with a DOCTYPE declaration (from a non-existing DTD file), the warning text of the missing DTD file ends up in the error text for the `xinclude` of the non-existing file.
Reproduce the problem
Create a file named `with_doctype_missing_files.xml` with the following content:
Create a file named `xinclude_doctype_sub_1.xml` with the following content:
Then run
It will produce something like this:
The exit code is as expected. However, the error message is wrong: it is the warning message from the missing DTD file that turns up in the error message for the missing xinclude file. Interestingly, it is the warning caused by the DOCTYPE declaration in `xinclude_doctype_sub_1.xml`, appearing AFTER the `xinclude` of the missing file (that is the actual error), not the warning caused by the DOCTYPE declaration in `with_doctype_missing_files.xml`.
Expected behavior
Expected error message:
Environment
Additional context
I believe it is related to this one: #1610