Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serializing unparented HTML node on JRuby #2895

Merged

Conversation

cbasguti
Copy link
Contributor

@cbasguti cbasguti commented May 29, 2023

What problem is this PR intended to solve?

Closes #2559

This PR aims to solve issue #2559 in the Nokogiri project. After investigating the reported bug, it was identified that a NullPointerException occurs in the isHtmlScript and isHtmlStyle functions due to the inability to evaluate a null object in Java. The proposed solution involves adding a null check for the getParentNode() method's return value and logging an appropriate message if the parent node is null or does not match the expected tag name. A draft PR will be opened to implement these modifications and gather feedback for effectiveness.

Have you included adequate test coverage?

As this is my first contribution to the project, I am uncertain about the appropriate location to include the corresponding test. However, I am committed to ensuring adequate test coverage for this change and will collaborate with the project team to determine the best approach.

Does this change affect the behavior of either the C or the Java implementations?

This change only affects the Java implementation, as the issue of a NullPointerException is specific to that platform. The C implementation does not encounter this problem.

@flavorjones
Copy link
Member

@cbasguti Thank you for opening this PR! I've kicked off CI, I'd like to see what the test suite says before reviewing.

Copy link
Member

@flavorjones flavorjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes I'd like to request:

  1. Can you please add a test for this use case? A good place for this test might be test/xml/test_node.rb right above the describe "#wrap" block.
  2. Can you please follow the style conventions? If you have the astyle utility installed you can automatically do this by running bundle exec rake format
  3. There's no need to emit messages when this happens. We just want to make sure the end result matches what happens in CRuby (which is why the test is important).

If you need help writing the test, a minimal test that would be a good starting point might be:

assert_equal("asdf", Nokogiri::HTML4::Document.parse("<div></div>").create_text_node("asdf").to_s)

@flavorjones flavorjones marked this pull request as ready for review May 30, 2023 15:34
@flavorjones
Copy link
Member

Also, if you didn't see it, I tried to provide some help for initial contributions here: https://nokogiri.org/CONTRIBUTING.html#submitting-pull-requests

Thank you again!

@cbasguti cbasguti marked this pull request as draft June 1, 2023 15:02
@cbasguti
Copy link
Contributor Author

cbasguti commented Jun 2, 2023

Ok, I took another look at the contribution documentation page, included a test following your instructions, and also formatted it as needed. Thank you very much for your patience and I look forward to your comments.

@cbasguti cbasguti marked this pull request as ready for review June 2, 2023 11:43
@flavorjones
Copy link
Member

Don't worry about the one failing test "ci / bsd", it's flaky. This looks all green, will review over the weekend!

Copy link
Member

@flavorjones flavorjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at how these (private) functions are used, I think the missing abstraction is "isCDATA", which should return true if the parent node name is "script" or "style".

Can you merge these two functions into one, isCDATA()? I think you can probably have it return a boolean value something like

return htmlDoc && parentNode != null && (parentNode.getNodeName().equals("style") || parentNode.getNodeName().equals("script"))

ext/java/nokogiri/internals/SaveContextVisitor.java Outdated Show resolved Hide resolved
ext/java/nokogiri/internals/SaveContextVisitor.java Outdated Show resolved Hide resolved
@flavorjones
Copy link
Member

Looks good! I think this will go green, and I'll merge it. I've credited you in the CHANGELOG!

Thank you so much for your contribution!

@flavorjones flavorjones merged commit 76c891c into sparklemotion:main Jun 5, 2023
@flavorjones flavorjones mentioned this pull request Jul 5, 2023
7 tasks
@flavorjones flavorjones added this to the v1.15.x patch releases milestone Jul 5, 2023
@flavorjones
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] jruby NPE when serializing an unparented HTML node
2 participants