Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parsing noscript content in script-enabled mode #3231

Merged

Conversation

stevecheckoway
Copy link
Contributor

This change adds support for parsing noscript content with the scripting flag enabled. In this case, Gumbo will treat the content of the noscript element as raw text, exactly as the browser would.

What problem is this PR intended to solve?

Closes #3178

Have you included adequate test coverage?

Yes.

Does this change affect the behavior of either the C or the Java implementations?

It adds support for this parsing option to libgumbo only.

This change adds support for parsing noscript content with the scripting
flag enabled. In this case, Gumbo will treat the content of the noscript
element as raw text, exactly as the browser would.
@stevecheckoway
Copy link
Contributor Author

Suggestions on a better option name are especially welcome.

I didn't want to use something like scripting_enabled since that strikes me as misleading.

@stevecheckoway
Copy link
Contributor Author

I just realized that I haven't had the html5lib tests running so this is really not ready for review yet. Gotta figure out what I did wrong here first.

@stevecheckoway stevecheckoway marked this pull request as draft June 12, 2024 19:42
@stevecheckoway stevecheckoway marked this pull request as ready for review June 13, 2024 02:36
@stevecheckoway
Copy link
Contributor Author

It seems that we're now passing the scripting enabled html5lib tree construction tests.

test/html5/test_api.rb Show resolved Hide resolved
test/html5/test_api.rb Outdated Show resolved Hide resolved
test/html5/test_api.rb Show resolved Hide resolved
test/html5/test_tree_construction.rb Show resolved Hide resolved
@flavorjones
Copy link
Member

This PR should also update the docstring in lib/nokogiri/html5.rb.

Let me know if you'd like me to put in any of this work? I'm happy to do as much or as little as you like. ♥

@flavorjones
Copy link
Member

Suggestions on a better option name are especially welcome

The name parse_noscript_as_text is not awful, and does what it says. If I was going to suggest anything, it would be a vaguer description like emulate_browser or pretend_scripting_is_enabled.

Or if we want to be funny, wwtbd ("What Would The Browser Do?")

@stevecheckoway
Copy link
Contributor Author

This PR should also update the docstring in lib/nokogiri/html5.rb.

I did that as well. But I left the existing option name. I'll defer to your experience in API design.

@flavorjones flavorjones force-pushed the script-enabled-noscript branch from 46c2001 to e3f326a Compare June 20, 2024 20:39
Copy link
Member

@flavorjones flavorjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I added a commit to fix up some of the rdoc formatting (which is famously challenging to write correctly without looking at the generated docs in a browser) and a CHANGELOG entry.

@flavorjones flavorjones enabled auto-merge June 20, 2024 20:41
@flavorjones
Copy link
Member

The changelog entry isn't worded perfectly ... I'll fix it in postproduction

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Option to Treat noscript Tags as Script Enabled
2 participants