explore: alternative CSS selector parsers #2560

Open
flavorjones opened this issue May 31, 2022 · 4 comments
@flavorjones
Member

flavorjones commented May 31, 2022

The CSS selector parser we have is complex, and selector parsing is really a separable concern from Nokogiri proper. It would be nice if we were able to use an existing parser.

(Side note: generating XPath from the CSS is a Nokogiri concern, though, since the generated XPath query is often tightly coupled to the version of libxml or the C extension. Perhaps we can spin this off as a separate gem/concern at some point, but it would need to be pluggable to do Nokogiri-specific XPath things, and I don't feel like that's worth the effort right now.)
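
To make the "pluggable" idea concrete, here is a minimal sketch of an XPath generator that walks a selector AST and exposes one overridable hook. Everything in it is an assumption for illustration: the `Compound` and `Combinator` node shapes, the `XPathGenerator` class, and the `custom:has-class` extension function are hypothetical names, not Nokogiri's actual classes or XPath functions.

```ruby
# Hypothetical AST node shapes, for illustration only.
Compound   = Struct.new(:type, :classes)        # e.g. div.note
Combinator = Struct.new(:kind, :left, :right)   # kind is :descendant or :child

# Walks the AST and emits a plain XPath 1.0 expression.
class XPathGenerator
  def generate(ast)
    "//#{visit(ast)}"
  end

  private

  def visit(node)
    case node
    when Combinator
      axis = node.kind == :child ? "/" : "//"
      "#{visit(node.left)}#{axis}#{visit(node.right)}"
    when Compound
      (node.type || "*") + node.classes.map { |c| "[#{class_predicate(c)}]" }.join
    else
      raise ArgumentError, "unknown node: #{node.inspect}"
    end
  end

  # Hook: subclasses can swap in an engine-specific class test.
  def class_predicate(name)
    "contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')"
  end
end

# A variant that delegates the class test to a (made-up) extension function.
class ExtensionXPathGenerator < XPathGenerator
  private

  def class_predicate(name)
    "custom:has-class(@class, '#{name}')"
  end
end

ast = Combinator.new(:child, Compound.new("div", ["note"]), Compound.new("p", []))
portable  = XPathGenerator.new.generate(ast)
extension = ExtensionXPathGenerator.new.generate(ast)
```

The point of the split is that the AST and the portable generator could live in a standalone gem, while anything tied to a particular libxml version or C extension stays in a subclass.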

Some things to look at that generate ASTs for CSS:

I'd also like to fix some outstanding bugs in the current implementation:

Note, though, that the behavior changes needed to fix these bugs probably justify a 2.0 major release, because they're going to break existing apps.

And then I think we can also introduce some new features:

@flavorjones
Member Author

I looked at Crass a bit yesterday and today, but it doesn't return a fine-grained enough AST for selectors; we'd have to take its tokens and implement some sort of parser on top to make it work.
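
As a rough idea of what "use the tokens and implement some sort of parser" involves, here is a minimal sketch that tokenizes a tiny subset of selectors (type, class, child and descendant combinators) and builds a nested AST. It does not use Crass; the `Token`, `Compound`, and `Combinator` shapes and the `tokenize`/`parse` helpers are hypothetical, and a real parser would need to cover far more of the selector grammar.

```ruby
require "strscan"

# Hypothetical token and AST shapes, for illustration only.
Token      = Struct.new(:type, :value)
Compound   = Struct.new(:type, :classes)
Combinator = Struct.new(:kind, :left, :right)

# Split a selector string into a flat token list.
def tokenize(selector)
  scanner = StringScanner.new(selector.strip)
  tokens = []
  until scanner.eos?
    if scanner.scan(/\s*>\s*/)
      tokens << Token.new(:child, ">")
    elsif scanner.scan(/\s+/)
      tokens << Token.new(:descendant, " ")
    elsif scanner.scan(/\.([A-Za-z_][\w-]*)/)
      tokens << Token.new(:class, scanner[1])
    elsif scanner.scan(/[A-Za-z_][\w-]*|\*/)
      tokens << Token.new(:type, scanner.matched)
    else
      raise ArgumentError, "unexpected input: #{scanner.rest.inspect}"
    end
  end
  tokens
end

# Consume consecutive type/class tokens into one compound selector.
def parse_compound(tokens)
  node = Compound.new(nil, [])
  while (token = tokens.first) && %i[type class].include?(token.type)
    tokens.shift
    if token.type == :type
      raise ArgumentError, "type selector must come first" if node.type || node.classes.any?
      node.type = token.value
    else
      node.classes << token.value
    end
  end
  raise ArgumentError, "expected a compound selector" if node.type.nil? && node.classes.empty?
  node
end

# Build a left-associative tree: compound (combinator compound)*
def parse(tokens)
  tokens = tokens.dup
  left = parse_compound(tokens)
  until tokens.empty?
    combinator = tokens.shift
    left = Combinator.new(combinator.type, left, parse_compound(tokens))
  end
  left
end

ast = parse(tokenize("div.note > p"))
```

Here `ast` is a `Combinator` of kind `:child` whose left side is the compound `div.note` and whose right side is `p`, which is the level of granularity the XPath generation needs.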

Looking at syntax_tree-css, it's incomplete but is definitely a well-formed AST for selectors. I've started kicking the tires and making basic improvements to see how far I can take it.

@flavorjones
Member Author

flavorjones commented Jun 19, 2024

I've got a branch where the hand-written parser work is progressing, in case anybody wants to follow along: https://github.com/sparklemotion/nokogiri/tree/2560-start-custom-css-parser
