Tags out of order in returned list when using css to specify multiple tags #104
Comments
Yeah, that's indeed unexpected behavior. I will take a closer look this week. @lexborisov is there a way to fix this? It looks like both modest and lexbor are affected.
It would be unfair for me to take all the credit for this library, since most of the hard work is done by @lexborisov. @lexborisov do you accept donations?
Yeah, it's my fault.
I seriously hadn't considered accepting donations. It doesn't seem to make sense. Not that many people will be donating.
Sorry, I remember this challenge. A lot of things to do at my day job. I hope to solve it soon.
Two factors led to a complete rewrite of the algorithm for searching nodes by selectors:

1. The order in which the nodes were found. Previously, the algorithm found nodes by selectors in its own defined order. This did not match the specification or the behavior of modern browsers. For example, the correct order:
   HTML: <div><p class="x"></p><p id="y">"abc"</p></div>
   Selectors: .x, div
   Result: div, p
   Previous result: p, div
   This relates to issue rushter/selectolax#104 on GitHub.

2. A limitation on the nesting of pseudo-class functions. The specification does not limit the nesting of pseudo functions in any way. For example:
   Selectors: :not(:not(:not(:not(:not( <and 4k times :not()> )))))
   Previously, all nested pseudo function calls were made on the stack. This caused a stack overflow when the nesting was deep. Now no stack is used for nested functions, and there is no recursion either. This makes the code safer.

Also, options have been added to change the search behavior. New tests and a fuzzer have been added.
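To make the ordering difference concrete, here is a minimal sketch that reproduces the example above using only Python's standard library. It is not lexbor's algorithm: `ElementCollector`, `matches`, `select_document_order`, and `select_selector_order` are hypothetical helpers written for this illustration, and the matcher only understands bare tag, `.class`, and `#id` selectors.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Record (tag, attrs) for every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


def matches(element, selector):
    """Match one simple selector: tag, .class, or #id (nothing else)."""
    tag, attrs = element
    if selector.startswith("."):
        return selector[1:] in (attrs.get("class") or "").split()
    if selector.startswith("#"):
        return attrs.get("id") == selector[1:]
    return tag == selector


def select_document_order(elements, selector_list):
    """Walk the document once; keep an element if any selector matches."""
    selectors = [s.strip() for s in selector_list.split(",")]
    return [e[0] for e in elements if any(matches(e, s) for s in selectors)]


def select_selector_order(elements, selector_list):
    """Run each selector in turn and concatenate (the unexpected order)."""
    selectors = [s.strip() for s in selector_list.split(",")]
    return [e[0] for s in selectors for e in elements if matches(e, s)]


collector = ElementCollector()
collector.feed('<div><p class="x"></p><p id="y">"abc"</p></div>')

document_order = select_document_order(collector.elements, ".x, div")
selector_order = select_selector_order(collector.elements, ".x, div")
print(document_order)  # ['div', 'p']
print(selector_order)  # ['p', 'div']
```

The single-pass version yields each element at most once and in the order it appears in the markup, which is the behavior the comment describes as matching browsers. The per-selector version groups results by selector and could also return the same element twice if it matched more than one selector in the list.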
Sorry, it took time and a complete rewrite of the algorithm.
I've deployed a new release with an updated lexbor backend.
When using css selection, I want to grab two different tags (p and h3). When I use the selector like this:
html.css("p,h3")
It selects the appropriate tags but the list gives all p tags first and the h3 tag last.
Example:
I would expect the returned list to give:
[<node p>, <node h3>, <node p>]
Instead it returns:
[<node p>, <node p>, <node h3>]
However, if I use
html.css("*")
it does return them in the correct order, but I have to loop through and throw out all the unneeded nodes. If this is indeed a bug, I'd give it a low priority, since using css("*") is an alternative where I can simply loop through and only grab what I'm interested in. I just wasn't sure if this was a bug or expected behavior.
If this is expected behavior when selecting multiple css elements, is there a way to get them in the order they appear in the parent (similar to "*" as the CSS selector)?
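The css("*") workaround described above could be sketched roughly like this. It assumes each returned node exposes a `tag` attribute, as selectolax nodes do; `FakeNode` and `keep_tags` are hypothetical names used only so the snippet runs without the library installed.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Stand-in for a parsed node; only the `tag` attribute matters here."""
    tag: str


def keep_tags(nodes, wanted):
    """Return the nodes whose tag is in `wanted`, preserving input order."""
    wanted = set(wanted)
    return [node for node in nodes if node.tag in wanted]


# Pretend this list is what html.css("*") returned for a small document.
all_nodes = [
    FakeNode("html"),
    FakeNode("body"),
    FakeNode("p"),
    FakeNode("h3"),
    FakeNode("p"),
]

ordered_tags = [node.tag for node in keep_tags(all_nodes, {"p", "h3"})]
print(ordered_tags)  # ['p', 'h3', 'p']
```

With the real library the call would presumably look like keep_tags(html.css("*"), {"p", "h3"}), where `html` is the parsed tree from the example above. This only filters by tag name, so it is not a substitute for full selector matching.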
Also, please provide Patreon or Bitcoin wallet if possible so I can contribute for your time. Thank you for creating such an amazing tool. I use this often since it is lightweight, efficient and easy to use.