Commit

Handle escaped characters in consumeSubQuery
Fixes #2146
jhy committed Jul 8, 2024
1 parent 970403c commit a0537c7
Showing 3 changed files with 29 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGES.md
@@ -40,6 +40,7 @@
   e.g.: `h1:has(+h2)`). [2137](https://github.com/jhy/jsoup/issues/2137)
 * The `:empty` selector incorrectly matched elements that started with a blank text node and were followed by
   non-empty nodes, due to an incorrect short-circuit. [2130](https://github.com/jhy/jsoup/issues/2130)
+* `Element.cssSelector()` would fail with "Did not find balanced marker" when building a selector for elements that had a `(` or `[` in their class names. And selectors with those characters escaped would not match as expected. [2146](https://github.com/jhy/jsoup/issues/2146)
 * Fuzz: a Stack Overflow exception could occur when resolving a crafted `<base href>` URL, in the normalizing regex.
   [2165](https://github.com/jhy/jsoup/issues/2165)

5 changes: 4 additions & 1 deletion src/main/java/org/jsoup/select/QueryParser.java
@@ -158,7 +158,10 @@ private String consumeSubQuery() {
                 sq.append("(").append(tq.chompBalanced('(', ')')).append(")");
             else if (tq.matches("["))
                 sq.append("[").append(tq.chompBalanced('[', ']')).append("]");
-            else
+            else if (tq.matches("\\")) { // bounce over escapes
+                sq.append(tq.consume());
+                if (!tq.isEmpty()) sq.append(tq.consume());
+            } else
                 sq.append(tq.consume());
         }
         return StringUtil.releaseBuilder(sq);
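
The effect of the new escape branch can be illustrated outside jsoup. The following is a minimal standalone sketch, not jsoup's actual `TokenQueue` or `QueryParser` code: the class `SubQuerySketch`, the helper `balancedEnd`, the `handleEscapes` flag, and the reduced combinator set are all hypothetical names introduced here for illustration. It assumes only what the hunk shows: `(` and `[` start a balanced chomp, and a backslash now causes the following character to be copied through without inspection.

```java
class SubQuerySketch {
    // Simplified combinator set (assumption; the real parser's rules are richer).
    private static final String COMBINATORS = ",>+~ ";

    /**
     * Returns the leading sub-query of {@code query}, stopping at the first
     * top-level combinator. When {@code handleEscapes} is false, the scan
     * behaves like the pre-fix code and treats an escaped '(' or '[' as the
     * start of a balanced group.
     */
    public static String consumeSubQuery(String query, boolean handleEscapes) {
        StringBuilder sq = new StringBuilder();
        int pos = 0;
        while (pos < query.length()) {
            char c = query.charAt(pos);
            if (COMBINATORS.indexOf(c) >= 0) break;
            if (c == '(' || c == '[') {
                int end = balancedEnd(query, pos, c, c == '(' ? ')' : ']');
                sq.append(query, pos, end + 1); // copy the whole balanced group
                pos = end + 1;
            } else if (handleEscapes && c == '\\') {
                // Bounce over the escape: copy the backslash and the escaped char.
                sq.append(c);
                pos++;
                if (pos < query.length()) sq.append(query.charAt(pos++));
            } else {
                sq.append(c);
                pos++;
            }
        }
        return sq.toString();
    }

    /** Index of the delimiter that closes the group opened at {@code start}. */
    private static int balancedEnd(String s, int start, char open, char close) {
        int depth = 0;
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') { i++; continue; } // an escaped char never opens or closes
            if (c == open) depth++;
            else if (c == close && --depth == 0) return i;
        }
        throw new IllegalArgumentException("Did not find balanced marker at '" + s.substring(start) + "'");
    }
}
```

In this sketch, `consumeSubQuery("div.a\\(foo > p", true)` returns `div.a\(foo`, while the same call with `false` throws, because the escaped `(` is mistaken for an opening parenthesis that is never closed. That mirrors the failure mode described in issue #2146.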
24 changes: 24 additions & 0 deletions src/test/java/org/jsoup/nodes/ElementTest.java
@@ -2612,6 +2612,30 @@ void prettySerializationRoundTrips(Document.OutputSettings settings) {
         assertEquals(element, elements.first());
     }

+    @Test void cssSelectorWithBracket() {
+        // https://github.com/jhy/jsoup/issues/2146
+        Document doc = Jsoup.parse("<div class='a[foo]'>One</div><div class='b[bar]'>Two</div>");
+        Element div = doc.expectFirst("div");
+        String selector = div.cssSelector();
+        assertEquals("html > body > div.a\\[foo\\]", selector); // would fail with "Did not find balanced marker", consumeSubquery was not handling escapes
+
+        Elements selected = doc.select(selector);
+        assertEquals(1, selected.size());
+        assertEquals(selected.first(), div);
+    }
+
+    @Test void cssSelectorUnbalanced() {
+        // https://github.com/jhy/jsoup/issues/2146
+        Document doc = Jsoup.parse("<div class='a(foo'>One</div><div class='a-bar'>Two</div>");
+        Element div = doc.expectFirst("div");
+        String selector = div.cssSelector();
+        assertEquals("html > body > div.a\\(foo", selector);
+
+        Elements selected = doc.select(selector);
+        assertEquals(1, selected.size());
+        assertEquals(selected.first(), div);
+    }
+
     @Test void orphanSiblings() {
         Element el = new Element("div");
         assertEquals(0, el.siblingElements().size());