Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tag RegEx #3025

Closed
wants to merge 4 commits into from
Closed

Fix tag RegEx #3025

wants to merge 4 commits into from

Conversation

MicaelJarniac
Copy link

@MicaelJarniac MicaelJarniac commented Jul 5, 2023

Type of changes

  • Bug fix
  • New feature
  • Documentation / docstrings
  • Tests
  • Other

Checklist

  • I've run the latest black with default args on new code.
  • I've updated CHANGELOG.md and CONTRIBUTORS.md where appropriate.
  • I've added tests for new code.
  • I accept that @willmcgugan may be pedantic in the code review.

Description

Closes #3024

Updated regex pattern to improve color formatting. The tag_contents group was updated to exclude < character ([^<]), preventing the matching of tags that contain other tags inside. Additionally, a *? was used for a non-greedy match, making the group match as little as possible, thus avoiding overreaching matches.

Closes Textualize#3024

Updated regex pattern to improve color formatting. The `tag_contents` group was updated to exclude `<` character (`[^<]`), preventing the matching of tags that contain other tags inside. Additionally, a `*?` was used for a non-greedy match, making the group match as little as possible, thus avoiding overreaching matches.
Added myself.
@MicaelJarniac
Copy link
Author

I'll be trying to write tests that check if this fix is working.

@MicaelJarniac
Copy link
Author

Nevermind. I could not find where this kind of test is.

@MicaelJarniac
Copy link
Author

Before

image

Non-greedy

image

Final

image

@MicaelJarniac
Copy link
Author

Apparently, it's still buggy, only less so.
image

@MicaelJarniac
Copy link
Author

from rich.pretty import pprint

class A:
    def __init__(self, a=None):
        self.a = a

    def __repr__(self) -> str:
        return f"A(a={self.a!r}, b=1.23, c=None, d=(1, 2))"

pprint([A("<"), A(">")])

image

@willmcgugan
Copy link
Collaborator

You're not always going to be able to parse things perfectly. REPRs aren't a formal grammar, and there will always be output which break it.

Happy to accept a PR that improves on it though. One thing to be aware of is performance. You might print a lot of text with Rich, and overly expensive regexes can be a problem.

@MicaelJarniac
Copy link
Author

I'll see if there's an easy fix for this second problem I've encountered. The fix for the first one seems to work alright and is already on this PR.

@MicaelJarniac
Copy link
Author

Well, simply removing the tag_contents group entirely does fix it. I'll see what other side effects it has. The style for repr.tag_contents is just Style(color="default").

@MicaelJarniac
Copy link
Author

I found where the tests for this are, and will try to adapt them.
I'll make this PR a draft for now, the tests need some work and there's this other problem I just discovered today.

@MicaelJarniac MicaelJarniac marked this pull request as draft July 11, 2023 16:14
@MicaelJarniac
Copy link
Author

Found a bug with my implementation regarding recursive tags. This should do the trick, but I'm yet to actually test it:

(?P<tag_start><)(?P<tag_name>[-\w.:|]*)(?P<tag_contents>[^><]+|(?R))*+(?P<tag_end>>)

@MicaelJarniac
Copy link
Author

MicaelJarniac commented Jul 11, 2023

Before

image

After

image

@MicaelJarniac
Copy link
Author

Nevermind, it seems like Python's re doesn't support recursion.

@MicaelJarniac
Copy link
Author

What I'm currently doing to patch it locally is this:

# Fix Rich repr tags
# https://github.com/Textualize/rich/pull/3025
from rich.default_styles import DEFAULT_STYLES
from rich.highlighter import ReprHighlighter
from rich.style import Style

DEFAULT_STYLES["repr.tag_contents"] = Style.null()

ReprHighlighter.highlights[0] = r"(?P<tag_start><)(?P<tag_name>[-\w.:|]*)(?P<tag_contents>[^<]*?)(?P<tag_end>>)"

@willmcgugan
Copy link
Collaborator

Closing, assumed stale. But if you are still working on it, feel free to reopen.

@willmcgugan willmcgugan closed this Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Colors break if there's an < in the repr
2 participants