Enable some repetitions for \A and \Z #5349

Merged

NVnavkumar
Collaborator

Fixes #4800 (which was also partially fixed by PR#5319 when \Z was finally re-enabled).

This enables \A and \Z in some repetition sequences, extending GPU support for those escape sequences in regular expressions. The following are now supported:

  • + applied to \A or \Z
  • {n} or {n,} or {n,m} where n > 0

NOTE:

  • \A* and \A{...} can be transpiled to \A?; however, cuDF does not yet support \A?, so *, ? and {0,} will all still fall back to the CPU.
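
The rules above can be sketched as a small decision function. This is only an illustration, not the plugin's transpiler: the types `Quantifier`, `Outcome` and the object `AnchorRepetition` are hypothetical names. It also assumes that a repetition with a minimum of at least one reduces to the bare anchor, which the description does not spell out but which holds semantically because \A and \Z are zero-width.

```scala
// Hypothetical mini-model of the quantifiers discussed in this PR.
// None of these names come from the spark-rapids code base.
sealed trait Quantifier
case object OneOrMore extends Quantifier                                 // +
case object ZeroOrMore extends Quantifier                                // *
case object ZeroOrOne extends Quantifier                                 // ?
final case class Bounded(min: Int, max: Option[Int]) extends Quantifier  // {n}, {n,}, {n,m}

sealed trait Outcome
final case class RunOnGpu(pattern: String) extends Outcome
final case class FallBackToCpu(reason: String) extends Outcome

object AnchorRepetition {
  // Decide what to do with a repeated \A or \Z.
  def rewrite(anchor: Char, q: Quantifier): Outcome = {
    require(anchor == 'A' || anchor == 'Z', "only \\A and \\Z are modelled here")
    // A zero-width anchor repeated at least once matches exactly where the
    // single anchor matches, so the repetition can be dropped (assumption:
    // this is the rewrite applied for +, {n}, {n,} and {n,m} with n > 0).
    val atLeastOnce = q match {
      case OneOrMore               => true
      case Bounded(min, _)         => min > 0
      case ZeroOrMore | ZeroOrOne  => false
    }
    if (atLeastOnce) RunOnGpu("\\" + anchor)
    // A minimum of zero would need \A? or \Z?, which cuDF does not support yet.
    else FallBackToCpu("\\" + anchor + "? is not yet supported by cuDF")
  }
}
```

For example, `AnchorRepetition.rewrite('A', OneOrMore)` yields `RunOnGpu("\\A")`, while `AnchorRepetition.rewrite('A', ZeroOrMore)` yields a `FallBackToCpu` value.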

Signed-off-by: Navin Kumar <navink@nvidia.com>
@NVnavkumar NVnavkumar self-assigned this Apr 28, 2022
@sameerz sameerz added the feature request New feature or request label Apr 28, 2022
@sameerz sameerz added this to the Apr 18 - Apr 29 milestone Apr 28, 2022
@NVnavkumar
Collaborator Author

build

@NVnavkumar NVnavkumar requested a review from andygrove May 2, 2022 22:25
Comment on lines +865 to +866
case (RegexEscaped('A'), '+') |
(RegexSequence(ListBuffer(RegexEscaped('A'))), '+') =>
Contributor

This pattern is repeated a few times. I wonder if it is worth introducing a utility function that can simplify expressions to remove redundant list buffers?
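
One possible shape for such a utility, as a rough sketch only. The AST below is a simplified stand-in that borrows the names `RegexEscaped` and `RegexSequence` from the snippet above; the real classes in the plugin have more cases and fields, and `Simplify.unwrap` and `isRepeatedStartAnchor` are hypothetical helpers, not something this PR adds.

```scala
import scala.annotation.tailrec
import scala.collection.mutable.ListBuffer

// Simplified stand-in AST; the real plugin classes differ.
sealed trait RegexAST
final case class RegexEscaped(c: Char) extends RegexAST
final case class RegexChar(c: Char) extends RegexAST
final case class RegexSequence(parts: ListBuffer[RegexAST]) extends RegexAST

object Simplify {
  // Strip sequences that wrap exactly one term, however deeply nested,
  // so callers only need to match on the inner node.
  @tailrec
  def unwrap(ast: RegexAST): RegexAST = ast match {
    case RegexSequence(ListBuffer(single)) => unwrap(single)
    case other => other
  }
}

// With the helper, the two-alternative pattern collapses to one case.
def isRepeatedStartAnchor(term: RegexAST, quantifier: Char): Boolean =
  (Simplify.unwrap(term), quantifier) match {
    case (RegexEscaped('A'), '+') => true
    case _ => false
  }
```

Sequences with zero or several terms are returned unchanged, so the helper only removes wrappers that carry no information.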

Contributor

@andygrove andygrove left a comment

LGTM. I left one suggestion, but it is not critical for this PR.

@NVnavkumar NVnavkumar merged commit 1e3a9a3 into NVIDIA:branch-22.06 May 3, 2022
Labels
feature request New feature or request
Development

Successfully merging this pull request may close these issues.

[FEA] Enable support for more regular expressions with \A and \Z
3 participants