add method to replace that takes a function RegexMatch -> String #41051

Open: 3 commits to be merged into base branch master

@epithet epithet commented Jun 1, 2021

This adds the possibility to perform regex string replacements with an arbitrary function of the matching string and the capture groups as proposed in #36293:

replace("ax ay bx by", r"([ab])([xy])" =>
    RegexReplacer(m -> uppercase(m[1]) * m[2]))
# Ax Ay Bx By

I chose the name RegexReplacer for consistency with the related types Regex and RegexMatch and because I find it easy to remember that replace takes a "replacer". See also Rust's trait regex::Replacer. But this is of course up for debate and other names have been proposed.
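For comparison, here is a minimal sketch of how the same result can be approximated with the existing `replace(s, regex => function)` method, which only passes the matched substring to the function. The helper name `upcase_first_group` is made up for illustration. The trick is to re-run `match` on the substring to recover the capture groups, which costs a second regex match per replacement and can behave differently for patterns that depend on surrounding context (lookahead, lookbehind, anchors):

```julia
# Pattern with two capture groups, as in the example above.
const RE = r"([ab])([xy])"

# Hypothetical helper: receives only the matched substring (that is what the
# existing Regex => Function method passes), so it re-matches to get the groups.
function upcase_first_group(matched::AbstractString)
    m = match(RE, matched)
    return uppercase(m[1]) * m[2]
end

result = replace("ax ay bx by", RE => upcase_first_group)
println(result)
```

This is only a workaround under the stated assumptions, not what the PR implements; `RegexReplacer` hands the `RegexMatch` to the function directly and avoids the second match.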

I proposed a different API in #24598 which would solve both issues at the same time:

replace("ax ay bx by", r"([ab])([xy])") do m
    uppercase(m[1]) * m[2]
end

But this hasn't gotten any feedback yet. I tried to give some pros and cons for both variants in #36293 and I'm happy with either/both solutions. Update: in light of #40484, I don't think the do-block variant makes sense anymore as an alternative.
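To make the do-block shape concrete, here is a rough sketch of a standalone helper that accepts the function as its first argument, so it can be called with `do` syntax. The name `replace_matches` is hypothetical and is not part of Base or of this PR; it is built only on `eachmatch` and does not add a method to `Base.replace`:

```julia
# Hypothetical helper (not in Base): replace every match of `re` in `s`
# with `f(m)`, where `m` is the RegexMatch including its capture groups.
function replace_matches(f, s::AbstractString, re::Regex)
    io = IOBuffer()
    pos = firstindex(s)
    for m in eachmatch(re, s)
        # Copy the unmatched text between the previous match and this one.
        print(io, SubString(s, pos, prevind(s, m.offset)))
        # Append the replacement computed from the full match object.
        print(io, f(m))
        # Continue right after the end of this match.
        pos = m.offset + ncodeunits(m.match)
    end
    # Copy whatever follows the last match.
    print(io, SubString(s, pos))
    return String(take!(io))
end

out = replace_matches("ax ay bx by", r"([ab])([xy])") do m
    uppercase(m[1]) * m[2]
end
println(out)
```

Because the function occupies the first positional slot, this form can only take a single pattern, which is one reason it fits poorly next to the multi-pattern `replace` from #40484.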

TODO:

  • Documentation

Fixes #36293

@epithet epithet force-pushed the replace-capture-groups branch from 4c540ad to 0fa195d Compare June 4, 2021 15:37

epithet commented Jun 4, 2021

The simpler method is still worth keeping for simple cases; in this benchmark it is faster and allocates less:

julia> @time map(s -> replace(s, r"world" => uppercase),
                 repeat(["hello, world"], 10_000_000))
  5.487882 seconds (70.05 M allocations: 3.430 GiB, 25.61% gc time, 0.44% compilation time)
10000000-element Vector{String}:
 "hello, WORLD"
 
julia> @time map(s -> replace(s, r"world" => RegexReplacer(m->uppercase(m.match))),
                 repeat(["hello, world"], 10_000_000))
  8.028345 seconds (100.05 M allocations: 5.517 GiB, 29.23% gc time, 0.35% compilation time)
10000000-element Vector{String}:
 "hello, WORLD"
 


epithet commented Jun 4, 2021

#40484 makes the do-block variant even more inconsistent and less useful. So it's at best a convenience add-on, but doesn't make sense as an alternative to me anymore.

@epithet epithet force-pushed the replace-capture-groups branch 2 times, most recently from bca4908 to 37bfd0a Compare June 6, 2021 05:37
@epithet epithet force-pushed the replace-capture-groups branch from 37bfd0a to 9f1c9f9 Compare June 9, 2021 11:32
Successfully merging this pull request may close these issues.

Add a "SubstitutionFunction" for replace