count(regex, string) computes length(findall(regex, string)) better #32849

Merged
StefanKarpinski merged 1 commit (70586ce) into master from sk/count-regex
Aug 10, 2019

Conversation

StefanKarpinski
Member

@StefanKarpinski StefanKarpinski commented Aug 9, 2019

Sometimes you just want to count how many times something occurs. I basically just copied the findall method above and counted instead of pushing. It could be refactored to share logic, but that seemed like overkill.
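For readers who want to see the shape of "count instead of push", here is a minimal sketch. It assumes the loop structure of the regex `findall` method; `count_matches` is a hypothetical name used for illustration, not the method added to Base by this PR.

```julia
# Hypothetical helper (illustration only, not the merged Base method):
# walks the string with `findnext`, like the regex `findall` method,
# but increments a counter instead of pushing each match range.
function count_matches(t::Regex, s::AbstractString; overlap::Bool=false)
    n = 0
    i, e = firstindex(s), lastindex(s)
    while true
        r = findnext(t, s, i)
        r === nothing && break          # no further match
        n += 1
        # Resume just past the match start (overlapping or empty match)
        # or just past the match end (non-overlapping).
        j = overlap || isempty(r) ? first(r) : last(r)
        j > e && break
        i = nextind(s, j)
    end
    return n
end

# Should agree with the `findall`-based definition from the PR title.
count_matches(r"a", "banana") == length(findall(r"a", "banana"))
```

Passing `overlap=true` lets matches share characters, so `count_matches(r"aa", "aaaa"; overlap=true)` counts more matches than the default.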

@StefanKarpinski
Member Author

StefanKarpinski commented Aug 9, 2019

FreeBSD failure is unrelated (FileWatching).

@StefanKarpinski
Member Author

StefanKarpinski commented Aug 9, 2019

Windows failures also unrelated (frippery.org can't be reached).

@StefanKarpinski
Member Author

I guess an API concern about this is that count(p, itr) counts items in itr that match p whereas this method counts subsequences of itr which match a regex. Does that matter?
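To make the distinction concrete, a small sketch (assuming a Julia version that includes the method from this PR; the variable names are only for illustration):

```julia
s = "banana"

# Predicate form: counts the *elements* (characters) of `s` satisfying the predicate.
n_chars = count(c -> c == 'a', s)

# Regex form (this PR): counts non-overlapping *substrings* of `s` matching the regex.
n_subs = count(r"an", s)

(n_chars, n_subs)
```

Here the first call inspects one character at a time, while the second counts two-character matches, so the two results are not measuring the same kind of thing.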

@JeffBezanson
Member

Well, we also have methods of findall where the first argument is either a Function or a Regex, so that ship has sailed.
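The existing `findall` precedent can be sketched like this (illustrative only; the variable names are assumptions):

```julia
s = "banana"

# Function first argument: returns the indices of elements satisfying the predicate.
idx = findall(c -> c == 'a', s)

# Regex first argument: returns the index ranges of matching substrings.
rngs = findall(r"an", s)

(idx, rngs)
```

So `findall` already switches between per-element indices and substring ranges depending on the type of its first argument, which is the same split the new `count` method introduces.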

@StefanKarpinski
Member Author

Ok, in that case, there's no real reason not to do this.

@StefanKarpinski StefanKarpinski merged commit 70586ce into master Aug 10, 2019
@StefanKarpinski StefanKarpinski deleted the sk/count-regex branch August 10, 2019 14:50
@rfourquet
Member

Sorry, didn't have time to review before merge! (On holidays 😎, but will have a look later)

@tkf
Member

tkf commented Aug 10, 2019

> seemed like overkill.

Alas, if Base had transducers...

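using Transducers
using Transducers: @next, complete

# Foldable over the match ranges of `t` in `s`; `overlap=true` allows overlapping matches.
matches(t, s, overlap=false) = AdHocFoldable() do rf, acc, _
    i, e = firstindex(s), lastindex(s)
    while true
        r = findnext(t, s, i)
        isnothing(r) && return complete(rf, acc)  # no more matches
        acc = @next(rf, acc, r)  # feed the match to the reducing function
        # Resume after the match start (overlapping or empty match) or after its end.
        j = overlap || isempty(r) ? first(r) : last(r)
        j > e && return complete(rf, acc)
        @inbounds i = nextind(s, j)
    end
end

# One shared loop, consumed three ways: count matches, collect all, collect the first three.
count′(args...) = foldl(right, Count(), matches(args...); init=0)
findall′(args...) = foldl(push!, Map(identity), matches(args...); init=UnitRange{Int}[])
find3(args...) = foldl(push!, Take(3), matches(args...); init=UnitRange{Int}[])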
(sorry, cannot help it)
