Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept AbstractChar in count and findall (::Char, ::String) #38675

Merged
merged 22 commits into from
Dec 11, 2020
Merged

Accept AbstractChar in count and findall (::Char, ::String) #38675

merged 22 commits into from
Dec 11, 2020

Conversation

lmiq
Copy link
Contributor

@lmiq lmiq commented Dec 3, 2020

Followup from this thread:

https://discourse.julialang.org/t/method-for-counting-characters-in-string-count-a-batman-vs-count-a-batman/51153

Seems that accepting AbstractChar in this function is enough to solve that problem.

test/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Show resolved Hide resolved
@lmiq lmiq changed the title Accept AbstractChar in method count(t, s::String) Accept AbstractChar in count and findall (::Char, ::String) Dec 3, 2020
test/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Outdated Show resolved Hide resolved
@stevengj stevengj added the strings "Strings!" label Dec 3, 2020
@stevengj
Copy link
Member

stevengj commented Dec 3, 2020

Needs an item in NEWS.md

@stevengj stevengj added the needs news A NEWS entry is required for this change label Dec 3, 2020
@bradeneliason
Copy link

Checking for overlap isn't much of an overhead, but thoughts on creating a specialized method for Base.count(t::AbstractChar, s::AbstractString) that's marginally faster?

@StefanKarpinski
Copy link
Member

I would measure if there's anything to be gained first. The compiler is pretty aggressive with constant propagation, so the specialization might already be happening. Even if it doesn't end up specializing at compile time, LLVM should be able to see that the overlap flag is never changed and potentially split the loop into two different versions. Even if that doesn't happen, the branch will be perfectly predictable, so there might be no cost to checking the flag.

@lmiq
Copy link
Contributor Author

lmiq commented Dec 5, 2020

I benchmarked here versions with and without the overlap option, and I could not detect any difference reliably. The fluctuations on the benchmarks, even for long strings (ten sentences or so) are much greater than any possible differences in performance, if there are any.

The code_llvm of both seemed to be the same, in the sense that I could not find signs of a conditional in one case that was not present in the other, although I am not much qualified to understand that code.

@StefanKarpinski
Copy link
Member

That seems definitive enough — no need to special case it.

@StefanKarpinski
Copy link
Member

Btw, you've got some whitespace issues that need to be fixed.

@lmiq
Copy link
Contributor Author

lmiq commented Dec 5, 2020

I think all is fixed now. Thank you for your help. I hope in the future I will be able to contribute to something more substantial and needing less feedback :-). Nice to learn about how collaborations work at this level.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Dec 5, 2020

No problem at all! It's amazing how much fiddling around goes into seemingly simple changes. You have no idea how many times I've started my day thinking "well, this should be simple" only to have it take up the next several weeks 😬

@lmiq
Copy link
Contributor Author

lmiq commented Dec 9, 2020

Apparently if this is to be released in 1.6, it needs a backport label.

@KristofferC
Copy link
Member

New feature, doesn't go into 1.6.

@lmiq
Copy link
Contributor Author

lmiq commented Dec 9, 2020

New feature, doesn't go into 1.6.

The compat entries we wrote above are of the form

!!! compat "Julia 1.6"
      Using a character as the pattern requires at least Julia 1.6.

Is that correct?

@KristofferC
Copy link
Member

No, it should be changed to 1.7.

@lmiq
Copy link
Contributor Author

lmiq commented Dec 9, 2020

I am not sure if @StefanKarpinski wants to comment, as he added the compat entry for the first time. Otherwise I can change that right away.

@stevengj
Copy link
Member

stevengj commented Dec 9, 2020

The old compat entry was added before 1.6 branched off. We are now on 1.7-dev.

@stevengj
Copy link
Member

stevengj commented Dec 9, 2020

Also needs a NEWS.md entry.

Copy link
Contributor Author

@lmiq lmiq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated compat entries

base/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Outdated Show resolved Hide resolved
base/regex.jl Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@stevengj stevengj removed the needs news A NEWS entry is required for this change label Dec 9, 2020
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
@StefanKarpinski StefanKarpinski merged commit 26b73d6 into JuliaLang:master Dec 11, 2020
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
…g#38675)

This change solves an issue raised in this post, in which searching for a char in a string results in a Method error: https://discourse.julialang.org/t/method-for-counting-characters-in-string-count-a-batman-vs-count-a-batman/51153

Co-authored-by: Stefan Karpinski <stefan@karpinski.org>
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants