invalid character index in findfirst #15723

Closed
Keno opened this issue Mar 31, 2016 · 7 comments
Labels: error handling (handling of exceptions by Julia or the user), unicode (related to Unicode characters and encodings)

Keno (Member) commented Mar 31, 2016

julia> findfirst("⨳(",'(')
ERROR: UnicodeError: invalid character index
 in next(::UTF8String, ::Int64) at ./version.jl:106
 [inlined code] from ./strings/basic.jl:37
 in findnext(::UTF8String, ::Char, ::Int64) at ./array.jl:664
 in findfirst(::UTF8String, ::Char) at ./array.jl:670
 in eval(::Module, ::Any) at ./docs/bootstrap.jl:69

vtjnash (Sponsor Member) commented Apr 1, 2016

It looks like it's trying to treat the string as an array, while also assuming that arrays are indexed as 1:1:length(array) (ref #15434 (comment)).
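
To make that diagnosis concrete, here is a small sketch, assuming Julia 1.x (the issue predates 1.0, where the error type and some function names differed). It suggests why stepping through 1:length(s) fails for the string in the report: the valid indices are byte offsets, not consecutive integers.

```julia
s = "⨳("                         # '⨳' (U+2A33) occupies three bytes in UTF-8

n_chars = length(s)               # number of characters
n_units = ncodeunits(s)           # number of bytes (code units)
valid   = collect(eachindex(s))   # the indices at which a character starts

# Walking 1:length(s) as if the string were an array visits index 2,
# which falls in the middle of '⨳' and is not a valid character index.
midchar_ok = isvalid(s, 2)

# Indexing there throws. On Julia 1.x the error is a StringIndexError;
# the build in this issue reported a UnicodeError instead.
threw = try
    s[2]
    false
catch err
    err isa StringIndexError
end
```

Here `n_chars` is 2 while the only valid indices are 1 and 4, so any loop that assumes indices 1 and 2 will hit the invalid one.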

JeffBezanson (Sponsor Member) commented
Those functions might need an ::AbstractArray on the first argument. Right now they're Any.
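
A minimal sketch of what such a restriction could look like, assuming Julia 1.x. The name `array_findnext` is hypothetical and this is not the Base implementation; it also assumes a 1-based array for simplicity.

```julia
# Hypothetical helper: the first argument is restricted to AbstractArray,
# so a String never reaches the array-style indexing loop.
function array_findnext(A::AbstractArray, v, start::Integer)
    for i in start:length(A)        # linear indices, assuming 1-based arrays
        A[i] == v && return i
    end
    return nothing                  # Julia 1.x convention for "not found"
end

hit  = array_findnext([10, 20, 30], 30, 1)
miss = array_findnext([10, 20, 30], 5, 1)

# With the restriction, no method applies to a String at all, so the call
# would fail with a MethodError instead of an index error mid-search.
accepts_string = applicable(array_findnext, "⨳(", '(', 1)
```

The trade-off is that strings lose the function entirely unless a separate string method is added.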

Keno (Member, Author) commented Apr 1, 2016

There is search for strings, so restricting the signature would be fine.

nalimilan (Member) commented
Restricting the signature sounds like the best solution, and at some point we may want to add a specialized version for AbstractString to unify the API (#5664, #10593).

nalimilan (Member) commented
That said, it looks like it should be easy to generalize to any iterable and index type, returning the state from next as the index. Maybe we should decide that the iteration state must be usable as an index when a type supports indexing? Or create an Indexable trait to which this would apply?
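
One way to picture the generalization, as a sketch only and assuming Julia 1.x: iterate over index/value pairs so that the returned index is always one the collection itself considers valid. The name `generic_findfirst` is hypothetical, and this uses `pairs` rather than the iteration state proposed above.

```julia
# Hypothetical generic search over anything that supports `pairs`.
# The index comes from the collection, so it is valid by construction.
function generic_findfirst(x, itr)
    for (i, v) in pairs(itr)
        v == x && return i
    end
    return nothing
end

in_string = generic_findfirst('(', "⨳(")      # byte index of '('
in_array  = generic_findfirst(30, [10, 20, 30])
</gr_insert_placeholder>
```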

KristofferC added the error handling and unicode labels Jan 22, 2017

Keno (Member, Author) commented Aug 25, 2017

Ran into this again:

julia> findnext("λ\n",'\n',1)
ERROR: UnicodeError: invalid character index
Stacktrace:
 [1] slow_utf8_next(::Ptr{UInt8}, ::UInt8, ::Int64, ::Int64) at ./strings/string.jl:172
 [2] next at ./strings/string.jl:204 [inlined]
 [3] getindex(::String, ::Int64) at ./strings/basic.jl:32
 [4] findnext(::String, ::Char, ::Int64) at ./array.jl:1213

julia> search("λ\n",'\n',1)
3

julia> findnext("a\n",'\n',1)
2

julia> search("a\n",'\n',1)
2

Putting this on the 1.0 milestone, since restricting the find functions after 1.0 would be breaking.

Keno added this to the 1.0 milestone Aug 25, 2017

StefanKarpinski (Sponsor Member) commented

Would potentially be fixed by @JeffBezanson's proposal in #10593.

JeffBezanson added a commit that referenced this issue Sep 5, 2017
Makes findfirst, findlast, findnext, and findprev more generic
by using endof, nextind, and prevind. Also adds a `nextind` method
for arrays and `CartesianIndex`.
JeffBezanson added a commit that referenced this issue Sep 6, 2017
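
The approach described in the commit message can be sketched roughly as follows, assuming Julia 1.x, where `endof` has been renamed `lastindex`. The name `sketch_findnext` is hypothetical and this is not the code from the commit; it only illustrates stepping with `nextind` instead of `i + 1`.

```julia
# Hypothetical sketch: advance with nextind so the loop only ever visits
# indices that are valid for the collection (strings or arrays).
function sketch_findnext(A, v, start)
    i = start
    stop = lastindex(A)
    while i <= stop
        A[i] == v && return i
        i = nextind(A, i)           # next valid index, not necessarily i + 1
    end
    return nothing                  # Julia 1.x convention for "not found"
end

r_lambda = sketch_findnext("λ\n", '\n', 1)    # the failing case from Aug 25, 2017
r_paren  = sketch_findnext("⨳(", '(', 1)      # the original report
r_array  = sketch_findnext([1, 2, 3], 3, 1)
```

With this stepping, both strings from the reports return the byte index of the match instead of throwing.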