feat(rust,python): add new str.find expression, returning the index of a regex pattern or literal substring #13561
nice!! Returning those sentinel values is very dangerous because it hides errors (if you are not 100% familiar with the function you might miss it because there is no Error/None)! Rust (using Option/Result) and polars generally handle this very well! Or is there anything I'm missing? |
If you do that then you can't distinguish between not finding the pattern in a valid string and applying it to a null. For example, calling "find" against a string in Python also returns -1 when the substring is not found: "abc".find("x") gives -1. SQL's |
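As a minimal sketch of the distinction described above, the following uses only Python's built-in `str.find`. The helper name `find_or_null` and the sample values are hypothetical, chosen purely for illustration; this is not the polars implementation.

```python
# Python's built-in str.find returns -1 as a "not found" sentinel.
hit = "abc".find("b")
miss = "abc".find("x")


def find_or_null(value, substring):
    """Hypothetical helper: propagate a null input, otherwise defer to str.find."""
    if value is None:
        return None
    return value.find(substring)


# With a sentinel, "not found" (-1) stays distinct from "input was null" (None).
results = [find_or_null(v, "x") for v in ["abc", "xyz", None]]
print(hit, miss, results)
```

If "not found" were also mapped to `None`, the first and third entries of `results` would be indistinguishable without an extra null check on the input.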
ah I see, thanks for explaining 😉 |
I'm not completely convinced that returning -1 when the match is not found is superior to returning null. The distinguishing argument is kind of moot since you can just use |
True enough; it's not a hill I'd die on - there's also an @ritchie46, want to tie-break this one? There are pros/cons to both (I like distinguishing between "not found" and "applied to null" at a glance), but @orlp's point about combining with offsets is quite compelling 🤔 |
@alexander-beedie Another argument in favour of null is if you try to compute any statistics on the found index, e.g. 'average text length before the first link' with |
I feel myself being persuaded 🤣 |
Even more disastrous is the minimum, e.g. trying to find the webpage with the shortest header: |
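The effect on aggregations can be sketched in plain Python. This is an illustration with made-up index values, not the snippet from the original comments; the `non_null` helper is hypothetical and only mimics a null-aware aggregation that skips missing values.

```python
from statistics import mean

# Hypothetical "found index" values for four documents; the third has no match.
with_sentinel = [120, 80, -1, 100]
with_null = [120, 80, None, 100]


def non_null(values):
    """Drop missing values, as a null-aware aggregation would."""
    return [v for v in values if v is not None]


# The -1 sentinel is silently treated as a real position and skews the result.
sentinel_mean = mean(with_sentinel)
sentinel_min = min(with_sentinel)

# With nulls, "not found" rows simply drop out of the statistic.
null_mean = mean(non_null(with_null))
null_min = min(non_null(with_null))

print(sentinel_mean, null_mean, sentinel_min, null_min)
```

With the sentinel, the minimum is the sentinel itself rather than any real position, which is the failure mode described above.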
The tipping point has been reached; I'm changing it 😆 (For bonus points the result type can be downsized from an i64 to a u32 now it doesn't have to handle a negative value). |
Done; further streamlined a few things while I was at it... |
Implements expressified str.find string functionality for Expr and Series, returning the index/position of a given pattern (regex or literal) in the underlying string. Returns None (updated; previously -1) if the pattern is not found.

(Also closes #13552 by adding a note about requiring a capture group in the given regex in the extract and extract_groups docstrings).

Example