BUG: fix str.replace('.','') should replace every character #24935
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -421,7 +421,7 @@ def str_endswith(arr, pat, na=np.nan): | |
return _na_map(f, arr, na, dtype=bool) | ||
|
||
|
||
-def str_replace(arr, pat, repl, n=-1, case=None, flags=0, regex=True): | ||
+def str_replace(arr, pat, repl, n=-1, case=None, flags=0, regex=None): | ||
r""" | ||
Replace occurrences of pattern/regex in the Series/Index with | ||
some other string. Equivalent to :meth:`str.replace` or | ||
|
@@ -452,9 +452,13 @@ def str_replace(arr, pat, repl, n=-1, case=None, flags=0, regex=True): | |
flags : int, default 0 (no flags) | ||
- re module flags, e.g. re.IGNORECASE | ||
- Cannot be set if `pat` is a compiled regex | ||
-regex : bool, default True | ||
+regex : boolean, default None | ||
Review comment: default is still True
Review comment: so, you mean we now don't raise a warning anymore when char is a special single character while user doesn't explicitly specify pd.Series(['aa']).str.replace('.', 'b') we will get:
|
||
- If True, assumes the passed-in pattern is a regular expression. | ||
- If False, treats the pattern as a literal string | ||
Review comment: this is a disconcerting change. You are essentially having a different default on what is being passed here. I would be ok with forcing the passing of
Review comment: Yes, I understand, @jreback, that it's quite a disruptive change. This change follows the proposal from @TomAugspurger in #24809 (comment). The purpose is to make this argument more explicit.
Review comment: @jreback can you clarify what you mean by "forcing"? If we raise when My proposal is to preserve the previous behavior, but warn when a length-1 regex is detected. Then we can get to the documented behavior of |
||
+- If `pat` is a single character and `regex` is not specified, `pat` | ||
+is interpreted as a string literal. If `pat` is also a regular | ||
+expression symbol, a warning is issued that in the future `pat` | ||
+will be interpreted as a regex, rather than a literal. | ||
- Cannot be set to False if `pat` is a compiled regex or `repl` is | ||
a callable. | ||
|
||
|
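To make the docstring distinction concrete, here is a small standard-library sketch (not pandas itself) of how the same single-character pattern `'.'` behaves as a regex versus as a literal. It relies only on `re.sub` and `str.replace`, the two primitives the diff dispatches to; the sample strings are illustrative, apart from the `'aa'` case taken from the review thread.

```python
import re

text = "a.b"

# Regex interpretation: '.' matches any character except a newline,
# so every character is replaced.
as_regex = re.sub(".", "", text)

# Literal interpretation: only the actual '.' character is replaced.
as_literal = text.replace(".", "")

print(repr(as_regex))    # ''
print(repr(as_literal))  # 'ab'

# The case raised in the review thread: 'aa' with '.' replaced by 'b'.
regex_aa = re.sub(".", "b", "aa")
literal_aa = "aa".replace(".", "b")
print(regex_aa)    # bb
print(literal_aa)  # aa
```

The gap between `'bb'` and `'aa'` is the ambiguity the `regex` argument is meant to resolve.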
@@ -561,7 +565,7 @@ def str_replace(arr, pat, repl, n=-1, case=None, flags=0, regex=True): | |
# add case flag, if provided | ||
if case is False: | ||
flags |= re.IGNORECASE | ||
-if is_compiled_re or len(pat) > 1 or flags or callable(repl): | ||
+if is_compiled_re or pat or flags or callable(repl): | ||
n = n if n >= 0 else 0 | ||
compiled = re.compile(pat, flags=flags) | ||
f = lambda x: compiled.sub(repl=repl, string=x, count=n) | ||
|
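The `n = n if n >= 0 else 0` line in this hunk bridges a mismatch between the two replacement primitives: `str.replace` treats a negative count as "replace all", while `re.sub` uses `count=0` for the same thing. A minimal standard-library sketch follows; `normalize_count` is a hypothetical helper name used only for illustration and is not part of pandas.

```python
import re


def normalize_count(n):
    """Map the 'replace all' sentinel of str.replace (-1) to that of re.sub (0)."""
    return n if n >= 0 else 0


compiled = re.compile("a")

# n=-1 means "replace every match" once translated to count=0.
all_replaced = compiled.sub(repl="x", string="banana", count=normalize_count(-1))

# A positive n limits the number of replacements, left to right.
two_replaced = compiled.sub(repl="x", string="banana", count=normalize_count(2))

# The literal path can pass -1 straight through.
literal_all = "banana".replace("a", "x", -1)

print(all_replaced)  # bxnxnx
print(two_replaced)  # bxnxna
print(literal_all)   # bxnxnx
```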
@@ -574,6 +578,12 @@ def str_replace(arr, pat, repl, n=-1, case=None, flags=0, regex=True): | |
if callable(repl): | ||
raise ValueError("Cannot use a callable replacement when " | ||
"regex=False") | ||
+# if regex is default None, and a single special character is given | ||
+# in pat, still take it as a literal, and raise the Future warning | ||
+if regex is None and len(pat) == 1 and pat in list(r"[\^$.|?*+()]"): | ||
+warnings.warn("'{}' is interpreted as a literal in ".format(pat) + | ||
+"default, not regex. It will change in the future.", | ||
+FutureWarning) | ||
f = lambda x: x.replace(pat, repl, n) | ||
|
||
return _na_map(f, arr) | ||
|
Review comment: you can leave this here, but need a note on the Deprecation warning change.
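The behavior discussed in this thread (keep treating a single character as a literal when `regex` is left unspecified, but emit a `FutureWarning` if that character is a regex metacharacter) can be sketched outside pandas as follows. This is an assumption-laden illustration rather than the PR's actual implementation: `make_replacer` and `SPECIAL_CHARS` are hypothetical names, the warning text is reworded, and handling of `case`, `flags`, compiled patterns, and callable `repl` is omitted.

```python
import re
import warnings

# Same character set as the check in the diff: [ \ ^ $ . | ? * + ( ) ]
SPECIAL_CHARS = set(r"[\^$.|?*+()]")


def make_replacer(pat, repl, n=-1, regex=None):
    """Hypothetical standalone version of the literal-vs-regex dispatch."""
    if regex is None and len(pat) == 1:
        # Unspecified regex + single character: keep the literal behavior,
        # but warn when the character would mean something as a regex.
        if pat in SPECIAL_CHARS:
            warnings.warn(
                "{!r} is currently treated as a literal, not a regex; "
                "this will change in the future.".format(pat),
                FutureWarning,
                stacklevel=2,
            )
        return lambda x: x.replace(pat, repl, n)
    if regex is False:
        return lambda x: x.replace(pat, repl, n)
    # regex=True, or regex unspecified with a multi-character pattern.
    compiled = re.compile(pat)
    count = n if n >= 0 else 0
    return lambda x: compiled.sub(repl, x, count=count)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = make_replacer(".", "")("a.b")

explicit_regex = make_replacer(".", "", regex=True)("a.b")
explicit_literal = make_replacer(".", "", regex=False)("a.b")

print(repr(default_result), len(caught))  # 'ab' 1
print(repr(explicit_regex))               # ''
print(repr(explicit_literal))             # 'ab'
```

Passing `regex=True` or `regex=False` explicitly skips the warning, which is the migration path a deprecation note would point users toward.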