re.sub does NOT substitute all the matching patterns when re.IGNORECASE is used #85830

Closed
anitrajpurohit28 mannequin opened this issue Aug 29, 2020 · 2 comments
Labels
3.8 (EOL) end of life topic-regex type-bug An unexpected behavior, bug, or error

Comments


anitrajpurohit28 mannequin commented Aug 29, 2020

BPO 41664
Nosy @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-08-29.22:32:24.365>
created_at = <Date 2020-08-29.21:47:33.268>
labels = ['expert-regex', 'invalid', 'type-bug', '3.8']
title = 're.sub does NOT substitute all the matching patterns when re.IGNORECASE is used'
updated_at = <Date 2020-08-29.22:32:24.364>
user = 'https://bugs.python.org/anitrajpurohit28'

bugs.python.org fields:

activity = <Date 2020-08-29.22:32:24.364>
actor = 'mrabarnett'
assignee = 'none'
closed = True
closed_date = <Date 2020-08-29.22:32:24.365>
closer = 'mrabarnett'
components = ['Regular Expressions']
creation = <Date 2020-08-29.21:47:33.268>
creator = 'anitrajpurohit28'
dependencies = []
files = []
hgrepos = []
issue_num = 41664
keywords = []
message_count = 2.0
messages = ['376083', '376086']
nosy_count = 3.0
nosy_names = ['ezio.melotti', 'mrabarnett', 'anitrajpurohit28']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue41664'
versions = ['Python 3.8']


anitrajpurohit28 mannequin commented Aug 29, 2020

Using re flags gives inconsistent results between these two cases:

  1. The pattern is passed directly to re.sub
  2. The pattern is re.compile'd and then used

Note 1: The input string is all lowercase: 'all is fair in love and war'
Note 2: Results are always consistent with the re.compile'd pattern

=======================================
1. The pattern directly used in re.sub
=======================================

>>> import re
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE)
'#ll #s fair in love and war'
>>> 
>>> re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE|re.DOTALL)
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> 

=======================================
2. The pattern is re.compile'd and used
=======================================

>>> pattern = re.compile(r'[aeiou]', re.IGNORECASE)
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> pattern = re.compile(r'[aeiou]')
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'
>>> 
>>> pattern = re.compile(r'[aeiou]', re.IGNORECASE | re.DOTALL)
>>> re.sub(pattern, '#', 'all is fair in love and war')
'#ll #s f##r #n l#v# #nd w#r'

@anitrajpurohit28 anitrajpurohit28 mannequin added 3.8 (EOL) end of life topic-regex type-bug An unexpected behavior, bug, or error labels Aug 29, 2020

mrabarnett mannequin commented Aug 29, 2020

The 4th argument of re.sub is 'count', not 'flags'.

re.IGNORECASE has the numeric value of 2, so:

    re.sub(r'[aeiou]', '#', 'all is fair in love and war', re.IGNORECASE)

is equivalent to:

    re.sub(r'[aeiou]', '#', 'all is fair in love and war', count=2)
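
To make the point above concrete, here is a minimal sketch of the difference between the count slot and the flags keyword. It relies on the documented signature re.sub(pattern, repl, string, count=0, flags=0); the variable names (text, capped, fixed) are only illustrative.

```python
import re

text = 'all is fair in love and war'

# The flag constants are integer-valued, so re.sub accepts them in the
# 4th positional slot, where they are read as count.
assert int(re.IGNORECASE) == 2
assert int(re.IGNORECASE | re.DOTALL) == 18  # 2 | 16

# What the positional call in the report effectively did:
# stop after 2 replacements.
capped = re.sub(r'[aeiou]', '#', text, count=int(re.IGNORECASE))
assert capped == '#ll #s fair in love and war'

# Passing the flag by keyword applies it as a flag, with no limit
# on the number of replacements.
fixed = re.sub(r'[aeiou]', '#', text, flags=re.IGNORECASE)
assert fixed == '#ll #s f##r #n l#v# #nd w#r'

# With mixed-case input the flag makes a visible difference.
mixed = re.sub(r'[aeiou]', '#', 'All Is Fair', flags=re.IGNORECASE)
assert mixed == '#ll #s F##r'
```

This also accounts for the third call in the report: re.IGNORECASE|re.DOTALL evaluates to 18, so it acted as count=18, which is more than the 9 vowels in the input, and every vowel was replaced. The re.compile'd examples are unaffected because re.compile takes flags as its second argument.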
