Enhance API of captures() to enable retrieval of ALL groups at once, as a dictionary #86

Closed
mrabarnett opened this issue Jan 23, 2013 · 5 comments
Labels
enhancement New feature or request trivial

Comments

@mrabarnett
Owner

Original report by Marcin Wojnarski (Bitbucket: mwojnars, GitHub: mwojnars).


Hi,

For non-repeated groups, one can use match.groupdict() to retrieve a dictionary of ALL groups and their values, including unmatched groups. But there is no equivalent for repeated groups: match.captures() only returns values for the groups passed explicitly as arguments, while groupdict() keeps only the last value of each group.
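A minimal sketch of the gap described above, assuming the third-party regex module is installed. The pattern and input string are made up for illustration.

```python
import regex  # third-party module (pip install regex), not the stdlib re

# A repeated named group: "word" matches three times in this input.
m = regex.match(r"(?:(?P<word>\w+)\s*)+", "one two three")

# groupdict() covers every named group but keeps only the last capture.
last_only = m.groupdict()

# captures() returns every capture, but only for the groups named in the call.
all_words = m.captures("word")

print(last_only)
print(all_words)
```

With several repeated groups, each one has to be passed to captures() by name, which is what the request is about.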

I suggest either:

  1. Change the API of captures() so that calling it with no arguments returns a dictionary of ALL groups, not just group 0. This would be the most convenient and intuitive option, but it would break existing code that relies on the current behaviour.

  2. Add a boolean argument to captures(), say "all", defaulting to False, that lets the caller request a full dictionary.

  3. Add a new method, say capturesdict(), that returns a dict of all groups.

Thanks
Marcin

What version of the product are you using? On what operating system?

0.1.20130120
Linux, Python 2.7.2

@mrabarnett
Owner Author

Original comment by Anonymous.


Should the dict behave like this?

capturesdict = {}
for name in m.groupdict().keys():
    capturesdict[name] = m.captures(name)

What's your use case? Could you provide some examples of the suggested feature?

@mrabarnett
Owner Author

Original comment by Marcin Wojnarski (Bitbucket: mwojnars, GitHub: mwojnars).


Yes, it should behave in this way.

Use case: web scraping, i.e. extracting many different values from a complex HTML page in one go (for example, the profile page of a product, with different properties listed in a fixed layout). After applying a regex, the next step is to take *all* extracted data as a dict, not one group at a time.
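A rough sketch of that use case, assuming the third-party regex module. The HTML fragment, the pattern and the group names are all invented for illustration, and the dict is built with the loop suggested earlier in the thread.

```python
import regex  # third-party module (pip install regex), not the stdlib re

# Made-up product markup; a real page would be messier.
html = (
    "<h1>Widget</h1>"
    "<li>red</li><li>green</li>"
    "<span>9.99</span><span>12.50</span>"
)

pattern = regex.compile(
    r"<h1>(?P<name>[^<]+)</h1>"
    r"(?:<li>(?P<color>[^<]+)</li>)*"
    r"(?:<span>(?P<price>[^<]+)</span>)*"
)

m = pattern.match(html)

# Collect every capture of every named group in one dict.
extracted = {name: m.captures(name) for name in m.groupdict()}
print(extracted)
```

Each value is a list, so single-valued fields such as the name come back as one-element lists.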

@mrabarnett
Owner Author

Original comment by Anonymous.


Could you provide some simple test cases?

I think it'll be called 'capturesdict'.

@mrabarnett
Owner Author

Original comment by Anonymous.


I've added a 'capturesdict' method to match objects in regex 0.1.20130124.
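For reference, a short usage sketch of the new method, assuming a regex release that includes it (0.1.20130124 or later, per the comment above). The pattern and input are illustrative only.

```python
import regex  # third-party module (pip install regex), not the stdlib re

m = regex.match(
    r"(?:(?P<word>\w+) (?P<digits>\d+)\n)+",
    "one 1\ntwo 2\nthree 3\n",
)

# groupdict() keeps only the last capture of each named group.
last_values = m.groupdict()

# capturesdict() returns every capture of every named group.
all_values = m.capturesdict()

print(last_values)
print(all_values)
```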

@mrabarnett
Owner Author

Original comment by Marcin Wojnarski (Bitbucket: mwojnars, GitHub: mwojnars).


Great, thanks for all the changes and for a very useful library.
