Rewrite _create_code() with Structural Pattern Matching (limited to tuples) #496
Conversation
This is a nice cleanup of the logic, and makes it more maintainable. Have you checked what kind of impact it has on speed? Also, as noted in the comments, I'm not convinced that `_shims` is the right place for the `match` class. Please address the questions in the comments above.
I'm glad you asked about performance, because my first version of the function using pattern matching was performing really poorly: the first 10+ arguments almost always matched (they are common to almost all patterns and, contrary to the built-in implementation, mine doesn't do anything clever to optimize these cases). I stepped back a bit on the degree of abstraction; will push in a couple of minutes.

Edit: removed unused features from the `match` class.

Here is a summary of a microbenchmark of unpickling: Structural Pattern Matching time divided by procedural time (the current implementation). What I don't get is why the overhead seems to depend on the function size in CPython.

[Table: ratio of time overheads.]
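A ratio like the one reported above can be reproduced with `timeit`; this is only a sketch with stand-in callables (`old_impl` and `new_impl` are hypothetical placeholders, not dill's real functions):

```python
import timeit

# Stand-ins for the two implementations being compared; the real benchmark
# times dill's unpickling path, not these toy functions.
def old_impl(args):
    return len(args)

def new_impl(args):
    return sum(1 for _ in args)

args = tuple(range(16))
t_old = timeit.timeit(lambda: old_impl(args), number=50_000)
t_new = timeit.timeit(lambda: new_impl(args), number=50_000)
print(f"overhead ratio: {t_new / t_old:.2f}")
```

Dividing the two timings gives the "new over current" ratio used in the tables in this thread.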
(force-pushed: fc84eb7 to a8ad4e4)
Can you profile to see if the speed hit is localized to something, and potentially try to mitigate it a bit? I'm a bit focused on speed here because we aren't gaining any new features... only making maintenance easier. I like this PR, and hopefully we can squeeze a bit more speed out of it.
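Profiling of the kind being asked for can be done with `cProfile`; the workload below is a hypothetical stand-in, not dill's actual unpickling path:

```python
import cProfile
import io
import pstats

# Stand-in workload; in the real investigation this would be a loop of
# dill.loads() calls over pickled functions.
def workload():
    data = [tuple(range(16)) for _ in range(1000)]
    return [len(t) for t in data]

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the five functions with the highest cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Sorting by cumulative time makes it easy to see whether the overhead is localized in one helper or spread across the whole function.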
I actually did some experiments trying to optimize it, with few gains, but the code was heading more and more towards the old implementation, with lots of special-cased `if`s.

Edit: the overhead is not concentrated in the later dict operations, see comment below. Also, the final version I reached is a compromise between clarity of the logic and the really minor efficiency that can still be squeezed out of this.
I'm playing with …
New benchmark! I did achieve some extra performance squeezing. Changes: …

And here are the time results (with an empty code object):

[Table: time ratio over the current implementation.]

So this is a mean overhead of 53% over the supported CPython versions (against the previous 124%), which is diluted down to 29% for a small function (78% previously). For better maintainability and a smaller risk of bugs (Python 3.12a was just released on Friday 🙄), I think it's a good compromise.
To me, this is still too much of a speed hit for something that will be called very regularly and is purely for maintainability. If there's no way to squeeze out more performance, I'd suggest we close this. I really appreciate the effort here, and I'd rather see you get the speed down further than close it... I'd expect that if you used the native `match` statement in 3.10+, then you might see some additional performance for 3.10+. Let's say that we don't want to significantly impact the speed of 3.8+; so if that means things like rearranging some `if` statements for incremental gains, then do it.
I understand. I'll see if I can get any more microseconds out of this. The middle column in my last comment is a variation using the …
Ah... I meant using the Python `match` statement.
Edit: put Python 3.11a as the last case, as it's the least likely.

Look, if we don't check the code members' types, the average overhead among supported CPython versions goes down to a minimum of 10% for an empty code object (against 53% with type checks) and 6% for a small function (was 29% with type checks). It's also apparently faster than the current implementation in PyPy.

[Table: ratio between new implementation time and current implementation time.]

The only difference in behavior would be if the code object changes again to a format with the same number of members as a previous version but a different composition. I think this version is faster in general than the native pattern matching (didn't test this time), as there are no variable/keyword bindings in the commonest case. Anyway, if in the future it is necessary to distinguish between code objects of the same size but different member types, the framework to do it is there.
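The type-check-free variant can be sketched as a dispatch on member count alone; the sizes below are illustrative stand-ins, not the exact `CodeType` constructor signatures:

```python
# Dispatch purely on the number of members, which is what makes this
# variant cheap: no isinstance() calls per member. A tuple with the right
# size but wrong member types would only fail later, when the real code
# object is constructed.
def dispatch_by_size(members):
    n = len(members)
    if n == 16:    # hypothetical "3.8-style" layout size
        return "layout-16"
    if n == 18:    # hypothetical "3.11-style" layout size
        return "layout-18"
    raise ValueError(f"unsupported code object with {n} members")

print(dispatch_by_size(tuple(range(16))))  # layout-16
```

The trade-off is exactly the one described above: two layouts of equal size but different composition would be indistinguishable at this stage.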
This is a proposal to simplify the `_create_code()` logic using Structural Pattern Matching. Sadly, it's a Python 3.10+ only feature, but we need just a limited subset of its functionality, which can be implemented with a simple class.

It seems to be working across Python versions: I get the same failures for the same version combinations as with the current version. However, I'm not sure the logic is 100% equivalent (mainly for `lnotab`).

Note: `match` and `case` are "soft keywords" and can be used freely outside a `match` statement.

Related to #488 and #495.
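As a rough illustration of that "limited subset implemented with a simple class" idea (hypothetical, not the PR's actual code), a tuple can be matched by length and member types; since `match` is a soft keyword, it is even a legal class name:

```python
# Minimal tuple-only pattern matcher: a pattern is a tuple of types, and it
# matches when the subject is a tuple of the same length whose members are
# instances of the corresponding types.
class match:
    """Match a tuple subject against patterns given as tuples of types."""

    def __init__(self, subject):
        self.subject = subject

    def case(self, *types):
        return (
            isinstance(self.subject, tuple)
            and len(self.subject) == len(types)
            and all(isinstance(v, t) for v, t in zip(self.subject, types))
        )

# Usage: distinguish hypothetical code-member layouts by shape.
m = match((1, 0, 2, b"code"))
print(m.case(int, int, int, bytes))       # True
print(m.case(int, int, int, int, bytes))  # False
```

Unlike the native `match` statement, this works on any supported Python version, at the cost of explicit `isinstance` checks per member.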