An extra bottom-up recursive extractor #9

TrevorHansen · 2023-09-08T07:35:57Z

Hi,

Before I saw #8 I made a a recursive extractor, too.

This code is much ugglier than #8, but is faster on some problems (for example acyclic graphs). On acyclic graphs it's able to extract in one pass through, without building the dependencies.

Taking the cumulative times for all three extractors on my machine:

Loaded 454 jsons.
...
Cumulative time for bottom-up: 3386ms
Cumulative time for bottom-up-analysis: 2483ms
Cumulative time for bottom-up-recursive: 977ms

bottom-up is the one currently in,
bottom-up-analysis is #8,
bottom-up-recursive is this PR.

Is there benefit in having all three, or should the two new ones be combined?

mwillsey · 2023-10-23T22:35:35Z

Good question! @Bastacyclop what do you think? Offhand, I'm leaning toward fewer, more distinct algorithms, so unless we can characterize that these have different advantages, I'd prefer to have just one.

Bastacyclop · 2023-10-25T09:48:30Z

Running the benchmarks with my plot.py version gives:

Loaded 675 jsons.
---- output/babble -- bottom-up results:
tree        mean: 383.4740
dag         mean: 202.1272
micros      mean: 4099.8324
tree   quantiles:    38.00, 201.00, 368.00, 531.00, 1138.00
dag    quantiles:    31.00, 124.50, 199.00, 266.50, 476.00
micros quantiles: 145.00, 1710.50, 3735.00, 5617.50, 16994.00
---- output/babble -- bottom-up-analysis results:
tree        mean: 383.4740
dag         mean: 202.1618
micros      mean: 9924.3410
tree   quantiles:    38.00, 201.00, 368.00, 531.00, 1138.00
dag    quantiles:    31.00, 124.50, 199.00, 266.50, 476.00
micros quantiles: 293.00, 3834.50, 9014.00, 13213.50, 34602.00
---- output/babble -- bottom-up-recursive results:
tree        mean: 383.4740
dag         mean: 202.1272
micros      mean: 6283.2775
tree   quantiles:    38.00, 201.00, 368.00, 531.00, 1138.00
dag    quantiles:    31.00, 124.50, 199.00, 266.50, 476.00
micros quantiles: 173.00, 2303.50, 5119.00, 8275.50, 25213.00

---- output/egg -- bottom-up results:
tree        mean: 3.7143
dag         mean: 3.2857
micros      mean: 7137.1071
tree   quantiles:    1.00, 1.00, 3.00, 6.00, 13.00
dag    quantiles:    1.00, 1.00, 3.00, 5.00, 13.00
micros quantiles: 12.00, 20.50, 57.00, 463.25, 148309.00
---- output/egg -- bottom-up-analysis results:
tree        mean: 3.7143
dag         mean: 3.2857
micros      mean: 16172.1786
tree   quantiles:    1.00, 1.00, 3.00, 6.00, 13.00
dag    quantiles:    1.00, 1.00, 3.00, 5.00, 13.00
micros quantiles: 19.00, 50.25, 88.00, 861.75, 317778.00
---- output/egg -- bottom-up-recursive results:
tree        mean: 3.7143
dag         mean: 3.2857
micros      mean: 9666.6786
tree   quantiles:    1.00, 1.00, 3.00, 6.00, 13.00
dag    quantiles:    1.00, 1.00, 3.00, 5.00, 13.00
micros quantiles: 13.00, 29.00, 78.50, 693.75, 183555.00

---- output/flexc -- bottom-up results:
tree        mean: 322.0000
dag         mean: 85.0000
micros      mean: 50959.7857
tree   quantiles:    89.00, 92.00, 404.50, 546.00, 546.00
dag    quantiles:    35.00, 37.00, 91.50, 137.00, 137.00
micros quantiles: 30125.00, 33549.75, 54544.50, 58975.50, 102810.00
---- output/flexc -- bottom-up-analysis results:
tree        mean: 322.0000
dag         mean: 84.4286
micros      mean: 43439.8571
tree   quantiles:    89.00, 92.00, 404.50, 546.00, 546.00
dag    quantiles:    35.00, 36.00, 91.50, 136.00, 137.00
micros quantiles: 28394.00, 31866.75, 44750.00, 49550.25, 79446.00
---- output/flexc -- bottom-up-recursive results:
tree        mean: 322.0000
dag         mean: 85.0000
micros      mean: 36553.7857
tree   quantiles:    89.00, 92.00, 404.50, 546.00, 546.00
dag    quantiles:    35.00, 37.00, 91.50, 137.00, 137.00
micros quantiles: 26775.00, 30454.50, 35102.00, 38385.25, 73826.00

---- output/tensat -- bottom-up results:
tree        mean: 1858437671761.2795
dag         mean: 5.9862
micros      mean: 737891.6000
tree   quantiles:    2.69, 4.32, 966.23, 2320915813054.39, 9300713475693.02
dag    quantiles:    0.82, 0.94, 4.41, 8.35, 18.81
micros quantiles: 869.00, 22696.25, 230507.50, 1727811.50, 2385394.00
---- output/tensat -- bottom-up-analysis results:
tree        mean: 1858437671761.2795
dag         mean: 5.9793
micros      mean: 584151.0000
tree   quantiles:    2.69, 4.32, 966.23, 2320915813054.39, 9300713475693.02
dag    quantiles:    0.82, 0.94, 4.41, 8.34, 18.80
micros quantiles: 495.00, 18215.00, 143353.50, 1247708.75, 2311130.00
---- output/tensat -- bottom-up-recursive results:
tree        mean: 1858437671761.2795
dag         mean: 5.9916
micros      mean: 69755.5000
tree   quantiles:    2.69, 4.32, 966.23, 2320915813054.39, 9300713475693.02
dag    quantiles:    0.82, 0.94, 4.41, 8.36, 18.84
micros quantiles: 168.00, 10949.75, 54289.00, 69088.25, 349218.00

tree costs are basically the same, as expected.
bottom up recursive is faster than bottom up analysis for all datasets, in particular for tensat: is this only from skipping building dependencies or is there more going on?

I feel like we should be able to combine the three bottom up versions, with some more experiments and code cleanup. Maybe by (1) doing a first pass without building dependencies; and (2) if dependencies are used, doing a bottom up analysis based on unique queues. The question would be whether using unique queues for the second stage brings performance benefits or not (I think it should: egraphs-good/egg#239). To properly evaluate that I would like to see datasests with more costly to compute, child-dependent cost functions.

Bastacyclop · 2023-10-25T09:50:01Z

PS: one thing to consider is that in egg the dependencies don't need to be computed as they are already stored in the e-graph.

mwillsey · 2023-11-02T22:56:26Z

I would like to consider computation of dependencies (parents) as somewhat negligible, as its only linear and as @Bastacyclop says it's already there in many contexts.

I'd like to preserve the "dumb" bottom up extractor as a base case. Ideally, we could consolidate the "smarter" bottom-up extractors into one (i.e., those that do not aim to do cost sharing, as a possible definition). Thoughts on that?

oflatt · 2023-12-13T18:44:45Z

Before I read this closely I also worked on a bottom-up extractor (#20)
Sorry for duplicated work! I would also be happy with consolidating this, #20, and #8 if possible.

TrevorHansen · 2023-12-14T06:39:33Z

With the recent changes to the bottom-up extractor(#20), there's now only a small time advantage for the extractor that I proposed introducing (Note these times differ from before because extra problems have been added). Currently:
Cumulative time for faster-bottom-up: 2060ms [The one in #20]
Cumulative time for bottom-up-recursive: 1533ms [The one in this PR].
Cumulative time for bottom-up: 4471ms

Meaning that the extractor in this PR is only about 25% faster than the others, but is much uglier.

However, there are some tweaks to the "faster-bottom-up" extractor which brings down its runtime to almost the same as the extractor in this PR:

###################################################
faster-bottom-up vs faster-bottom-up-old


extractors: ['faster-bottom-up', 'faster-bottom-up-old']
cumulative time for faster-bottom-up: 1649ms
cumulative time for faster-bottom-up-old: 2060ms
cumulative tree cost for faster-bottom-up: 18584377265237
cumulative tree cost for faster-bottom-up-old: 18584377265237
cumulative dag cost for faster-bottom-up: 78037
cumulative dag cost for faster-bottom-up-old: 78037
Cumulative time for faster-bottom-up: 1649ms
Cumulative time for faster-bottom-up-old: 2060ms
faster-bottom-up / faster-bottom-up-old
geo mean
tree: 1.0000
dag: 1.0000
micros: 0.8184
quantiles
tree:   1.0000, 1.0000, 1.0000, 1.0000, 1.0000
dag:    1.0000, 1.0000, 1.0000, 1.0000, 1.0000
micros: 0.3611, 0.7825, 0.8227, 0.8611, 1.8333

So I've changed this PR to now just introduce some small speedups to the faster-bottom-up extractor, as well as fixing up attribution to @Bastacyclop.

oflatt

Great work! Thanks for cleaning this up and summarizing the results.

Bastacyclop · 2023-12-15T10:45:25Z

👍

TrevorHansen added 2 commits September 8, 2023 17:17

An extra bottom up recursive extractor

c068f40

fix spelling and dupe

74d698f

TrevorHansen force-pushed the om branch from 00be796 to 74d698f Compare September 8, 2023 13:57

TrevorHansen force-pushed the om branch from d33bd6a to 74d698f Compare December 14, 2023 04:10

TrevorHansen added 2 commits December 14, 2023 17:33

Merge remote-tracking branch 'main/main' into om

bbcf21e

Small speedups to faster-bottom-up extractor

6eaf8f8

oflatt approved these changes Dec 14, 2023

View reviewed changes

oflatt merged commit 8b33f88 into egraphs-good:main Dec 14, 2023
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

An extra bottom-up recursive extractor #9

An extra bottom-up recursive extractor #9

TrevorHansen commented Sep 8, 2023 •

edited

Loading

mwillsey commented Oct 23, 2023

Bastacyclop commented Oct 25, 2023

Bastacyclop commented Oct 25, 2023

mwillsey commented Nov 2, 2023

oflatt commented Dec 13, 2023

TrevorHansen commented Dec 14, 2023

oflatt left a comment

Bastacyclop commented Dec 15, 2023

An extra bottom-up recursive extractor #9

An extra bottom-up recursive extractor #9

Conversation

TrevorHansen commented Sep 8, 2023 • edited Loading

mwillsey commented Oct 23, 2023

Bastacyclop commented Oct 25, 2023

Bastacyclop commented Oct 25, 2023

mwillsey commented Nov 2, 2023

oflatt commented Dec 13, 2023

TrevorHansen commented Dec 14, 2023

oflatt left a comment

Choose a reason for hiding this comment

Bastacyclop commented Dec 15, 2023

TrevorHansen commented Sep 8, 2023 •

edited

Loading