Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greedy bottom up alternative based on egraph analysis algorithm #8

Closed

Conversation

Bastacyclop
Copy link
Contributor

Hi All,

This PR provides an alternative greedy bottom up extraction implementation -- BottomUpAnalysisExtractor -- based on the egraph analysis algorithm that uses a pending queue. This is basically the algorithm that I use in my https://github.com/Bastacyclop/egg-sketches library.

Based on #7

This is the (filtered) output of make:

Loaded 675 jsons.
---- output/babble -- bottom-up results:
dag         mean: 202.1272
micros      mean: 2821.5780
dag    quantiles:    31.00, 124.50, 199.00, 266.50, 476.00
micros quantiles: 91.00, 1218.50, 2385.00, 3782.50, 14182.00
---- output/babble -- bottom-up-analysis results:
dag         mean: 202.1561
micros      mean: 6412.6012
dag    quantiles:    31.00, 124.50, 199.00, 266.50, 476.00
micros quantiles: 177.00, 2555.50, 5514.00, 8845.50, 28623.00
---- output/egg -- bottom-up results:
dag         mean: 3.2857
micros      mean: 6676.3929
dag    quantiles:    1.00, 1.00, 3.00, 5.00, 13.00
micros quantiles: 12.00, 26.25, 54.00, 507.75, 146008.00
---- output/egg -- bottom-up-analysis results:
dag         mean: 3.2857
micros      mean: 12616.2500
dag    quantiles:    1.00, 1.00, 3.00, 5.00, 13.00
micros quantiles: 17.00, 33.00, 77.00, 681.25, 263692.00
---- output/flexc -- bottom-up results:
dag         mean: 85.0000
micros      mean: 63664.1429
dag    quantiles:    35.00, 37.00, 91.50, 137.00, 137.00
micros quantiles: 27256.00, 28729.00, 59139.00, 99794.25, 109424.00
---- output/flexc -- bottom-up-analysis results:
dag         mean: 84.6429
micros      mean: 35691.5000
dag    quantiles:    35.00, 36.00, 91.50, 137.00, 137.00
micros quantiles: 20051.00, 21609.50, 40668.50, 44200.25, 68863.00
---- output/tensat -- bottom-up results:
dag         mean: 5.9862
micros      mean: 797192.6000
dag    quantiles:    0.82, 0.94, 4.41, 8.35, 18.81
micros quantiles: 848.00, 22166.75, 189955.50, 1953350.75, 2280621.00
---- output/tensat -- bottom-up-analysis results:
dag         mean: 5.9760
micros      mean: 644572.2000
dag    quantiles:    0.82, 0.94, 4.41, 8.34, 18.76
micros quantiles: 551.00, 13178.25, 124706.00, 1232582.25, 2452544.00

Overall we can see that the more complex bottom-up-analysis is slower on the easiest datasets 'babble' and 'egg', but also faster on the harder datasets 'flexc' and 'tensat'. I expect bottom-up-analysis to perform even better on datasets with more costly to compute, child-dependent cost functions, although extraction-gym does not support such datasets yet.

The dag costs also vary slightly due to different nodes being chosen -- I assume that the tree costs are the same but could double check.

@oflatt
Copy link
Member

oflatt commented Dec 13, 2023

How does this compare to #20? Are they the same idea?

@Bastacyclop
Copy link
Contributor Author

Hi @oflatt, they are the same idea and the code is almost the same as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants