Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unions on multiple bitmaps at a time #58

Closed
wants to merge 10 commits into from

Conversation

Kerollmops
Copy link
Member

@Kerollmops Kerollmops commented Jun 3, 2020

I would like to introduce methods to do operations on multiple bitmaps at a time, this can greatly improve performances. This PR only introduce the union operation.

In this PR I introduce the RoaringBitmap::union_of method which take a Iterator of Bitmaps and modify returns a new bitmap. Internally it uses a BinaryHeap to always find the lowest containers of the operand bitmaps.

I am not sure of the function signature, the other operations are named union_with and take a mutable self ref. The function signature I use here would be invalid for difference and symmetric difference, as the operations must be done on a base bitmap. I changed it to become an in-place operation. And I change it again to be kind of a constructor.

Related to #57.

@Kerollmops Kerollmops force-pushed the set-multi-operation branch from 65e4a9f to 713eb1a Compare June 3, 2020 10:00
@Kerollmops Kerollmops force-pushed the set-multi-operation branch from 713eb1a to 60138bb Compare June 3, 2020 10:08
@Kerollmops
Copy link
Member Author

Kerollmops commented Jun 3, 2020

I just added some benchmarks and the results are not so great, probably because bitmaps lengths are not that big.

union_with_multi        time:   [2.1961 us 2.2020 us 2.2085 us]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

union_with_multi by hand
                        time:   [1.6332 us 1.6413 us 1.6547 us]
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

@Kerollmops
Copy link
Member Author

Kerollmops commented Jun 3, 2020

Even with more bitmaps the banchmarks results are not good at all.

union_with_multi        time:   [206.96 us 207.68 us 208.40 us]
                        change: [-0.4573% -0.0425% +0.3423%] (p = 0.84 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

union_with_multi by hand
                        time:   [60.290 us 60.506 us 60.778 us]
                        change: [-8.8042% -6.5906% -4.5629%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe

In my understanding the advantage we have with the union_with_multi method is that it will only do unions on containers that are involved, skipping containers that are alone and do not interfere with others. Do I need to design tests for this behavior?


EDIT I read again what you said in your comment and yeah, I need to extract the stores from the containers and don't do the operation directly on the containers (because it calls ensure_correct_store after each operation, I believe?).

@Nemo157
Copy link
Collaborator

Nemo157 commented Jun 3, 2020

I think a signature something like this would make sense:

impl RoaringBitmap {
  pub fn union_of(bitmaps: impl IntoIterator<Item = &RoaringBitmap>) -> Self
}

there's no reuse of existing storage so having a self instance isn't really necessary.

I think there's likely a way to make Muple more efficient, I'll have to take a look at it in a bit more detail.

@Kerollmops
Copy link
Member Author

Kerollmops commented Jun 3, 2020

I updated the algorithm to do the operations directly on the Stores and not the containers, benchmarks results didn't really changed so I decided to check performances in detail with fleamegraph.

Obviously performances are impacted by the Vec returned by the Muple Iterator implementation. If we use the standard iterator we can't return a view into an internal buffer, because the Item is not bounded to the appropriate lifetime. So one solution could be to use a custom next method that can return a view into an internal buffer.

Capture d’écran 2020-06-03 à 15 34 42

@Kerollmops
Copy link
Member Author

Kerollmops commented Jun 3, 2020

Ok, so I achieve better performances by using a custom next method and reusing the Vec at every iteration.

union_of                time:   [103.60 us 104.40 us 105.30 us]
                        change: [-3.7481% -2.6579% -1.5749%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) high mild
  12 (12.00%) high severe

union_of by hand        time:   [84.114 us 84.403 us 84.660 us]
                        change: [+0.4501% +0.8097% +1.1816%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Capture d’écran 2020-06-03 à 15 59 36

@Kerollmops Kerollmops force-pushed the set-multi-operation branch from 630525b to 7bab470 Compare June 3, 2020 13:58
@lemire
Copy link
Member

lemire commented Sep 10, 2020

I recommend referring back to the Java approach which has been emulated in C and Go...

https://github.com/RoaringBitmap/RoaringBitmap/blob/master/RoaringBitmap/src/main/java/org/roaringbitmap/FastAggregation.java#L433

The code is like so...

    RoaringBitmap answer = new RoaringBitmap();
    for (int k = 0; k < bitmaps.length; ++k) {
      answer.naivelazyor(bitmaps[k]);
    }
    answer.repairAfterLazy();
    return answer;

In turn naivelazyor calls lazyIOR on the containers when doing intersections.

What this does, roughly, is that...

  1. When computing the union of two arrays, it will eagerly switch to creating a bitmap container, even if that would not be normally warranted.

  2. When doing computations over bitmap containers, it will not track the cardinality.

Then "repairAfterLazy" goes through the code and checks the bitmap containers and possibly convert them back to arrays.

The intuition is as follows... the union between two bitmap containers, or between an array container and a bitmap container, or even between two array container when the output is a bitmap container, can be done efficiently (CPU wise). And if you have many Roaring bitmaps to begin with, you are likely to end up with bitmap containers in the end, so let us jump straight ahead and create them early in the process.

Empirically, this works well quite often.

@@ -129,6 +129,31 @@ fn union_with(c: &mut Criterion) {
});
}

fn union_of(c: &mut Criterion) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it measures the time it takes to aggregate a few deterministic tiny sets... Tiny sets in a branchy setting lead to incorrect benchmarks because of branch prediction... see https://www.infoq.com/articles/making-code-faster-taming-branches/

I suggest using realistic data sets. See for example https://github.com/RoaringBitmap/RoaringBitmap/tree/master/real-roaring-dataset/src/main/resources/real-roaring-dataset

@saik0 saik0 mentioned this pull request Jan 12, 2022
3 tasks
@bors bors bot closed this in 2828be5 Aug 30, 2022
not-jan pushed a commit to not-jan/roaring-rs that referenced this pull request Aug 31, 2022
223: Implements multioperation for the bitmaps and tree maps r=Kerollmops a=irevoire

Fixes RoaringBitmap#57, closes RoaringBitmap#58, closes RoaringBitmap#109, closes RoaringBitmap#139, and closes RoaringBitmap#219.

There is a lot of performance improvement, but here is a before / after on the operations that were the faster currently (when we can do assign between owned bitmaps).

## And
```
group                                                    after                                  before
-----                                                    -----                                  ------
Successive And/Multi And Owned/census-income             1.00     14.6±0.25µs        ? ?/sec    15.42   224.9±0.76µs        ? ?/sec
Successive And/Multi And Owned/census-income_srt         1.00     14.2±0.25µs        ? ?/sec    3.98     56.4±8.22µs        ? ?/sec
Successive And/Multi And Owned/census1881                1.00     20.7±0.33µs        ? ?/sec    37.18   770.1±1.62µs        ? ?/sec
Successive And/Multi And Owned/census1881_srt            1.00     25.8±1.29µs        ? ?/sec    1.12     28.8±0.09µs        ? ?/sec
Successive And/Multi And Owned/weather_sept_85           1.00     60.7±2.48µs        ? ?/sec    2.15    130.2±2.96µs        ? ?/sec
Successive And/Multi And Owned/weather_sept_85_srt       1.00     48.3±2.21µs        ? ?/sec    2.32    112.2±1.07µs        ? ?/sec
Successive And/Multi And Owned/wikileaks-noquotes        1.00     24.4±0.50µs        ? ?/sec    2.73     66.6±0.27µs        ? ?/sec
Successive And/Multi And Owned/wikileaks-noquotes_srt    1.00     20.3±0.58µs        ? ?/sec    1.09     22.0±0.30µs        ? ?/sec
```

## Or
```
group                                                    after                                  before
-----                                                    -----                                  ------
Successive Or/Multi Or Owned/census-income               1.00    629.3±4.46µs        ? ?/sec    2.29  1441.4±41.36µs        ? ?/sec
Successive Or/Multi Or Owned/census-income_srt           1.00    582.5±1.81µs        ? ?/sec    1.61    937.8±4.03µs        ? ?/sec
Successive Or/Multi Or Owned/census1881                  1.00   1143.4±4.55µs        ? ?/sec    3.48      4.0±0.07ms        ? ?/sec
Successive Or/Multi Or Owned/census1881_srt              1.00    743.4±4.40µs        ? ?/sec    3.49      2.6±0.02ms        ? ?/sec
Successive Or/Multi Or Owned/weather_sept_85             1.00      2.9±0.02ms        ? ?/sec    1.06      3.1±0.01ms        ? ?/sec
Successive Or/Multi Or Owned/weather_sept_85_srt         1.00   1344.5±7.80µs        ? ?/sec    1.06  1426.5±38.08µs        ? ?/sec
Successive Or/Multi Or Owned/wikileaks-noquotes          1.00    476.3±4.43µs        ? ?/sec    5.27      2.5±0.01ms        ? ?/sec
Successive Or/Multi Or Owned/wikileaks-noquotes_srt      1.00    259.4±3.90µs        ? ?/sec    7.17   1860.0±3.30µs        ? ?/sec
```

Co-authored-by: saik0 <github@saik0.net>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <irevoire@protonmail.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants