implemented SparseIntSet #533

louisponet · 2019-09-28T21:22:29Z

I've implemented the SparseIntSet data structure as was talked about in #532.
I based the tests and functionality on those of IntSet, but I wasn't sure how to do symdiff, and I'm also not sure if it even makes sense in this case.

I also added some documentation.

Let me know if there are any things that need to be improved!

Cheers

codecov · 2019-09-28T22:22:29Z

Codecov Report

Merging #533 into master will increase coverage by 0.45%.
The diff coverage is 96.84%.

@@            Coverage Diff             @@
##           master     #533      +/-   ##
==========================================
+ Coverage   87.36%   87.81%   +0.45%     
==========================================
  Files          31       32       +1     
  Lines        1978     2076      +98     
==========================================
+ Hits         1728     1823      +95     
- Misses        250      253       +3

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/sparse_int_set.jl | `96.84% <96.84%> (ø)` | |
| src/queue.jl | `100% <0%> (ø)` | ⬆️ |
| src/stack.jl | `100% <0%> (ø)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 46f87f1...e002e38. Read the comment docs.

oxinabox

Very exciting

Looks pretty good.
I've just done a surface-level review.
I need to take another pass at it and look at the logic after these changes are in.

src/sparse_int_set.jl

src/DataStructures.jl

docs/src/sparse_int_set.md

oxinabox

I have now finished reviewing the code as it is.
I will review the tests later

src/sparse_int_set.jl

louisponet · 2019-10-01T20:11:42Z

Okay, I think I implemented all your suggestions and made it mutable again. I also cleaned up cleanup!, adding an elseif any(iszero, s.counters) so that the cleanup only actually happens when needed.

oxinabox · 2019-10-01T20:16:57Z

src/sparse_int_set.jl

+    if pageid > length(s.reverse)
+        diff = pageid - length(s.reverse)
+
+        resize!(s.reverse, pageid - 1)


Probably better not to do that? push! will resize for you.
And it has heuristics to better handle the resizing

Not sure I understand what you mean. Do you mean to move assure! into push!? I guess that's ok, since I don't think it's used anywhere else.

no, I mean why are we doing resize! then push!?

Before, this was to generate the undefs, but now indeed I'm just using push! after sizehint!. I don't know if that's optimal.

it is, basically.
Marginally better is resize! + assignment to index,
but you miss out on the heuristics of how push! will do extra resizing under the hood when it has to grow (I think sizehint! actually might also block those, since it will grow early),
and that extra resizing under the hood gives a speedup between separate calls.

did a quick benchmark with doing resize! followed by assignment to index, difference is basically swallowed by noise on the benchmark it seems.
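For reference, the two growth strategies compared in this thread can be sketched as below. This is a minimal illustration, not the PR's code: `grow_with_push!` and `grow_with_resize!` are hypothetical helper names, and the pages are plain empty vectors.

```julia
# Hypothetical helpers for illustration; neither name comes from the PR.

# Grow `pages` to `n` entries by pushing one fresh page at a time.
# push! lets the runtime decide how much spare capacity to allocate.
function grow_with_push!(pages::Vector{Vector{Int}}, n::Int)
    while length(pages) < n
        push!(pages, Int[])
    end
    return pages
end

# Grow `pages` to `n` entries with a single resize!, then assign the new slots.
function grow_with_resize!(pages::Vector{Vector{Int}}, n::Int)
    old = length(pages)
    n > old || return pages
    resize!(pages, n)          # the new slots are unassigned (#undef) at this point
    for i in old+1:n
        pages[i] = Int[]       # assignment to index fills each new slot
    end
    return pages
end
```

Both helpers end with the same contents, which is consistent with the observation that any difference is a matter of allocation behaviour rather than results.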

src/sparse_int_set.jl

oxinabox · 2019-10-01T21:03:29Z

I've been thinking about this a bit.
I think cleanup should replace the vectors it wants to remove with Vector{Int}(), a zero-length vector. We can even define just one of those: call it const NULL_PAGE = Vector{Int}().
Then we have assure! check === NULL_PAGE instead of checking isassigned.

If we make a function _pop_and_pageid that does a pop and returns both the popped value
and the pageid it came from,
then cleanup! would not have to search the whole array to decide what to clean;
it would just check if that counter is zero,
and if so would replace that page with the NULL_PAGE.
And if we have made it that cheap,
then maybe we don't need to have pop! and dirty_pop! and cleanup!,
since we can just do that check of whether the counter is equal to zero
at every removal;
occasionally it will be, but that operation will be cheap as well.

We would never get to resize reverse down, but I think that is ok; generally we can't do that very often anyway, since it would need everything removed to come from the far end.

What do you think?
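A rough sketch of that proposal, under stated assumptions: `PagedSet`, `PAGE_SIZE`, `pageid_offset`, and the field layout are hypothetical simplifications rather than the PR's actual SparseIntSet, and the per-removal counter check is folded directly into `pop!` instead of a separate `_pop_and_pageid`.

```julia
# Hypothetical, simplified stand-in for the PR's SparseIntSet.
const PAGE_SIZE = 4               # assumed tiny page size, for illustration only
const NULL_PAGE = Vector{Int}()   # one shared zero-length sentinel page

mutable struct PagedSet
    packed::Vector{Int}            # densely packed stored values
    reverse::Vector{Vector{Int}}   # pages mapping a value to its index in `packed` (0 = absent)
    counters::Vector{Int}          # number of stored values per page
end
PagedSet() = PagedSet(Int[], Vector{Int}[], Int[])

# Page number and position within the page for a positive value `i`.
pageid_offset(i::Int) = (div(i - 1, PAGE_SIZE) + 1, mod(i - 1, PAGE_SIZE) + 1)

# Make sure page `pageid` exists and is a real page, not the sentinel.
function assure!(s::PagedSet, pageid::Int)
    while length(s.reverse) < pageid
        push!(s.reverse, NULL_PAGE)
        push!(s.counters, 0)
    end
    if s.reverse[pageid] === NULL_PAGE
        s.reverse[pageid] = zeros(Int, PAGE_SIZE)
    end
    return s
end

function Base.in(i::Int, s::PagedSet)
    pageid, offset = pageid_offset(i)
    pageid > length(s.reverse) && return false
    page = s.reverse[pageid]
    return page !== NULL_PAGE && page[offset] != 0
end

function Base.push!(s::PagedSet, i::Int)
    i > 0 || throw(DomainError(i, "only positive Ints are supported"))
    i in s && return s
    pageid, offset = pageid_offset(i)
    assure!(s, pageid)
    push!(s.packed, i)
    s.reverse[pageid][offset] = length(s.packed)
    s.counters[pageid] += 1
    return s
end

function Base.pop!(s::PagedSet, i::Int)
    i in s || throw(KeyError(i))
    pageid, offset = pageid_offset(i)
    page = s.reverse[pageid]
    idx = page[offset]
    moved = pop!(s.packed)
    if moved != i
        # Move the last packed value into the freed slot and update its page entry.
        s.packed[idx] = moved
        moved_page, moved_offset = pageid_offset(moved)
        s.reverse[moved_page][moved_offset] = idx
    end
    page[offset] = 0
    s.counters[pageid] -= 1
    # The cheap check done at every removal: drop the page once it is empty.
    if s.counters[pageid] == 0
        s.reverse[pageid] = NULL_PAGE
    end
    return i
end
```

In this sketch `reverse` only ever grows, matching the trade-off described above: emptied pages are swapped for the shared sentinel, but the outer vector is not shrunk.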

louisponet · 2019-10-01T22:15:37Z

That sounds reasonable to me, I find the whole undef thing a little fuzzy to begin with. I guess the difference in memory usage for undef vs a Vector{Int}() is not huge anyway. I'll have a go at that

oxinabox · 2019-10-01T22:43:18Z

> I guess the difference in memory usage for undef vs a Vector{Int}() is not huge anyway

It should be nothing at all (barring the single allocation for the global constant NULL_PAGE),
because at the lowest level undef should be a pointer to 0x0000_000...,
and a pointer to a constant will be some other pointer of the same size.
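A small check of that claim, as a sketch: the variable names are illustrative, and `sizeof` here only measures the outer vector's element storage, not the pages themselves.

```julia
# A local stand-in for the shared sentinel page discussed above.
null_page = Vector{Int}()

# Three unassigned slots versus three slots that all reference the same sentinel.
unassigned = Vector{Vector{Int}}(undef, 3)
shared = fill(null_page, 3)

# Unassigned slots report false; sentinel slots are identical to the sentinel.
slot_is_assigned = isassigned(unassigned, 1)
all_shared = all(p -> p === null_page, shared)

# Both outer vectors store one reference per slot, so they take the same space.
same_size = sizeof(unassigned) == sizeof(shared)
```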

docs/src/sparse_int_set.md

src/sparse_int_set.jl

test/test_sparse_int_set.jl

louisponet · 2019-10-02T11:13:15Z

About the iteration through zips (my main use case, like in the benchmark): is there any way to speed this up? I guess not really, because it's mainly the random access pattern taking up most of the time rather than the logic itself?

I also ran into a strange inference issue on id and tids when I didn't have the id_tids function, and just did it inside the iterate function. I don't get how inference could fail since literally everything is Ints.

oxinabox · 2019-10-02T11:31:51Z

I will look into it in 20 hours time

src/sparse_int_set.jl

oxinabox · 2019-10-03T10:36:11Z

I think with the last few comments resolved we will be able to merge this.
Also, can you bump the version number in the Project.toml
so I can tag a release straight after merging?

https://white.ucc.asn.au/2019/09/28/Continuous-Delivery-For-Julia-Packages.html

Project.toml

oxinabox · 2019-10-03T13:26:50Z

src/sparse_int_set.jl

+        return nothing
+    end
+    id, tids = id_tids(it, state)
+    il = length(it)


If this was done at the start of the function it could also be used in the first if.

Also, why il and not something more descriptive, like it_len or even iterator_length?

yes, sorry, I was too fast in doing this, my bad
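The suggestion can be illustrated with a toy iterator. This is only a sketch: `EvenEntries` is a hypothetical type, not the PR's zip iterator, and it shows nothing more than computing the length once under a descriptive name and reusing it in both the early-exit check and the loop.

```julia
# Hypothetical iterator for illustration: yields the even entries of a vector.
struct EvenEntries
    data::Vector{Int}
end

Base.IteratorSize(::Type{EvenEntries}) = Base.SizeUnknown()
Base.eltype(::Type{EvenEntries}) = Int

function Base.iterate(it::EvenEntries, state::Int = 1)
    data_length = length(it.data)           # computed once, at the top
    state > data_length && return nothing   # reused in the first check
    while state <= data_length              # and again in the loop
        value = it.data[state]
        state += 1
        iseven(value) && return (value, state)
    end
    return nothing
end
```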

louisponet · 2019-10-04T10:04:56Z

Great! thanks for all your help with this!

oxinabox · 2019-10-04T10:27:30Z

no problem, thanks for your contribution😀

louisponet added 2 commits September 28, 2019 23:18

implemented SparseIntSet

9efb335

added tests for coveralls, removed unnecessary added to README

c38959c

oxinabox requested changes Sep 29, 2019

louisponet added 6 commits September 29, 2019 11:54

implemented comments, not immutable yet

4fb0485

implemented SparseIntSet

4050d46

added tests for coveralls, removed unnecessary added to README

070b4f1

implemented comments, not immutable yet

e228fad

Merge remote-tracking branch 'louisponet/SparseIntSet' into SparseIntSet

61bac31

added SparseIntSet benchmarks

6f8555e

oxinabox reviewed Sep 29, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Sep 29, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Sep 29, 2019

src/DataStructures.jl Outdated

oxinabox reviewed Sep 29, 2019

docs/src/sparse_int_set.md Outdated

louisponet added 3 commits September 29, 2019 14:35

code cleanup, assure! comment, removed current_id

8254a23

fixed test

23ddc6c

made SparseIntSet immutable

f248297

milesfrain mentioned this pull request Sep 29, 2019

Split up the package? #310

Open

louisponet added 2 commits September 30, 2019 22:47

added less worst case bench

88fa2d8

Added auto cleanup! on vanilla pop!, dirty_pop! is without cleanup.

77963a8

oxinabox requested changes Oct 1, 2019

louisponet and others added 3 commits October 1, 2019 21:41

Apply suggestions from code review

aec600d

Co-Authored-By: Lyndon White <oxinabox@ucc.asn.au>

mutable + cleanup!

e9377eb

only do cleanup when there is actually a zero counter

43e436b

oxinabox reviewed Oct 1, 2019

src/sparse_int_set.jl

oxinabox reviewed Oct 2, 2019

docs/src/sparse_int_set.md Outdated

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

test/test_sparse_int_set.jl

oxinabox reviewed Oct 2, 2019

test/test_sparse_int_set.jl Outdated

oxinabox reviewed Oct 2, 2019

test/test_sparse_int_set.jl Outdated

louisponet and others added 2 commits October 2, 2019 12:08

Apply suggestions from code review

29aaefa

Co-Authored-By: Lyndon White <oxinabox@ucc.asn.au>

corrected copy, in, code cleanup, removed complement

9770b81

oxinabox reviewed Oct 3, 2019

src/sparse_int_set.jl Outdated

cleaned up imports

37e8af7

oxinabox reviewed Oct 3, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 3, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 3, 2019

src/sparse_int_set.jl Outdated

oxinabox reviewed Oct 3, 2019

src/sparse_int_set.jl Outdated

immutable zip iterator, semver bump, removed entity_id

3b55ec9

oxinabox reviewed Oct 3, 2019

Project.toml Outdated

oxinabox reviewed Oct 3, 2019

louisponet and others added 2 commits October 4, 2019 11:21

Update Project.toml

74ecdbf

Co-Authored-By: Lyndon White <oxinabox@ucc.asn.au>

length better length in iterator

e002e38

oxinabox approved these changes Oct 4, 2019

oxinabox merged commit 0c70c9c into JuliaCollections:master Oct 4, 2019
