-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add isdisjoint
#34427
Add isdisjoint
#34427
Conversation
function _isdisjoint(l, r) | ||
hasfastin(r) && return !any(in(r), l) | ||
hasfastin(l) && return !any(in(l), r) | ||
haslength(r) && length(r) < FASTIN_SET_THRESHOLD && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, in issubset
the conversion to Set
can be done only if haslength(r)
, here you made the opposite choice... Was it for a specific reason? (I didn't think much about it, but indeed converting to Set
when length is unknown seems like a better choice to me; what about updating issubset
then?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, converting to Set
in general seemed like a good idea to me, with the exception of a fast path when the thing we're iterating over happens to be short.
base/abstractset.jl
Outdated
return !any(in(r), l) | ||
return !any(in(Set(r)), l) | ||
end | ||
if haslength(l) && haslength(r) && lenght(r) < length(l) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes lenght
is not constant time (e.g. BitSet
), and in some cases it might be computed twice here. Not a big deal at all (I don't have any example where this is an actual problem (this requires haslengh
, !hasfastin
and non-fast length
)), but if this can be easily re-arranged to not compute the length twice, why not!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we generally assume that length
is constant time. Otherwise these checks are unlikely to be profitable. I don't think it's worth the extra complication. We're already doing a quadratic operation here.
test/sets.jl
Outdated
for S in (Set, BitSet, Vector) | ||
for (l,r) in ((S([1,2]), S([3,4])), | ||
(S([5,6,7,8]), S([7,8,9])), | ||
(S([1,2]), S([3,4])), | ||
(S([5,6,7,8]), S([7,8,9])), | ||
(S([1,2,3]), S()), | ||
(Set(), Set()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be S()
. Or does it not work properly with BitSet
and Vector
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, sorry
This was added in #13192, but the PR became stale. We've also changed a bit how we write these kinds of algorithms, so make use of the tools we have (e.g. `fastin`).
* add isdisjoint (JuliaLang/julia#34427) * fix formatting * Use correct version in README entry * Add missing link to README And sort/unify the entries using NEWS-update.jl Co-authored-by: colinxs <me@colinxsummers.com> Co-authored-by: Martin Holters <martin.holters@hsu-hh.de>
This was added in #13192, but the PR became stale. We've also changed
a bit how we write these kinds of algorithms, so make use of the tools
we have (e.g.
fastin
).