Improve Ord instances for IntSet and IntMap #470
Comments
If nothing else, we should have use lists with unpacked
data KeyList
  = LEnd
  | LCons {-# UNPACK #-} !Key KeyList
  deriving (Eq, Ord)
data KeyPairList a
  = PEnd
  | PCons {-# UNPACK #-} !Key a (KeyPairList a)
  deriving (Eq, Ord)
And then you have pre-inlined functions to turn an
Though what would probably be even better, at least for the
data KeyList
  = LEnd
  | LCons {-# UNPACK #-} !Prefix {-# UNPACK #-} !BitMap KeyList
  deriving (Eq)
instance Ord KeyList where
  compare LEnd LEnd = EQ
  compare LEnd _ = LT
  compare _ LEnd = GT  -- a non-empty list sorts after the empty one
  compare (LCons prea bita la) (LCons preb bitb lb)
    | prea < preb = LT
    | prea > preb = GT
    | otherwise = case compare (revNat bita) (revNat bitb) of
        EQ -> compare la lb
        c -> c
This way, there's only one entry per
EDIT: However, for those, the best bet may be just rewriting it so that the fold over the container constructs the desired string directly, such as:
instance Show IntSet where
  showsPrec p is = showParen (p > 10) $ \str -> showString "fromList [" $ either id id $ foldr showOne (Left (']':str)) is
    where
      showOne !i r = Right $ shows i $ either id ((:) ',') r
I'm pretty sure the definition for
instance Show a => Show (IntMap a) where
  showsPrec p im = showParen (p > 10) $ \s -> showString "fromList [" $ either id id $ foldrWithKey showOne (Left (']':s)) im
    where
      showOne !i a r = Right $ showChar '(' $ shows i $ showChar ',' $ shows a $ showChar ')' $ either id ((:) ',') r
This means that we don't have to worry about creating and then destroying intermediate lists, boxed |
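One way to sanity-check the fold-based rendering described in that comment is to write it as a standalone function and compare it with the existing `show`. This is only a sketch: `showsIntSet` is a hypothetical name, and it uses `Data.IntSet.foldr` from containers instead of touching the real instance.

```haskell
import qualified Data.IntSet as IS

-- | Hypothetical standalone version of the fold-based rendering.
-- The accumulator is 'Left' while nothing has been emitted yet (so no
-- comma is needed) and 'Right' once at least one element follows.
showsIntSet :: Int -> IS.IntSet -> ShowS
showsIntSet p is = showParen (p > 10) $ \rest ->
  showString "fromList [" (either id id (IS.foldr showOne (Left (']' : rest)) is))
  where
    showOne i r = Right (shows i (either id (',' :) r))
```

For example, `showsIntSet 0 (IS.fromList [3,1,2]) ""` should give the same string as `show (IS.fromList [3,1,2])`, without going through an intermediate list of keys.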
I'm not so sure about
instance Ord KeyList where
  compare LEnd LEnd = EQ
  compare LEnd _ = LT
  compare _ LEnd = GT  -- a non-empty list sorts after the empty one
  compare (LCons prea bita la) (LCons preb bitb lb)
    | prea < preb = LT
    | prea > preb = GT
    | lbm <- lowestBitMask (bita `xor` bitb)
    = compare (bita .&. lbm) (bitb .&. lbm) <> compare la lb
Still, it would be nice to use the |
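To make the "lowest differing bit" idea concrete, here is a minimal sketch for the simplest case only: two sets that each fit into a single 64-bit bitmap, where bit `i` set means `i` is a member. The names `compareBitmaps` and `bitsAsc` are hypothetical and not part of containers. The lowest bit where the two bitmaps differ belongs to exactly one set; that set is smaller if the other set still has a larger element, and larger if the other set is exhausted (and is therefore a proper prefix).

```haskell
import Data.Bits (finiteBitSize, testBit, xor, (.&.))
import Data.Word (Word64)

-- | Reference semantics: ascending list of the bit positions set in a bitmap.
bitsAsc :: Word64 -> [Int]
bitsAsc w = [i | i <- [0 .. finiteBitSize w - 1], testBit w i]

-- | Compare two bitmaps as if comparing their ascending element lists,
-- without building the lists. Covers the single-bitmap case only.
compareBitmaps :: Word64 -> Word64 -> Ordering
compareBitmaps a b
  | a == b         = EQ
  | a .&. low /= 0 = if b .&. above /= 0 then LT else GT
  | otherwise      = if a .&. above /= 0 then GT else LT
  where
    d     = a `xor` b
    low   = d .&. negate d  -- lowest bit where a and b differ
    above = negate low      -- that bit and every bit above it
```

A multi-entry list such as `KeyList` needs more care, because a bitmap that looks exhausted may still be followed by later entries.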
Benchmarks? Ideally, derived from actual use cases. Where do we need Ord on Sets? When we take a Set (or Map) of sets. I have something at https://gitlab.imn.htwk-leipzig.de/autotool/all0/blob/master/fa/src/Autolib/NFA/Det.hs but it needs to be extracted before being useful. |
stand-alone NFA determinisation in this branch: https://github.com/jwaldmann/containers/commits/instance-Ord-IntSet Example profile - see below. I think it shows that indeed, Data.IntSet.Internal.compare is exercised here.
But - it really is never a good idea to use
NB: I wonder how people implement the powerset construction (from NFA to DFA). Perhaps the answer is "they don't" (and go from regular expression to DFA directly). So, are there other applications where one needs an (optimized) |
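For readers unfamiliar with the use case, the powerset construction can be sketched roughly as follows. This is not the code from the linked branch; the `NFA` type, `step`, and `reachable` are hypothetical names chosen for illustration. The point is that DFA states are `IntSet`s stored in a `Set`, so every insert and lookup goes through `compare` on `IntSet`.

```haskell
import qualified Data.IntMap.Strict as IM
import qualified Data.IntSet as IS
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- | Hypothetical NFA representation: state -> symbol -> successor states.
type NFA = IM.IntMap (M.Map Char IS.IntSet)

-- | One DFA step: union of the successors of every NFA state in the set.
step :: NFA -> Char -> IS.IntSet -> IS.IntSet
step nfa c qs = IS.unions
  [ M.findWithDefault IS.empty c (IM.findWithDefault M.empty q nfa)
  | q <- IS.toList qs ]

-- | DFA states reachable from a start set. Each Set operation below
-- relies on the Ord instance for IntSet.
reachable :: NFA -> [Char] -> IS.IntSet -> S.Set IS.IntSet
reachable nfa alphabet start = go (S.singleton start) [start]
  where
    go seen [] = seen
    go seen (qs : todo) =
      let next  = [step nfa c qs | c <- alphabet]
          fresh = S.toList (S.fromList next `S.difference` seen)
      in go (foldr S.insert seen fresh) (fresh ++ todo)
```

With small automata, most of these sets are tiny, which is the "small trees" case where a cheaper `compare` matters most.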
I am trying this https://github.com/jwaldmann/containers/blob/instance-Ord-IntSet/containers-tests/benchmarks/OrdIntSet.hs#L76 It seems to be working for natural numbers, not for negative keys, since they break the assumption that in
But these cases should be easy to detect? I could use some ideas here (also for |
Yes! I get a nice speed-up (around 3), and much reduced allocation. |
Lovely! In principle, the negative weirdness should be limited to the
roots. If you're having trouble, I can try to help, but you likely know at
least as much about it as I do!
|
In my benchmarks (NFA -> DFA) I think the IntSet always fits into one |
I now have a version that I think is correct (by enumerative testing, it agrees with the original version, and for the NFA to DFA conversion, automata have expected size). Performance is roughly
- for small trees (Tip only): 30 percent runtime
- for large trees (each Tip is singleton): 70 percent runtime (both versions walk the tree in the same way but I don't do any allocation)
I am sure the code can be golfed and micro-optimised (using equivalences of bitwise operations). I don't have much experience with that. |
Can you put together a pull request? Or better, two: make the first one add your benchmark.
|
actually it's three parts
- implementation
- test for correctness (compare s t == compare (toAscList s) (toAscList t))
- benchmark
|
I didn't realize there wasn't a correctness test yet! Ouch!
|
Are these properties (that Eq and Ord of IntSet go via toAscList) announced anywhere? If not, that deserves a separate discussion? |
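The property under discussion can be written down as a small enumerative check. This is a sketch, not the test from the pull request: `agreesWithLists`, `smallSets`, and `checkAll` are hypothetical names, and the key range is an assumption chosen to include negative keys and keys on both sides of a 64-bit bitmap boundary.

```haskell
import qualified Data.IntSet as IS
import Data.List (subsequences)

-- | Eq and Ord on IntSet should agree with Eq and Ord on the ascending key lists.
agreesWithLists :: IS.IntSet -> IS.IntSet -> Bool
agreesWithLists s t =
  compare s t == compare (IS.toAscList s) (IS.toAscList t)
    && (s == t) == (IS.toAscList s == IS.toAscList t)

-- | All subsets of a small, hand-picked key range (an assumption).
smallSets :: [IS.IntSet]
smallSets = map IS.fromList (subsequences [-65, -1, 0, 1, 63, 64, 65])

-- | Check every pair of small sets.
checkAll :: Bool
checkAll = and [agreesWithLists s t | s <- smallSets, t <- smallSets]
```

A property-based test with randomly generated sets would cover more ground; the exhaustive version is just easy to reason about.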
Oh I see now that
but it's hidden deep down the file. I'd expected it to go at the top (following the order of the declarations: And it really should be |
* Test that instances for Eq and Ord agree with going via toAscList
* Add benchmark for "instance Ord IntSet", using "Set IntSet"
* Improve implementation of "instance Ord IntSet" that avoids toAscList and walks the tree directly. See #470
The current Ord instances for IntSet and IntMap don't take advantage of the way these types are structured at all. We may therefore be leaving some performance on the table. In particular: