-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Conversation
Add initial floodsub implementation
WIP: feat/pubsub \o/
8a51eb3
to
3b365f7
Compare
Things are starting to look good, right now there is only an issue with the stress tests, apparently js-pubsub is too slow for 10K messages under 80seconds (crypto is hard)
|
Hey @haadcode: Great day today, and things are looking good...so here are some of the remaining items left to pick-off for tonight's handoff if you're interested / have time...
@diasdavid and @haadcode - thanks for the help and let's get this to the finish line! |
@gavinmcdermott @diasdavid I just fixed all(-1) the tests. They're now running except one! The one that is currently failing is the 10k load test (I'm so happy I added that!). See the changes to the tests here ipfs-inactive/interface-js-ipfs-core@501c304 They weren't running properly when run all together (other tests would trigger So the 10k one that is failing:
This is not the reason. The reason is that the receiving stream doesn't start processing the messages until the sender has sent all 10k. Ie. the 10k messages are first buffered to one stream and only after all of them have been buffered will the receiving stream start getting 'data' events. This takes too much time and the test fails (and fails the rest of them too). If you set the number of messages to send to 2k, the test will pass, so I'm very happy the arbitrary 10k was correct :) The expected result is: messages are sent and received in parallel. You can verify the correct behaviour by looking at the sent/received numbers in the console while running the test. (This is exactly why I added that test because the same thing was happening in js-ipfs-api) I tried to fix it in libp2p-floodsub where there was a TODO for the 2 PassThrough streams to be converted to pull-pushable. I believe fixing the streams here (or elsewhere in floodsub) will result in correct behaviour and a passing test. I'm sure you will have an idea how to fix this. Another small note was: faa6e33#diff-dcea902c16515396959a8b7e3b1ff40eR120. This part (the filtering) should be moved to floodsub, and perhaps floodsub should just implement @diasdavid you added https://github.com/ipfs/interface-ipfs-core/pull/101/files#diff-b84db3db8b011618f99fed4c49d338c6R143 and I was wondering why that is? Are the peers not connected when the callback is fired? If so, we should fix this in wherever it can be fixed so that we don't need that hack (because if we need in tests, the users will need it in their programs). I'm gonna leave this to you now to pick up again. Great work so far everyone! We can land this! :) My plan next is to start trying it out with Orbit and see what is needed. @diasdavid how should the peer discovery work in the browser? Will the peers (that you normally get) just get picked up for pubsub automatically or is there a need for connecting peers manually? |
Ok, tried it with Orbit. The good news is: it works! \o/ 🎉 The bad news is that as a whole, it's not quite working. What I'm observing is that while some peers can send messages through to others, not all of them can. The version I'm using is https://ipfs.io/ipfs/Qmf2AKhyKodLq2nEq6g4rgPV5SGAvAYqKhRfJxRt5syzEW/ (should work for you too). See the videos below (they're sequential). That's what's happening. The last picture is the swarm.peers at the end of the second gif. Messages from 2nd and 3rd peer are not going through to first peer: |
This is unfortunately not the right solution as it, it will kill all the subscriptions silently if one of the subscribers decides to cancel. @gavinmcdermott had listed the right solution as a TODO item:
Awesome catch, strange though because the stream is being 'drained' as expected https://github.com/libp2p/js-libp2p-floodsub/blob/master/src/mount-floodsub.js#L43
This is a nuance because dialing streams in js-ipfs is non blocking, full context here: https://github.com/libp2p/js-libp2p-swarm/issues/78. TL;DR is: Once this -- https://github.com/libp2p/js-libp2p-floodsub/blob/master/src/dial-floodsub.js#L53-L55 -- gets implemented, no timeout will be needed. Thank you for the visualization! :D |
Ah, you're right, we need to cancel that one particular listener.
Not sure what is meant by this, but you go ahead and find/implement the right solution for this.
Awesome! |
fixed the
|
@diasdavid - I expect some free time this afternoon to knock out the rest of the items on the list. Either way, I'll keep you updated and push the latest! |
I've all the tests passing locally \o/, now waiting for CI. |
557476c
to
4495b3a
Compare
bubbling all the deps to reflect the changes in multistream-select. Apparently CI npm install has different rules of what is a patch update (??!!?) |
Updated all the dependencies, CI should be happy now (tests continue to pass locally at least) |
beb5cf1
to
eb085b0
Compare
last mile is all about unsupported
//cc @haadcode will fix asap |
@diasdavid what did you do about the browser tests? |
eb085b0
to
3082c2f
Compare
pushed fix for #644 (comment) |
The tests were misleading, however after fixing them and more testing, I learned that the implementation works as expected in Node.js 6 and 7, but the message reception clogs at certain point in Node.js 4. Digging through the rabbit hole. |
Ok, just found the issue, the way that floodsub is now set up, makes it so that there is no guarantee that messages written in order are delivered in order correctly, this is both due:
Adding tests to cover this bugs in floodsub while the fixes are underway 🌟 And this is why we have tests in 3 Node.js versions, 2 CI systems and try them in two different browsers. |
d647803
to
0208504
Compare
EVERYTHING IS GREEEN \o/ 💚🍀✅🥝 |
Thank you all, this was a long and enduring process, really happy with all the developments, bug fixes, collaborative API design and all the testing that was done for this API ❤️❤️❤️❤️ |
Continues #610