Firefox - crashes with excessive thread activity #950
Comments
I've seen this has started happening in Orbit, too (see orbitdb-archive/orbit-web#9 (comment) and note that the bug itself is not related). "That said, I've noticed Orbit has started randomly crashing (both Chrome and FF) and on Firefox it just kills the browser completely (hangs). This is seemingly random and I haven't been able to debug it yet." This is indeed the same behaviour as described above by @mitra42. Important to note here is that Orbit's code (deployed at orbit.chat) hasn't changed since April and as far as I can tell, only the browsers have gotten updated. So that makes me think that a) we always had this problem or b) it's something shared between the browsers (eg. webrtc). Or both. There's an open bug for Firefox at https://bugzilla.mozilla.org/show_bug.cgi?id=1389812 which also has some interesting details re. webrtc usage, perhaps related. |
Thanks for reporting. This is mostly due to WebRTC being hungry for machine resources and the fact that so many operations are happening on the main thread. Solutions for this problem are:
|
Another thing that is especially annoying is Chrome's aggressive resource throttling and the lack of a way to disable it (other than having things on a Service Worker). |
OK - this sounds pretty bad ... If I understand you.
And moving to a WebWorker seems like a massive re-architecting of the upper layers to get around this bug in lower layers, and I'm not sure it would even work since the problem is that the WebRTC (or how IIIF/IPFS is using it) causes it to consume huge amounts of resources. I find on Chrome our demos crash within about 5 minutes, just sitting there with no UI or interaction with IPFS other than having started it. |
@mitra42 seems that my points were missing some context. To clarify:
Connection Closing is a feature that any P2P network requires. It is currently a bottleneck in both go-ipfs and js-ipfs, and we've been actively working towards a scalable solution. We have multiple ideas and are discussing how to implement them. You just happen to notice it faster in js-ipfs because a) the browser has far fewer resources than a go-ipfs node running on a desktop and b) WebRTC is super resource intensive; try any video chat app that uses it nowadays and your laptop fans will start flying :) You can implement an application-level connection closing policy and close connections that no longer interest you.
Yes, I proposed that, and the reason is that it would be a way to introduce the nodes that are interested in the same content without having to connect to every other node in the network that is being booted for other apps. It is a good policy to connect nodes that are interested in a topic faster (e.g. orbit with orbit nodes, IIIF with IIIF nodes, etc). Content Addressing will still work.
WebRTC is resource intensive and browsers do not like that. You could actually turn on the WebRTC transport and disable the WebRTC-Star discovery that connects all the nodes automatically, enabling you to pick the ones you are using. We are developing Relay, which will enable browser nodes to connect to any other node in the network through a WebSockets connection, vastly reducing resource consumption.
Separating the View layer from the Transport Layer shouldn't come as a foreign idea; that already happens for all the other transports in the browser. In fact, we are already working on a browser integration that would put IPFS in a background process so that you don't even have to load it in your app, and one of the other solutions is exposing IPFS through an extension. There is a lot of work ahead of us, but progress is moving steadily and today we are able to run a full P2P protocol in the browser; things will just get better moving forward :) @mitra42 if you and your team would like to work more closely with us, either by creating tests, identifying performance benchmarks and so on, please let us know. We appreciate help from our open source contributors and I'm happy to give you pointers on how to improve certain things :) |
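As a rough illustration of the application-level connection closing policy mentioned above, here is a minimal TypeScript sketch. It is not taken from js-ipfs: the `PeerInfo` shape, the `interested` flag, the `selectPeersToClose` function and the two cap parameters are all hypothetical, and detecting a WebRTC connection by looking for `p2p-webrtc-star` in the multiaddr string is an assumption. The function only decides which connections to close, giving WebRTC its own (presumably stricter) cap and letting peers the app cares about claim slots first.

```typescript
// Hypothetical shape for a connected peer; this is not a js-ipfs type.
interface PeerInfo {
  addr: string;        // multiaddr of the connection, as a string
  interested: boolean; // app-defined: does this peer hold content we care about?
}

// Assumption: WebRTC-Star connections can be recognised by this multiaddr segment.
const isWebRTC = (addr: string): boolean => addr.includes("p2p-webrtc-star");

// Returns the addresses of connections to close so that at most `maxWebRTC`
// WebRTC connections and `maxOther` connections of other kinds stay open.
function selectPeersToClose(
  peers: PeerInfo[],
  maxWebRTC: number,
  maxOther: number
): string[] {
  const toClose: string[] = [];
  let webrtcKept = 0;
  let otherKept = 0;

  // Visit interesting peers first so they take the available slots.
  const ordered = [...peers].sort(
    (a, b) => Number(b.interested) - Number(a.interested)
  );

  for (const peer of ordered) {
    if (isWebRTC(peer.addr)) {
      if (webrtcKept < maxWebRTC) webrtcKept++;
      else toClose.push(peer.addr);
    } else {
      if (otherKept < maxOther) otherKept++;
      else toClose.push(peer.addr);
    }
  }
  return toClose;
}
```

In an app, the peer list would presumably come from something like `ipfs.swarm.peers()` and each returned address would be handed to `ipfs.swarm.disconnect()`; both should be checked against the js-ipfs version in use, and how a peer gets marked as `interested` is left to the application.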
How can we do this at the app level? We aren't opening and closing connections at the app level; this is happening entirely inside IPFS library code. We can run our own server, but as I said there will still be a scaling problem when the number of nodes on any app grows large enough to bring WebRTC to its knees.
Maybe I'm misunderstanding something, but my assumption was that if we have the app connect to a separate webrtc server and store a block, and someone on some other app refers to the same block elsewhere (by its hash), they will still get it? Re your point on webrtc, I want to be clear - we aren't explicitly using WebRTC from choice and certainly aren't explicitly connecting to other nodes since anything will be running in the browser. I was told (by Matt) that the best way to do what we needed was to use IIIF and the IIIF examples used WebRTC (I'm pretty sure that's where the "WebRTC" link came from).
Agreed, and they are separate in our framework; it's the Transport Layer that is causing the problems! It's still a massive job to rearchitect something to use WebWorkers. @diasdavid - I'd be happy to jump on a call to figure out how to make this move more smoothly, and to be of more help. |
You can disable
The number of open WebRTC Connections needs to be under control more strictly than other types of connections. Relay will help this too because we will be able to multiplex multiple connections to multiple peers over one socket.
And that was a good suggestion. The IIIF team enabled shared annotations over IPFS using IIIF-DB, which was designed to fit the needs of the project. I believe @flyingzumwalt was suggesting to use IIIF-DB as an inspiration and a learning tool for how to create a CRDT-powered DB over IPFS, not as an off-the-shelf solution that solves all your problems. Bear in mind that IIIF-DB is still under active development. You can read more about CRDTs and the work being actively developed at https://blog.ipfs.io/30-js-ipfs-crdts.md. As we discussed over email some months ago, it would be excellent if you could share with us your goal for the application and the architecture that you are building. I feel that there is some shoehorning happening here and that it is creating some obstacles. Happy to chat more next week :) |
Yes - our goal was, and still is, to build on top of IPFS, however pretty much everything we tried didn't function or didn't function as documented, or wasn't available on JS-IPFS (e.g. IPNS) or depended on something else which didn't work. That's how we ended up with IIIF, because something else (I think Y-connector) wasn't working as documented. Our docs are very much a work in progress, so they are still on Google Docs, but they don't really cover the key primitive we are trying to get IPFS to do - which works, except that it crashes because of this WRTP issue. Let's set up a call for next week - I have a lot of flexibility. |
It's unclear from those links what the implication of disabling discovery is. If we disable discovery as suggested above, will the pubsub feature still work, and will a resource stored at one node be viewable at another? |
@mitra42 PubSub will still work. You can learn more about what Discovery is and how things get plugged in from the libp2p tutorials at https://github.com/libp2p/js-libp2p/tree/master/examples
I'll review your Google Doc over the weekend and ping you to chat next week :) |
Is this still an issue in latest master (without WebRTC)? |
It seems to be happening a lot slower now, e.g. 30 mins to crash. To be clear ... we are loading with config options
Which looks to me like it's still webrtc for IPFS, and presumably also for YJS, but I think that is what you gave Kyle on Friday (to replace |
There is more perf fixing incoming :) See dignifiedquire/pull-block#2 by @Beanow. The other thing we are working specifically for your use-case, @mitra42, is libp2p/js-libp2p#122. Once that lands, I'll update you on how to configure your node and things should improve dramatically :) |
Just a flag that we are still seeing random crashes and overloads on Chrome that may be related to this bug. It's hard to tell, as the most common case is coming back to a web browser and finding the page has crashed. (Note there is no activity in the page during time away from it, it's just IPFS running) |
@mitra42 that's what I would expect as well. The performance improvements in pull-block will be most noticeable when adding files to IPFS. For example ipfs-inactive/js-ipfs-unixfs-engine#187 (comment) Note, you may need to check if your NPM gave you pull-block You mention this happens when idling. So this is probably still WebRTC related. |
Yes - NPM updated to pull-block 1.2.1. Am I understanding you correctly, that if it's crashing when idle then the change to pull-block won't affect it, i.e. we can expect idle pages that do essentially nothing other than load IPFS to crash? |
Is there any progress on this - I'm surprised to see a new version of IPFS appearing when with this bug it's essentially unusable in JavaScript? |
@mitra42 In the meantime pull-block has had another performance update, now on Can you confirm if you have the issue using a websockets transport? |
@Beanow, we aren't explicitly using a transport; we are connecting to IPFS via the only version of the config options we've been given that works. I think it came from David via Kyle.
Note ... we are using yjs with the ipfs connector, but this error occurs even before we connect to Y and when our app has done nothing but start IPFS. |
Try https://dweb.me/examples/example_block.html as a repeatable example - mean time to crash the browser tab is about 10 minutes of doing nothing. If you want to repeat it, |
I've done some testing to run both your and my own example with The required code changes are already in progress here. libp2p/js-libp2p#122 |
Yes - and that would be ok if there was at least one configuration that worked in the browser, but currently there doesn't seem to be ANY, so there is nothing for people to build apps on while waiting for IPFS to come up with improvements. It surprises me that any new JS releases are being pushed when there is no working browser version. It's not like that example does anything complex, all it does is start IPFS, sit around, and crash! |
@mitra42 I appreciate the feedback and believe me, I do feel the pain of seeing a browser run out of memory. It is important to note that there are multiple people working really hard towards shipping a solution that will mitigate that issue and also bring lots of performance improvements to js-ipfs. Today we have identified the problem (WebRTC is an unstoppable memory hog), we fixed some other performance issues (i.e. pull-block, browserify-aes and Stream API) and we do have two proposals for solving the WebRTC issue: a) Implement ConnManager or b) Offer You can get involved in the solutions as they get implemented or you can be patient and provide support. For example, having examples is definitely very useful. |
Oh wow! I've been having this problem too and I thought I was not correctly setting up @diasdavid could you help me out in that idea? I don't seem to find any documentation to configure my own I have my own node running ( Still I see no activity in the signaling server logs. I am using https://github.com/libp2p/js-libp2p-webrtc-star. |
@AquiGorka let's open a new issue to help get you set up. Nevertheless, did you happen to see these notes? https://github.com/libp2p/js-libp2p-webrtc-star#rendezvous-server-aka-signalling-server. js-ipfs performance and stability is now better, I was successful at running js-ipfs for a while (well over 30 mins) and upload more than 750Mb with #1086 Once that PR is merged, the last mile is: #962 Closing this issue, let's track the development of the fixes in the links above. |
@diasdavid - was there a reason to close this before the problem is actually solved? It's not clear from either #962 or #1086 that they are about fixing the problem of Firefox crashing, and as @AquiGorka shows there are other people hitting this issue who are not going to realise that either #962 or #1086 is addressing it. Nor, until they work, is it clear that either of those fixes will solve it. |
@mitra42 as mentioned in the comment above and also in my other answer at #988 (comment) + the note from the experiment @Beanow ran at #950 (comment), we know that:
|
@diasdavid What config are you using for the working version? Note - we don't choose to use WebRTC, this was recommended by you and Kyle. If there is a recommended better alternative than ...
that works in browsers please let us know. |
@diasdavid unfortunately for most applications not using WebRTC isn't an option, as you lose pubsub support and there's no alternative (websocket-star / circuit-relay) ready for an official release yet. |
As not using WebRTC does not solve the issue, I've opened #1088 to track possible solutions. |
Thank you @Beanow! That's perfect |
Yep. I set up the signal-server accordingly. Issue for configuration advice/help: #1092 |
License: MIT Signed-off-by: Nitin Patel <nitinpatel278@gmail.com>
Type: Bug
Severity: Severe
Description:
The most serious bug - that makes the system unusable - is the extreme thread usage on Firefox. Firefox normally runs with about 60 threads (according to Mac Activity Monitor).
As soon as I open a page which starts IPFS/IIIF it adds about another 40 threads, BUT the Idle Wake Ups start hitting the tens of thousands (three orders of magnitude more than anything else on the box). Load Average (as reported by "top" or "uptime") starts rising and quickly exceeds 100, by which time the machine is running too slow to do much more than kill Firefox.
Is anyone else running js-ipfs in the browser, and are you seeing this, especially with IIIF?
Note - on Chrome the behavior is different - there aren't as many wake ups, but CPU load grows (slower) until it hits a point where Chrome gives the "Aw Snap! Something went wrong while displaying this webpage." error.
Steps to reproduce the error: