-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Temperature check on possible memory leak in Holistic JS solution #1937
Comments
I see this. Will check it out. |
I have gone through our testing on Holistic and I see no memory leaks on the basic solution. However, I may still be able to help. WASM requires that any class created must subsequently be deleted to avoid memory leaks. Can you check to see if it's possible that you are creating multiple copies of Solution without also calling |
Thanks for the quick reply @mhays-google! Your comment helped me find out that I was indeed instantiating multiple copies of Holistic in one of my flows. I'm not sure yet if this is also the root cause for the flow that originally led to my question though, but will continue digging and circle back. Just wanted to respond sooner than later to thank you for the useful advice! Also just for my curiosity: not sure if this is expected since I don't have visibility into the |
Hello again, @mhays-google After making the above fix and doing some more profiling, it seems like there's still some seemingly unbounded memory allocation happening. Here are some memory snapshots from my application: The culprit is always this When I check the retainers that are keeping this Interestingly, I'm able to observe the same memory issue and retainers for this demo: https://mediapipe.dev/demo/holistic. Potentially notable similarities between my project and that demo: they both use the CDN method of import and therefore the Tangentially(?), I've since also tried the following to address this and neither have helped:
Even more interestingly, the issue also seems to be present in https://mediapipe.dev/demo/holistic_remote/, albeit to a less extreme degree(?). Noteworthy: the implementation for this demo seems to differ from the previous based on what I can tell from the heap snapshot and the lack of Do you have any more clues / thoughts as to why this might be happening? Sorry for the block of text, just trying to give you as much information as I can about what I've tried so far 😅 |
A few things:
When you set a reference to Finally, it may be the case that heap memory grows but does not shrink. If it does, we are not taking advantage of this (the same happens often in Java applications where they get the opportunity to grow, but never compact their own memory to allow shrinking it again). Ok, with that out of the way:
Can you check if any of these are the case? Thank you! |
Your comment has been useful to understand more about whats going on @mhays-google. I have a memory leak and have been trying to remove it for the past few days (the same issue with
Isn't that done in C++, so shouldn't the mediapipe library be deleting these? Im not sure why deleting any variables in JS are going to affect this.
Is this not the definition of a memory leak? Isn't this the problem all along? Could you clarify what you mean. If the mediapipe library keeps growing the heap with more arrays over time, I don't see how I can solve this in my application.
Where can you infer that PS: I also wish #1956 was not closed, because the issue was not addressed. @tyrmullen assumed it was one thing and closed it immediately. Though it is very related to this so I will live in this one now 😇. In that issue, I used the face mesh library which also has the same issue, but I have since moved to holistic too. I can confirm both libraries have this issue. PS2: This is probably a good time to say if the js solutions were open source we'd have a much quicker feedback loop, as I can probably go digging in there to find potential issues. Perhaps you could give a brief overview of how it works, I'd love to contribute to the libraries, its quite interesting stuff 😃. |
Even if the memory being leaked is in C++ code, that doesn't necessarily mean that the MediaPipe library can fix this. As mentioned by mhays-google, one key problem is that JS has no finalizers, which means it's unfortunately impossible for C++ code to know when garbage is collected. The workaround for this is usually to force users to call some sort of .close() function as soon as resources are ok to be freed, on any JS object which is backed by C++ memory. Also, the heap memory growing and never shrinking is not necessarily a memory leak, as long as there's a cap on the growth. For example, if the C++ code starts out needing 10MB of heap space, but needs 20MB when a hand is visible, the heap size may leap to 20MB and then stay there, even if no hand is visible, so it technically only needs 10MB. This can be expected behavior rather than a leak, since the next time you need 20MB, no more memory is allocated (and in fact, you usually don't want to be freeing and reallocating large chunks of memory on the fly, for performance reasons). However, for your specific issue, I'm not actually convinced that the leak is even in the C++ memory... I agree with your reading that there's only one C++ heap shown there, but 27MB doesn't seem too unreasonable pre-garbage-collection for a C++ heap size; is that particular value increasing steadily over time in your measurements? Furthermore, when I run the smallest Holistic demo (without CodePen UI) here, and then just look at Task Manager (right clicking on the headers there so I can enable "JavaScript memory", it looks like the "live" JavaScript memory never increases beyond a certain point (which would indicate a leak there). However, the overall memory does appear to be increasing over time, even on this isolated MediaPipe demo. So this makes me wonder where that other >100MB in your snapshots (which seems to be growing continuously) is coming from-- perhaps there are some DOM elements which are being kept alive in JS/TS code (or console.print logs) accidentally? This will definitely require additional investigation. On an unrelated note, I actually closed the other issue not as solved, but rather to deduplicate, precisely because it sounded like effectively the same problem as here, so it seemed more useful/efficient to have everything live on the same thread moving forwards (to "break the tie" I usually just choose the first issue reported, and close the others). As for open-sourcing, this is a subject of much ongoing discussion and work. More information can be found on issues 1408, 1741, and for some lower-level BUILD details 877. |
Thanks for your interesting reply 🙃
I don't think we're expected to call
Isn't the other 100MB the same 100MB we're talking about, they are ArrayBuffer's backed by PS: Okay I understand why it was closed, but I replied there and I was afraid everything was going to waste in that issue. |
No, you shouldn't need to call .close() on every frame. I was just explaining why the solution might not always be in C++ even if the apparent leak is in C++ memory; but aside from the Solution itself, I don't recall any of the intermediate results needing .close(). In general, anything which is a native JS data type (or a structure/dict/array of those) will not require a .close() call-- it would only be for something our API returns which keeps its functionality in C++ (so interacting with it would generally be through function calls-- like the Solution). As for memory layout, the memory in C++ is basically a special siloed-off ArrayBuffer. JS can peek into this and even manipulate it, but to the C++ code, this is the only memory it knows. And I suspect you're right about the memory profiling differences-- Performance shows JS memory, while Memory dump shows everything. In fact, if you check out Task Manager (with "JavaScript memory" as mentioned above), then I'd expect the "memory used" readings would grow and shrink with Memory dump values (but will likely be larger, since this covers stuff even the Memory dumps don't), while the "JavaScript memory" readings would correspond with Performance values. And FWIW it looks like C++ memory might not count as "JavaScript memory", although I'm not quite sure where the distinction is drawn. From your screenshot above, I see two different large chunks of memory, one which is ~27.5MB (@49035), and one which is positively massive (almost 1GB! (@419095)). The expanded trace shows that the wasm Module (@87191) is causing ~27.7MB to be retained, 27.5MB of which I'm assuming would be from that smaller chunk of memory (@49035). But I believe the wasm Module retains the entire native C++ heap, so it seems the C++ memory isn't growing beyond 27-ish MB. So the question to ask now is "where is that ~1GB coming from?". When I repeat your experiment with the various solutions, I only see a memory increase over time for "Holistic" and "FaceMesh", which makes me think this is facemesh-specific (so I'll look into FaceMesh alone next, to cut out some of the extra noise). Curiously, though, my memory dump details look a little different than yours. For me, the increase appears to clearly occur in the wasm Module's C++ memory area, indicating a leak there. TL;DR: I'm not sure why we're getting different readouts, but at least for me, I seem to see a C++ memory leak somehow related to FaceMesh-- we don't do anything special for this module that I can think of, though, so I'm not yet sure what could cause this. |
That's interesting, I haven't done a lot of WASM stuff but I'd be interested to help out.
I was trying to investigate that, but the codepen is broken, have the library/ model files have been removed?
|
Whoops... give me a moment. I staged a new holistic demo and the models got really big. Also, thank you everyone for the patience. I did find a memory leak. We return landmarks but as an array, but we don't delete the array. I'm going to be eating crow here while I fix this. I have uploaded a version of holistic that won't 403. I'll have the memory leak fix up later today. |
Thanks @mhays-google! No need to eat crow haha, I'm just thankful that you and your team have made this model available at all! |
Was the fix included in yesterday's release I'm also curious why there are so many digits in the version string 👀 |
@mhays-google Looks like the holistic JS solution is still broken (version 0.3.1620694839):
|
Hi @sgowroji, we're still facing this issue, could you please re-open it? The fix was not released or confirmed by @mhays-google. The last we heard was 2 weeks ago he was going to release a fix, but unfortunately we haven't heard anything yet.
Maybe you're mistaken. @codylieu was just saying "thanks" for working on Mediapipe/ this model, its a great library! |
@sgowroji thanks for the fast response -- I added the notes here b/c the version referenced by michael earlier in the thread meant to address the memory leak also seemed to break the holistic js model altogether. Given that it's been about 2 weeks since the latest release and the fact that production is broken, I figured I'd add the stack trace information if that might help in debugging. |
The fundamental issue here is that the underlying memory management system is quite brittle and makes debugging leaks rather tricky. This is definitely something we'd like to fix (and we have several ideas for doing so), but for most Solution APIs this isn't as bad a problem, since the data being brought from C++ back to JS is so very small (I don't think this issue is face_mesh- or holistic-specific; instead I believe that the face mesh data just happens to be the only data large enough to produce a noticeable memory leak within a reasonable amount of time). That said, because face mesh data is so large, this results in MB/s being leaked, which is quite serious. So while a memory management system restructuring will likely not occur for a while yet, I will attempt to track down and submit a patch in the next day or two to at least solve this particularly problematic case of face mesh data leakage when transferring from C++ to JS. |
I've submitted a patch-- the next JS Solutions update to "face_mesh" and "holistic" should no longer have this issue. Please confirm or let me know if that's not the case when the next release occurs. |
@tyrmullen Thank you so much for the update, Hi @afogel & @codylieu, Please test the above in the next release and reach us if you still face the similar issue. Thank you! |
@sgowroji I just checked the npm repository, but haven't seen another release yet (The latest release I see is v. 0.3.1620694839, published a month ago)? Do you have a sense of when it will be released? Thanks! |
@sgowroji @tyrmullen Same question as @afogel. Still not seeing any new releases on: https://www.npmjs.com/package/@mediapipe/holistic. Any idea when a version with @tyrmullen's fix might be released? |
It should be there now in version 0.4.1625245411. Sorry for the delay and confusion. |
@chuoling @sgowroji I'm not able to check to see whether the fix is working because I'm having the same issue as before on version 0.3.1620694839, though the error appears to occur in the
|
@afogel I think we should open a separate issue for the error you're running into, since it seems unrelated to the memory leak, and we have other ways to verify the leak-fix (so it should not be blocking validation). For example, the facemesh demo mediapipe.dev/demo/face_mesh should no longer have this severe memory leak. |
done! #2267 |
I know this issue is closed, but I am struggling with memory leaks as described above. I am using the face mesh demo run on local server (no code pen). I changed nothing about the code. My versions are as follows:
I observed significant increases of JS heap size over time. I sampled every minute for 10 minutes and observed these increases.
I also observed this increase monitoring heap size over the course of 40 seconds. Any suggestions on how to resolve this? |
@tyrmullen the memory leak originally described by @codylieu and now by @crecord seems to exist again as well in pose@0.5.1635988162 cc: @mhays-google |
Hmm... I've run code.mediapipe.dev/codepen/face_mesh in full screen mode
and in the editor. I let it go for 10 minutes. Not only was it not
leaking, but it was also hovering around 10 megabytes.
We definitely had leaks in the past (months ago), but we've addressed
this. Is it possible that you have a bad version downloaded?
Specifically, in your constructor, are you maybe feeding a bad prefix to
`locateFile`?
|
Unfortunately, I can verify that both of those demo scenarios leak on CodePen as well (current facemesh CodePen with no changes, and pose with version bumped to @0.5.1635988162). Profiling makes it seem that there may even be two distinct leaks: one in C++/Wasm memory ( @mhays-google 10MB sounds very low for the facemesh demo (even when first starting up); maybe double-check to see if your memory profiler was accidentally set to profile the CodePen JVM and not the MP demo (the demo JVM should be the larger of the two)? Also, you might not see most of the leaks unless you actually take heap snapshots, since the live JS heap report just on the memory profiler tab seems to ignore things like the C++/Wasm memory for some reason. |
@tyrmullen and @mhays-google any update on this? |
@tyrmullen @mhays-google |
@tyrmullen and @mhays-google any update on this? thanks! |
I haven't yet had time to try to track this down, but I've at least isolated the regression. Version 0.4.1625245753 had my initial large C++ leak fix, and no C++ memory appears to be leaked up through version 0.4.1628005669. However, versions 0.4.1629159166 and now 0.4.1633559619 (most recent version as of this posting) both have a large C++ memory leak again. For now, unless you need the new features, I'd recommend using one of those older versions as a workaround. |
Is there any update? We met the same problem of momory leaking with facemesh in our web application, yet we do need the iris result so we can't work around with 0.4.1628005669. Is it possible to fix the problem? |
Hello! I'm using the Holistic web solution in a WebRTC streaming application.
We've occasionally seen Holistic processing crash in a way that seems indicative of a memory leak:
I'm currently unsure about whether the Holistic processing crash is the cause or a symptom of another issue. I've done some memory profiling, but haven't found a reliable way of reproducing it yet.
Since I don't have too much visibility into the Mediapipe JS internals, I was just hoping to get a temperature check on whether the Mediapipe team thinks:
Basically just trying to determine where to invest additional debugging / investigation efforts.
FWIW I also came across this thread https://bugs.chromium.org/p/chromium/issues/detail?id=1174675 which indicates there could be some memory leak issues in Chromium that can affect use cases like WebRTC, but the rate of leakage described there seems too slow compared to what I'm perceiving in my application.
Thanks in advance for your help! Sorry that I'm not able to provide any more details besides that single stack trace. Please let me know if you need any additional information and I'd be happy to circle back with it.
The text was updated successfully, but these errors were encountered: