-
-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libgumbo should allow overriding the default memory functions #2490
Comments
@stevecheckoway commented at #2489 (comment) saying he's hesitant to do this. I may try to set up memory function configuration anyway (in a draft PR), just to see what it looks like. |
I honestly cannot recall if @craigbarnes or I removed that support from Gumbo. As I recall, it lead to passing a pointer to the parser struct to other parts of the code that didn't make sense like the tokenizer. |
Looks like The current behavior of allocation failure inside gumbo is to just abort. A better approach would be to return failure all the way back to the call to parse, cleaning up any allocated memory on the way. But that's a moderate sized engineering effort and I'm not sure the benefit outweighs the effort unless allocation failure becomes an actual problem. Given the raising an exception behavior of |
We for sure have problems with this already, see #1610 and #2096. In practice, an OOM condition for Ruby pretty much means the end of the process, anyway -- and so while we may be leaking memory in this situation, it's not clear whether cleaning it up would have any practical benefits in terms of recoverability. And so we end up with some long-running open issues that are lower priority. 🤷 I had a chat with @tenderlove about this a few months ago, so I'm tagging him in case he has any other comments. |
@stevecheckoway I removed it in craigbarnes/lua-gumbo@8d3d4e4. I didn't really give it much thought at the time—in the context of other projects pulling in my changes—I just removed it because lua-gumbo didn't need it and doing so seemed to improve code generation (perhaps only slightly; I don't recall the exact details). |
Thanks @craigbarnes! I thought it was part of the set of commits I grabbed from you, but I couldn't recall for sure. @flavorjones, I'm not totally opposed to doing this, but I think threading an allocator through the gumbo parsing code is probably not a great approach unless we really need that flexibility. One approach that may be reasonable would be to do something like
rather than defining Here's an example of the situation I was concerned about. That repo has a simple C extension that tries to allocate 200 MB of memory, 1 MB at a time. When run with a low I think there are a few alternative approaches we could consider, but if you're okay with just leaking memory in this unusual situation (at least for now), then that's okay with me. |
After chatting with @tenderlove about this, the primary advantage of using However, my understanding is that the memory allocated in libgumbo is freed once the libxml2 document is constructed (and the act of construction via methods like |
Also - tenderlove confirmed for me that |
Today, gumbo-parser/src/util.c uses malloc, realloc, and free directly.
It would be a nice feature to be able to override those memory functions the way libxml2's xmlMemSetup allows, so that we can make it use
ruby_xmalloc
and friends if we choose to.See #2480, #2481, #2489 for context.
The text was updated successfully, but these errors were encountered: