Segmentation faults and memory corruption from using nokogiri with libxml-ruby #1426
Related libxml-ruby issue: xml4r/libxml-ruby#116 |
Trying right now to reproduce given your information above. |
Aha, I'm able to reproduce this. Thanks so much for putting in the time to isolate this with a test case. As you can probably see from some of the other tickets, reproduction isn't always easy. I'll try to diagnose root causes; but no promises. Will let you know what I find. |
Possibly related to #1270 |
valgrind says:
|
Alternatively, finds this:
|
and another, this time with Nokogiri 1.6.8.rc2:
|
Perhaps interesting is the fact that I can't reproduce this segfault with Ruby 1.9.3, but can easily reproduce with Ruby 2.0, 2.1, 2.2, and 2.3. |
Hmm. This definitely has to do with the fact that libxml-ruby registers a global callback, invoked on all nodes (even nokogiri's) when the node is GCed. Still trying to nail down the exact circumstances when memory gets clobbered. |
I want to repeat that above statement, in case anyone missed the importance of it: libxml-ruby registers a global callback. That is, libxml-ruby makes the assumption that no other users of libxml2 exist in a process, which is IMHO bad behavior. Historically, Nokogiri has tried to work around this behavior, with mixed success (dating back to as early as 2009 -- see #33). We're doing our best, even today, but I want to make it clear that we're attempting to hack around another library's bad behavior. I wonder if the libxml-ruby maintainers care about this? A quick peek at the commit log doesn't reveal any indications that they've ever tried to work with Nokogiri. But like I said, still looking at this with the hope of once again working around libxml-ruby's global callback behavior. |
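To make the global-callback problem above concrete, here is a minimal pure-Ruby sketch. It is a toy model under stated assumptions, not the real code: `FakeLibxml2`, `BindingA`, `BindingB`, `deregister_hook`, and `private_data` are all hypothetical names invented for illustration, and none of them exist in libxml2, libxml-ruby, or nokogiri. It only shows how a single process-wide hook installed by one binding ends up running on nodes that belong to another binding.

```ruby
# Toy model only. Every name here is hypothetical and exists purely
# to illustrate a process-wide callback shared by two bindings.
module FakeLibxml2
  class << self
    # One hook slot for the whole process: whoever sets it last wins,
    # and it fires for every node regardless of who created the node.
    attr_accessor :deregister_hook
  end

  Node = Struct.new(:name, :owner, :private_data)

  # Called when any node is freed; runs the global hook if one is set.
  def self.free_node(node)
    deregister_hook&.call(node)
  end
end

# Stands in for a binding that assumes it is the only user of the library.
module BindingA
  CLEARED = []

  def self.install!
    FakeLibxml2.deregister_hook = lambda do |node|
      # Assumes node.private_data always belongs to BindingA.
      CLEARED << node.name
      node.private_data = nil
    end
  end
end

# Stands in for a second binding that stores its own wrapper on the node.
module BindingB
  def self.new_node(name)
    FakeLibxml2::Node.new(name, :binding_b, { wrapper: "B wrapper for #{name}" })
  end
end

BindingA.install!
node = BindingB.new_node("b-node")
FakeLibxml2.free_node(node)

puts BindingA::CLEARED.inspect   # BindingA's hook ran on BindingB's node
puts node.private_data.inspect   # BindingB's wrapper reference is gone
```

In this model, a hook that first checked `node.owner` before touching `private_data` would leave the other binding's nodes alone. That is just one possible mitigation in the toy, not a description of what either project actually does.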
Glad you were able to reproduce the issue as well. Your steadfast positivity deserves a 🏆 in the face of recurring one-sided incompatibilities. Ideally, we wouldn't use libxml-ruby, but in pragmatic practice you can only control your transitive dependencies so much. |
related to #1426 which has a nice test case I'd like to keep as part of our suite.
@bbergstrom I'm going to cut a release candidate for you to try out. Will update when it's available, sometime tonight. |
v1.6.8.rc3 is up on rubygems.org. Let me know how you get on. |
Great! Trying it out today and will report back. |
We tried out 1.6.8.rc3 on our Ruby 2.2 Rails application for about 5 hours and have seen a 99% reduction in segfaults. :D The 2 that happened have segfault traces similar to the ones we were seeing before, but it is hard to say with any certainty whether they are at all related. I think you fixed the main cause of the issue we were experiencing. If there are any further issues, I will file new tickets for those. Thanks again for your quick work on this issue. |
Great to hear. If you can figure out a reproduction test case for the remaining segfaults,
|
I had the same issue, and it took a long journey to find a solution. It started with our Passenger workers jamming and not responding. After a lot of logging and testing, we discovered that Passenger jams while we are parsing XML, returns a 502 status code, and leaves a segmentation fault in the logs. We contacted the Passenger development team, and with their help we concluded that libxml-ruby was the issue. We rewrote our code to use only nokogiri, and now it works. Special thanks to the Passenger development team. Note: I wrote this to make it easier for other people with a similar issue to find this solution faster. Keywords: libxml-ruby, nokogiri, passenger worker jams, xml parsing |
We have had the same issue. Nokogiri 1.6.8.rc3 (Ruby 2.1.8) halved the number of segfaults. xml4r/libxml-ruby#118 has (finally) eradicated the segfaults. Keywords:
|
Nice work all. What's the global handler on the libxml side causing issues? Happy to look at removing it - or is that already done in xml4r/libxml-ruby#118? |
We have been experiencing memory corruption in our Rails application which depends on nokogiri (1.6.7.2) and libxml-ruby (2.8.0). This memory corruption manifested itself in seemingly random segmentation faults with stack traces to nearly every part of code in our application and its dependencies.
After upgrading or removing nearly every gem with a C extension, we were able to verify that the crashes went away when we removed a gem that depended on libxml-ruby. While investigating libxml-ruby segmentation faults, we came across the similar issues #895, #881, and #1364. That issue was patched some versions ago, but it appears that a similar issue still exists.
I am able to reproduce this on Amazon Linux (a RHEL/CentOS-based distro), but not on OS X, with this script, which crashes about 3/4 of the time it executes.
Here is a sample of the stack traces that result.
We hope to eventually remove our dependency on libxml-ruby as we use nokogiri in our codebase, but a required dependency currently forces libxml-ruby into our project as well. A patch would be great for compatibility and for anyone else that may encounter this convoluted issue. We had to spend a lot of time troubleshooting this issue as none of the segmentation faults that happened in our systems pointed to nokogiri or libxml-ruby.
TIA