Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

import truffleruby patches #1882

Closed
2 tasks
flavorjones opened this issue Mar 6, 2019 · 15 comments
Closed
2 tasks

import truffleruby patches #1882

flavorjones opened this issue Mar 6, 2019 · 15 comments

Comments

@flavorjones
Copy link
Member

flavorjones commented Mar 6, 2019

The TruffleRuby team is maintaining some patches to Nokogiri that look like they should be pulled in. A chat with one of the maintainers indicates there are changes that are either working around bugs in Nokogiri or else are pretty trivial:

  • some patches related to the number of C function args
  • and also some switch statements

The patches are maintained here: https://github.com/oracle/truffleruby/blob/master/lib/cext/patches/nokogiri_patches.rb

Chat with @chrisseaton here: https://gitter.im/graalvm/truffleruby?at=5c7fac14f895944c084cb856

This story is to:

  • audit the patches and determine which ones we want to upstream
  • make the changes to Nokogiri
@chrisseaton
Copy link

@ aardvark179 can go through these patches and think about what it makes sense to upstream. Thanks very much for the offer.

@aardvark179
Copy link

Sorry for the slow reply. It would definitely be useful for nokogiri to take some of the patches. The follow seem like excellent candidates:

  1. The changes to xml_node_set.c
  2. The changes to xml_document.c

The changes to xml_xpath_context.c were introduced because we wanted the C stack to be unwound correctly if an error was encountered, which was hard for us to guarantee otherwise. It certainly shouldn't hurt to adopt these changes.

The remaining patches are to work round an issue with calling a managed function from native code with varargs. It would be great if nokogiri could put these changes in, probably behind an #ifdef TRUFFLERUBY.

The other really useful thing would be to default to using system libraries with TruffleRuby, we don't yet support linking static libraries and have had issues in the past with picking up a the .sos from the gem directory.

@flavorjones
Copy link
Member Author

Looks like patches have moved to https://github.com/oracle/truffleruby/blob/master/lib/truffle/truffle/patches/nokogiri_patches.rb

I'm hoping to find some time to work on this soon, especially now that we've got TruffleRuby CI coverage.

@flavorjones flavorjones added this to the v1.11.0 milestone Oct 1, 2020
@eregon
Copy link
Contributor

eregon commented Oct 1, 2020

Indeed, I think 3 of these patches are causing the 5 remaining test failures (from #2089 (comment)), @aardvark179 could you check if they are still needed and what changes could be done in Nokogiri to not need them?

@flavorjones
Copy link
Member Author

For posterity, in case that build log gets deleted, the errors are:

1) Failure:
Nokogiri::XML::TestSaxEntityReference#test_more_sax_entity_reference [/tmp/build/4d9a0f57/nokogiri/test/xml/test_entity_reference.rb:209]:
Expected: ["Entity 'bar' not defined"]
Actual: ["Warning."]

2) Failure:
Nokogiri::XML::TestSaxEntityReference#test_more_sax_entity_reference_with_absolute_path [/tmp/build/4d9a0f57/nokogiri/test/xml/test_entity_reference.rb:209]:
Expected: ["Entity 'bar' not defined"]
Actual: ["Warning."]

3) Failure:
Nokogiri::XML::TestSaxEntityReference#test_sax_entity_reference [/tmp/build/4d9a0f57/nokogiri/test/xml/test_entity_reference.rb:196]:
Expected: ["Entity 'bar' not defined"]
Actual: ["Warning."]

4) Failure:
Nokogiri::XML::TestSaxEntityReference#test_sax_entity_reference_with_absolute_path [/tmp/build/4d9a0f57/nokogiri/test/xml/test_entity_reference.rb:196]:
Expected: ["Entity 'bar' not defined"]
Actual: ["Warning."]

5) Failure:
TestXsltTransforms#test_0001_should not crash when given XPath 2.0 features [/tmp/build/4d9a0f57/nokogiri/test/test_xslt_transforms.rb:359]:
Expected /decimal/ to match # encoding: ASCII-8BIT
# valid: true
"Generic errorGeneric errorGeneric error".

@eregon
Copy link
Contributor

eregon commented Oct 13, 2020

@aardvark179 and I looked at those failures, and it seems the main issue is that Sulong does not support creating a native function pointer for variadic functions like

static void warning_func(void * ctx, const char *msg, ...)
{
VALUE self = NOKOGIRI_SAX_SELF(ctx);
VALUE doc = rb_iv_get(self, "@document");
char * message;
VALUE ruby_message;
va_list args;
va_start(args, msg);
vasprintf(&message, msg, args);
va_end(args);
ruby_message = NOKOGIRI_STR_NEW2(message);
vasprintf_free(message);
rb_funcall(doc, id_warning, 1, ruby_message);
}

which is stored in native memory and passed to libxml2 in

handler->warning = warning_func;

That seems hard to implement, because Sulong uses libffi (via Truffle NFI) for native calls (when calling to libxml2), and libffi does not support varargs stubs at all.

@aardvark179
Copy link

As @eregon says, the current variadic error functions look like they will be very hard to support. Luckily libxml2 has another way to handle errors that does not require variadic callbacks. I'll try and test how well that approach works and update this issue soon.

@flavorjones
Copy link
Member Author

@aardvark179 Can you clarify? I'm cleaning up the error-handling code right now (see #1610, #2096, and #2097) and would be happy to integrate any suggestions you might have.

@aardvark179
Copy link

So it looks like you should be able to remove uses of the generic error handler and simply rely on the structured error handler. I had to make one small change to test_document.rb at line 540 in relation to this as you now get an Nokogiri::XML::XPath::SyntaxError instead of a Runtime error. Unfortunately libxslt doesn't appear to have an equivalent structured error function, so we can't get rid of the problem entirely.

@flavorjones
Copy link
Member Author

Ah, I see what you're saying. Sure, we can probably drop the generic error handler usage; and I would like to omit the use of varargs in TruffleRuby within Nokogiri's code (rather than via a patch). Let's plan on me doing that as part of the work at #2096 and I'll ping back here when that's ready to go and ask y'all to review it.

@eregon
Copy link
Contributor

eregon commented Feb 26, 2021

@aardvark179 checked and with #2193 + --use-system-libraries=false, and removing the patches for Nokogiri (PR to do that automatically upcoming in TruffleRuby), all Nokogiri tests pass!
That means we should be able to have TruffleRuby passing in Nokogiri's CI.

Performance also seems fine when running libxml2/libxslt on Sulong, so I think we might just default to that at some point (use the vendored libxml2 like for CRuby, but no support for precompiled gems yet).

@flavorjones
Copy link
Member Author

🤔 Wow. I'll make time this weekend to merge, and update CI and the installation docs.

@eregon
Copy link
Contributor

eregon commented Mar 2, 2021

oracle/truffleruby@9a5b420 is the PR I was mentioning above in TruffleRuby. That's now merged, so when using --use-system-libraries=false the patches have no effect.

@eregon
Copy link
Contributor

eregon commented Mar 12, 2021

I just tried nokogiri 1.11.2 and it works well.
We'll default to the vendored libxml2/libxslt in TruffleRuby 21.1 unless we discover a major issue, that notably avoids some headaches of installing extra system packages which has been quite annoying.

@flavorjones
Copy link
Member Author

I'm going to close this, then! Thanks for the amazing collaboration!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants