JRuby-Nokogiri fails to parse XSLT stylesheet #1078

Closed
denisdefreyne opened this issue Apr 14, 2014 · 4 comments

@denisdefreyne

Nokogiri on JRuby fails to recognise <xsl:value-of select="$foo"/> and fails with an exception.

Steps to reproduce

Execute the following code on JRuby:

input = <<-EOS
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
  <xsl:template match="/">
    <h1><xsl:value-of select="$foo"/></h1>
  </xsl:template>
</xsl:stylesheet>
EOS

require 'nokogiri'
::Nokogiri::XSLT(input)

Expected output

% ruby test.rb
%

Tested on MRI 1.9.3.

Actual output

% ruby test.rb
RuntimeError: could not parse xslt stylesheet
  parse_stylesheet_doc at nokogiri/XsltStylesheet.java:166
                 parse at /Users/ddfreyne/.gem/jruby/1.9.3/gems/nokogiri-1.6.1-java/lib/nokogiri/xslt.rb:30
                  XSLT at /Users/ddfreyne/.gem/jruby/1.9.3/gems/nokogiri-1.6.1-java/lib/nokogiri/xslt.rb:13
                (root) at test2.rb:19

Tested on JRuby 1.7.11.

Notes

Removing the <xsl:value-of select="$foo"/> makes the error disappear.

@headius
Contributor

headius commented Apr 21, 2014

The actual error message here, hidden by the Java impl, is "Variable or parameter 'foo' is undefined." Is something eagerly trying to parse that value out and look it up?

@headius
Contributor

headius commented Apr 21, 2014

Full trace: https://gist.github.com/headius/180d6ffe99d0e6527483

It appears that the XSLT engine is attempting to resolve $foo. Perhaps there's a config to turn that off?

@headius
Contributor

headius commented Apr 21, 2014

Ok, so all resources I looked at indicated that this stylesheet needs an xsl:variable or xsl:param tag to declare $foo as a variable that will be provided later. Perhaps libxml is being so forgiving that it allows this to be a deferred variable when it's not immediately present?

This patch to your stylesheet makes it parse fine:

--- blah.rb 2014-04-21 08:34:25.000000000 -0500
+++ blah2.rb    2014-04-21 08:34:21.000000000 -0500
@@ -2,6 +2,7 @@
 <?xml version="1.0" encoding="utf-8"?>
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
+  <xsl:variable name="foo"/>
   <xsl:template match="/">
     <h1><xsl:value-of select="$foo"/></h1>
   </xsl:template>

So the question then is whether libxml is too permissive (which is usually the case) or Xalan is too strict. I could not find anything obvious in the W3C spec about when XSL variables must resolve, but I fed your stylesheet into a number of online XSLT transformers and they all failed to parse it.

I'm inclined to call this another case where libxml is wrong by allowing the stylesheet to compile in the first place.

denisdefreyne added a commit to nanoc/nanoc that referenced this issue Apr 21, 2014
XSLT stylesheets that use variable names must define the variables.
libxml appears to be too permissive and allow broken stylesheets, while
JRuby-Nokogiri correctly fails to parse the stylesheet.

For details, see the following analysis of this issue by @headius:
sparklemotion/nokogiri#1078 (comment)

This test still fails due to a missing <?xml> header. For details, see
sparklemotion/nokogiri#1079

@denisdefreyne
Author

Introducing the xsl:variable does the trick.

In the nanoc test case, which was a bit larger than the one provided in this issue, I invoke the XSLT with an external param (similar to xslt.transform(doc, ['key', 'value']) given in the Nokogiri::XSLT::Stylesheet documentation). Swapping the xsl:variable with xsl:param works in this case.

It’s good to know that libxml/libxslt is occasionally too permissive.

Thanks for investigating!
