Implement error handling in Nokogiri parser
If a parse error occurs anywhere in the XML, the Nokogiri parser silently returns whatever happens to be on top of the stack. This makes the error very hard to troubleshoot further down the line.

To avoid this, we implement the `error` callback on the SAX document and raise once parsing has finished. This introduces a new exception type, `Nori::ParseError`, which the REXML parser now raises as well so that both parsers behave symmetrically.
stenlarsson committed Apr 5, 2024
1 parent dbbd948 commit 29c145b
Showing 4 changed files with 15 additions and 1 deletion.
1 change: 1 addition & 0 deletions lib/nori.rb
@@ -3,6 +3,7 @@
 require "nori/xml_utility_node"

 class Nori
+  class ParseError < StandardError; end

   def self.hash_key(name, options = {})
     name = name.tr("-", "_") if options[:convert_dashes_to_underscores]
5 changes: 5 additions & 0 deletions lib/nori/parser/nokogiri.rb
@@ -10,6 +10,7 @@ module Nokogiri

       class Document < ::Nokogiri::XML::SAX::Document
         attr_accessor :options
+        attr_accessor :last_error

         def stack
           @stack ||= []
@@ -44,13 +45,17 @@ def characters(string)

         alias cdata_block characters

+        def error(message)
+          @last_error = message
+        end
       end

       def self.parse(xml, options)
         document = Document.new
         document.options = options
         parser = ::Nokogiri::XML::SAX::Parser.new document
         parser.parse xml
+        raise ParseError, document.last_error if document.last_error
         document.stack.length > 0 ? document.stack.pop.to_hash : {}
       end
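
The record-then-raise approach in this hunk can be illustrated without Nokogiri. The following is only a minimal sketch: `ToyHandler`, `toy_parse`, `parse_strict`, and `ToyParseError` are hypothetical names invented for this example and are not part of Nori or Nokogiri. The toy event source reports malformed input through the handler's `error` callback instead of raising, and the wrapper checks `last_error` after parsing, much like `self.parse` does above.

```ruby
# Hypothetical stand-in for Nori::ParseError.
class ToyParseError < StandardError; end

# Hypothetical SAX-style handler: callbacks only record state.
class ToyHandler
  attr_reader :closed, :last_error

  def initialize
    @closed = []
    @last_error = nil
  end

  # Called for every correctly closed element.
  def end_element(name)
    @closed << name
  end

  # Called by the event source on malformed input; only stores the message.
  def error(message)
    @last_error = message
  end
end

# Hypothetical event source. It never raises; it reports problems
# through handler.error and stops, leaving partial state behind.
def toy_parse(events, handler)
  open_tags = []
  events.each do |type, name|
    if type == :start
      open_tags.push(name)
    else
      expected = open_tags.pop
      if expected != name
        handler.error("expected </#{expected}>, got </#{name}>")
        return
      end
      handler.end_element(name)
    end
  end
  handler.error("unclosed <#{open_tags.last}>") unless open_tags.empty?
end

# Wrapper that turns a recorded error into an exception after parsing.
def parse_strict(events)
  handler = ToyHandler.new
  toy_parse(events, handler)
  raise ToyParseError, handler.last_error if handler.last_error
  handler.closed
end
```

Without the `raise` line in `parse_strict`, a caller would receive whatever partial state the handler had accumulated, which mirrors the behaviour the commit message describes.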

6 changes: 5 additions & 1 deletion lib/nori/parser/rexml.rb
@@ -15,7 +15,11 @@ def self.parse(xml, options)
         parser = ::REXML::Parsers::BaseParser.new(xml)

         while true
-          raw_data = parser.pull
+          begin
+            raw_data = parser.pull
+          rescue ::REXML::ParseException => error
+            raise Nori::ParseError, error.message
+          end
           event = unnormalize(raw_data)
           case event[0]
           when :end_document
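
The wrapping done in this hunk can be tried in isolation with REXML's pull parser. This is a rough sketch rather than Nori's code: `WrappedParseError` and `pull_events` are hypothetical names standing in for `Nori::ParseError` and the parser loop, and the example assumes the `rexml` gem is available (it ships with standard Ruby installations).

```ruby
require "rexml/parsers/baseparser"
require "rexml/parseexception"

# Hypothetical stand-in for Nori::ParseError.
class WrappedParseError < StandardError; end

# Pulls every event from the document, translating REXML's own
# exception into the library-neutral error class defined above.
def pull_events(xml)
  parser = REXML::Parsers::BaseParser.new(xml)
  events = []

  loop do
    begin
      event = parser.pull
    rescue REXML::ParseException => error
      # Re-raise under one exception type so callers do not need to
      # know which XML backend is in use.
      raise WrappedParseError, error.message
    end

    break if event[0] == :end_document
    events << event
  end

  events
end

begin
  pull_events("<foo><bar>foo bar</foo>")
rescue WrappedParseError => e
  warn "parse failed: #{e.message}"
end
```

Callers then only need to rescue a single exception class, whichever backend produced the failure.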
4 changes: 4 additions & 0 deletions spec/nori/nori_spec.rb
@@ -640,6 +640,10 @@
         expect(parse(' ')).to eq({})
       end

+      it "raises error on missing end tag" do
+        expect { parse('<foo><bar>foo bar</foo>') }.to raise_error(Nori::ParseError)
+      end
+
     end
   end
