Crash on XLS format 59 #179

Closed
gleporeNARA opened this issue Dec 27, 2019 · 3 comments · Fixed by #195
Labels: bug (A product defect that needs fixing), P2 (Medium priority issues to be scheduled in a future release)

Comments

gleporeNARA commented Dec 27, 2019

I get the following crash while examining a sample of what I believe to be format 59, Microsoft XLS (as reported by Siegfried software).

fmt_59_Microsoft_Excel_5.0-95_Workbook_(xls)_EXCEL_95.zip

fido *
FIDO v1.4.0 (formats-v95.xml, container-signature-20180920.xml, format_extensions.xml)
Traceback (most recent call last):
  File "/usr/local/bin/fido", line 11, in <module>
    load_entry_point('opf-fido==1.4.0', 'console_scripts', 'fido')()
  File "/usr/local/lib/python3.6/dist-packages/opf_fido-1.4.0-py3.6.egg/fido/fido.py", line 861, in main
    fido.identify_file(file, extension=not args.noextension)
  File "/usr/local/lib/python3.6/dist-packages/opf_fido-1.4.0-py3.6.egg/fido/fido.py", line 362, in identify_file
    container_matches = self.match_container("OLE2", OlePackage, filename, container_file)
  File "/usr/local/lib/python3.6/dist-packages/opf_fido-1.4.0-py3.6.egg/fido/fido.py", line 211, in match_container
    puids = klass(file, self.extract_signatures(signature_file, signature_type=signature_type)).detect_formats()
  File "/usr/local/lib/python3.6/dist-packages/opf_fido-1.4.0-py3.6.egg/fido/package.py", line 40, in detect_formats
    with olefile.OleFileIO(self.ole) as ole:
AttributeError: __enter__
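
The `AttributeError: __enter__` on the last line is what Python 3.6 raises when the object handed to a `with` statement does not implement the context-manager protocol. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that the installed `olefile` release predates context-manager support on `OleFileIO`. A minimal sketch of a version-independent workaround wraps the object in `contextlib.closing`. `LegacyReader` below is a hypothetical stand-in for such an object, not the real `olefile` class, and this is not necessarily the change made in #195.

```python
from contextlib import closing


class LegacyReader:
    """Hypothetical stand-in: has close() but no __enter__/__exit__."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def listdir(self):
        # Pretend the container holds a single stream.
        return [["Workbook"]]

    def close(self):
        self.closed = True


def list_streams(reader):
    # closing() supplies the context-manager protocol and
    # calls reader.close() when the block exits, even on error.
    with closing(reader) as ole:
        return ole.listdir()


reader = LegacyReader("sample.xls")
streams = list_streams(reader)
```

After `list_streams` returns, `reader.closed` is `True`, so the resource is released just as it would be with a native context manager.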
ross-spencer commented

NB: Tika 1.23 also crashes on this file, which @tballison might be interested in.

Apache Tika was unable to parse the document
at /home/ross-spencer/Desktop/temp/fmt_59_Microsoft_Excel_5.0-95_Workbook_.xls._EXCEL_95/fmt_59_Microsoft_Excel_5.0-95_Workbook_(xls)_EXCEL_95.xls.

The full exception stack trace is included below:

org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@13a700b2
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
	at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:84)
	at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:358)
	at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:309)
	at org.apache.tika.gui.ParsingTransferHandler.importFiles(ParsingTransferHandler.java:94)
	at org.apache.tika.gui.ParsingTransferHandler.importData(ParsingTransferHandler.java:77)
	at javax.swing.TransferHandler.importData(TransferHandler.java:827)
	at javax.swing.TransferHandler$DropHandler.drop(TransferHandler.java:1544)
	at java.awt.dnd.DropTarget.drop(DropTarget.java:455)
	at javax.swing.TransferHandler$SwingDropTarget.drop(TransferHandler.java:1282)
	at sun.awt.dnd.SunDropTargetContextPeer.processDropMessage(SunDropTargetContextPeer.java:538)
	at sun.awt.X11.XDropTargetContextPeer.processDropMessage(XDropTargetContextPeer.java:184)
	at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchDropEvent(SunDropTargetContextPeer.java:852)
	at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchEvent(SunDropTargetContextPeer.java:776)
	at sun.awt.dnd.SunDropTargetEvent.dispatch(SunDropTargetEvent.java:48)
	at java.awt.Component.dispatchEventImpl(Component.java:4744)
	at java.awt.Container.dispatchEventImpl(Container.java:2297)
	at java.awt.Component.dispatchEvent(Component.java:4711)
	at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4904)
	at java.awt.LightweightDispatcher.processDropTargetEvent(Container.java:4609)
	at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4471)
	at java.awt.Container.dispatchEventImpl(Container.java:2283)
	at java.awt.Window.dispatchEventImpl(Window.java:2746)
	at java.awt.Component.dispatchEvent(Component.java:4711)
	at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:760)
	at java.awt.EventQueue.access$500(EventQueue.java:97)
	at java.awt.EventQueue$3.run(EventQueue.java:709)
	at java.awt.EventQueue$3.run(EventQueue.java:703)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:84)
	at java.awt.EventQueue$4.run(EventQueue.java:733)
	at java.awt.EventQueue$4.run(EventQueue.java:731)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
	at java.awt.EventQueue.dispatchEvent(EventQueue.java:730)
	at org.GNOME.Accessibility.AtkWrapper$6.dispatchEvent(AtkWrapper.java:715)
	at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
	at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
	at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
	at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
Caused by: java.lang.NullPointerException
	at com.sun.org.apache.xml.internal.serializer.ToHTMLStream.endElement(ToHTMLStream.java:911)
	at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerHandlerImpl.endElement(TransformerHandlerImpl.java:284)
	at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
	at org.apache.tika.gui.TikaGUI$2.endElement(TikaGUI.java:581)
	at org.apache.tika.sax.TeeContentHandler.endElement(TeeContentHandler.java:94)
	at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
	at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
	at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
	at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
	at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
	at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:274)
	at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:229)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:147)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	... 45 more

@carlwilson carlwilson added bug A product defect that needs fixing P2 Medium priority issues to be scheduled in a future release labels May 5, 2020
@carlwilson carlwilson added this to the v1.6 milestone May 5, 2020
carlwilson (Member) commented

Fixed by #195 (v1.6).

tballison commented

For kicks, I confirmed that Tika now handles this file, and has done so at least as far back as 1.26. :D Cheers.
