olevba: prevent side effects on python lib "email" #604

Merged (2 commits) on Sep 3, 2020


matthieuxyz
Contributor

Fix issue #602

(Sorry for previous pull request, I committed with the wrong email, tried to fix it and ended up messing up the whole repository. This PR should be alright.)

Owner

@decalage2 decalage2 left a comment

This looks good, thanks. Just a question: why do you use copy() instead of just storing the original email.feedparser.headerRE?

@matthieuxyz
Contributor Author

matthieuxyz commented Sep 3, 2020

The copy isn't needed. It actually raises an exception, since you can't copy a pattern object, but that exception was masking another one raised by email.message_from_bytes() in Python 3. With copy(), the exception was raised early enough that the regex was never patched (so using email later caused no problem). Without copy(), the regex was patched, the exception was raised, and the regex was never patched back (so later uses of the email lib failed).

Using a try ... finally clause seems more reasonable, so that the original regex is restored even if an exception is raised.
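As a minimal sketch of that try ... finally idea (not the exact code from this PR), the patch can be wrapped in a context manager so the original `email.feedparser.headerRE` is always put back. The names `LOOSE_HEADER_RE`, `patched_header_re` and `parse_with_loose_headers` are hypothetical, and the loose pattern is only an assumed example of a regex that makes the colon optional:

```python
import email
import email.feedparser
import re
from contextlib import contextmanager

# Assumed example of a looser header pattern: the trailing ":" is optional.
# This is an illustration, not necessarily the pattern used in olevba.
LOOSE_HEADER_RE = re.compile(r'^(From |[\041-\071\073-\176]{1,}:?|[\t ])')


@contextmanager
def patched_header_re(pattern):
    """Temporarily replace email.feedparser.headerRE, restoring it on exit."""
    original = email.feedparser.headerRE  # keep a reference, no copy needed
    email.feedparser.headerRE = pattern
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises, so the rest of the
        # program never sees the patched regex.
        email.feedparser.headerRE = original


def parse_with_loose_headers(data):
    """Parse raw bytes as a MIME message while the loose regex is active."""
    with patched_header_re(LOOSE_HEADER_RE):
        return email.message_from_bytes(data)
```

Storing a plain reference to the original pattern is enough here, because the patch rebinds the module attribute rather than mutating the pattern object.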

Now, for the reason of the exception in Python 3: it seems that the fix for #32 only works on Python 2.7, as Python 3 added an additional assert in feedparser.py at line 524. This same exception is actually the reason why I want the regex to be patched back to what it originally was.
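To illustrate why a looser regex alone is not enough on Python 3, here is a small sketch. The loose pattern and the sample line are assumptions for illustration only. The stock `email.feedparser.headerRE` rejects a header line that has no colon, the loose pattern accepts it, and a later `line.find(':')` then returns -1, which is the kind of value the `assert i>0` in `_parse_headers` refuses:

```python
import email.feedparser
import re

# Assumed loose pattern (illustrative): the ":" after the header name is optional.
loose_header_re = re.compile(r'^(From |[\041-\071\073-\176]{1,}:?|[\t ])')

# Hypothetical malformed header line: no colon and no leading whitespace.
line = "InvalidHeaderWithoutColon\n"

# The standard library pattern requires a colon, so it does not match.
strict_accepts = email.feedparser.headerRE.match(line) is not None

# The loose pattern lets the line through as if it were a header.
loose_accepts = loose_header_re.match(line) is not None

# Splitting name and value relies on the colon position, which is missing.
colon_index = line.find(":")

print(strict_accepts, loose_accepts, colon_index)
```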

Using version 0.55.1 from PyPI, without my fix, on Python 3, the sample from issue #32 gives me this error:

olevba 0.55.1 on Python 3.6.9 - http://decalage.info/python/oletools
INFO     Opening MHTML file ./test/data/sample_with_invalid_header.mht
INFO     Failed MIME parsing for file './test/data/sample_with_invalid_header.mht' - Please report this issue on https://github.com/decalage2/oletools/issues
DEBUG    Trace:
Traceback (most recent call last):
  File "/home/matthieu/.local/lib/python3.6/site-packages/oletools/olevba.py", line 2940, in open_mht
    mhtml = email.message_from_bytes(stripped_data)
  File "/usr/lib/python3.6/email/__init__.py", line 46, in message_from_bytes
    return BytesParser(*args, **kws).parsebytes(s)
  File "/usr/lib/python3.6/email/parser.py", line 124, in parsebytes
    return self.parser.parsestr(text, headersonly)
  File "/usr/lib/python3.6/email/parser.py", line 68, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/usr/lib/python3.6/email/parser.py", line 57, in parse
    feedparser.feed(data)
  File "/usr/lib/python3.6/email/feedparser.py", line 176, in feed
    self._call_parse()
  File "/usr/lib/python3.6/email/feedparser.py", line 180, in _call_parse
    self._parse()
  File "/usr/lib/python3.6/email/feedparser.py", line 385, in _parsegen
    for retval in self._parsegen():
  File "/usr/lib/python3.6/email/feedparser.py", line 240, in _parsegen
    self._parse_headers(headers)
  File "/usr/lib/python3.6/email/feedparser.py", line 524, in _parse_headers
    assert i>0, "_parse_headers fed line with no : and no leading WS"
AssertionError: _parse_headers fed line with no : and no leading WS

It's basically the same trace as the one I get in issue #602.

So at the moment this PR does exactly what is needed to avoid side effects on the email lib, but olevba still doesn't work on the sample from issue #32.

@decalage2
Owner

Indeed, I confirm that the current dev version (without this PR) manages to parse the sample from issue #32 correctly on Python 2.7, but fails on Python 3. I quickly checked other MHT samples on Python 3 and they seem to work, so it's only an issue with the sample from #32 and the monkeypatch.
I need to find another workaround. I will merge this PR, and open a new issue similar to #32 for Python 3.

Successfully merging this pull request may close these issues.

A monkey patch in olevba is causing bugs in other part of code unrelated to oletools
2 participants