Bug: __nameid_ directory should not be parsed (and causing invalid HTML body) #23

derrohrbach · 2020-01-22T12:12:10Z

As discussed in PR #22, see also nameid-fix branch:

In an MSG file, the _nameid... directory represents named properties, which this library does not currently support. The library nevertheless tries to parse the entries in this directory with the normal property parser, which is wrong and creates conflicts. See the Microsoft documentation for details: [MS-OXMSG].

I had several emails where an ID in this directory collided with the property ID for HTML content (10130102). As a result, RTF emails incorrectly reported this invalid content as the HTML body, which prevented the RTF conversion from being read.
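The collision described above can be illustrated with a minimal sketch. This is not the library's actual parser: it assumes that property streams are named `__substg1.0_` followed by the hex property ID and type, and that the named-property directory name starts with `__nameid`. The class `NameIdFilter`, its methods, and the slash-separated path strings are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the library.
class NameIdFilter {
    // Assumed prefix of the named-property directory (the issue calls it "__nameid_").
    static final String NAMEID_PREFIX = "__nameid";
    // Assumed prefix of property streams: "__substg1.0_" + hex ID + hex type.
    static final String STREAM_PREFIX = "__substg1.0_";
    // Property ID and type for the HTML content, as quoted in this issue.
    static final String HTML_BODY_TAG = "10130102";

    // True if any parent segment of the path is a named-property directory.
    static boolean isInNameIdDirectory(String path) {
        String[] segments = path.split("/");
        for (int i = 0; i < segments.length - 1; i++) {
            if (segments[i].startsWith(NAMEID_PREFIX)) {
                return true;
            }
        }
        return false;
    }

    // Returns the hex tag of a property stream, or null for any other entry.
    static String propertyTag(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.startsWith(STREAM_PREFIX)
                ? name.substring(STREAM_PREFIX.length())
                : null;
    }

    // Collects every stream that would be read as the HTML body.
    static List<String> htmlBodyCandidates(List<String> paths, boolean skipNameId) {
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            if (skipNameId && isInNameIdDirectory(path)) {
                continue; // named properties are not regular message properties
            }
            if (HTML_BODY_TAG.equals(propertyTag(path))) {
                result.add(path);
            }
        }
        return result;
    }
}
```

For a message that contains only `__nameid_version1.0/__substg1.0_10130102` and a top-level `__substg1.0_10090102` stream, calling `htmlBodyCandidates` with `skipNameId` set to `false` returns the named-property stream as a false HTML body, while `true` returns an empty list.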

derrohrbach · 2020-01-22T12:14:46Z

The fix in #22 could not be merged because tests were failing. I cannot reproduce this, though, because my tests always fail with this error:

[ERROR] Tests run: 16, Failures: 0, Errors: 16, Skipped: 0, Time elapsed: 0.07 s <<< FAILURE! - in org.simplejavamail.outlookmessageparser.HighoverEmailsTest
[ERROR] testUnicodeMessage(org.simplejavamail.outlookmessageparser.HighoverEmailsTest)  Time elapsed: 0.009 s  <<< ERROR!
java.lang.Error:
Unresolved compilation problems:
        OutlookMessageAssert cannot be resolved
        OutlookMessageAssert cannot be resolved
        OutlookMessageAssert cannot be resolved
        OutlookMessageAssert cannot be resolved
        The method normalizeText(String) is undefined for the type HighoverEmailsTest

        at org.simplejavamail.outlookmessageparser.HighoverEmailsTest.testUnicodeMessage(HighoverEmailsTest.java:520)

bbottema · 2020-01-22T12:15:49Z

Ahh, those are classes generated by a Maven plugin when you run mvn test.

derrohrbach · 2020-01-22T12:20:35Z

Well, it seems they are not being generated for me, even though I am running mvn test.

derrohrbach · 2020-01-24T08:24:43Z

Is there any news about these errors? I would really like to run the tests and fix the bugs ;)

bbottema · 2020-01-24T08:37:49Z

I really don't have a clue why mvn test wouldn't work for you, except maybe that your JDK is too new. I'm developing this library with JDK 1.8 (and Maven 3.6.0).

derrohrbach · 2020-01-24T08:46:13Z

Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: C:\Maven\bin\..
Java version: 1.8.0_222, vendor: AdoptOpenJDK, runtime: C:\Program Files\AdoptOpenJDK\jdk-8.0.222.10-hotspot\jre
Default locale: de_DE, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"

derrohrbach · 2020-01-24T08:53:38Z

I got something running by copying the class files from target/test-classes to target/classes... Now trying to look into the tests

bbottema · 2020-01-24T09:13:12Z

It's just bizarre that you have to copy anything; I have no explanation for it.

derrohrbach · 2020-01-24T09:22:55Z

Well a second run of the tests does not work either and fails with an error about something being already instrumented. My current procedure for testing:

mvn clean test
copy files
mvn test

I was able to verify, though, that the error is in the tests and not in the fixed code. They always fail at getBodyHTML().isNotEmpty(), because previously the invalid body content ended up in there. Now it is null, which is correct, because there is no HTML body in these four mails.

To verify this just look at the supposed HTML content which the current code produces for the following mails:

plain chain.msg
chinese message.msg
forward with attachments and embedded images.msg
nested simple mail.msg

You will see that the content is just garbled data. That is essentially the bug, which the patch fixed.
I will open a new pull request with the tests fixed.

bbottema · 2020-01-24T09:57:44Z

Thanks for the research, much appreciated. Before I can merge the change, I would like to investigate why body HTML remains empty even if there is an RTF body. I'll be able to check that this weekend.

derrohrbach · 2020-01-24T10:00:47Z

Well, that's a different issue though, isn't it? The behaviour with this bugfix is better than before, because you can at least see that there is no HTML body, instead of getting invalid data. Also, how would you detect whether it is a real HTML email or one that was converted from RTF, if the HTML attribute is always set?

bbottema · 2020-01-24T11:37:43Z

Indeed, and I wasn't looking properly (it's been some time since I touched that part of the code): the converted HTML is there, but in msg.getConvertedBodyHTML() instead of msg.getBodyHTML(). So that's all hunky-dory. I'll prepare a release soon.
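How a caller might choose between the two getters can be sketched as follows. `MessageBodies` is a hypothetical stand-in interface that only mirrors the two getter names mentioned above; it is not the library's own message class, and the fallback order shown here is an assumption rather than documented behaviour.

```java
// Hypothetical stand-in exposing only the two getters named in this thread.
interface MessageBodies {
    String getBodyHTML();          // native HTML body; null when the mail has none
    String getConvertedBodyHTML(); // HTML converted from the RTF body; may be null
}

// Hypothetical helper for illustration only.
class HtmlBodyResolver {
    // Prefer the native HTML body and fall back to the RTF conversion.
    static String resolveHtml(MessageBodies msg) {
        String html = msg.getBodyHTML();
        return html != null ? html : msg.getConvertedBodyHTML();
    }

    // A mail counts as "real" HTML only if the native HTML body is present.
    static boolean isNativeHtml(MessageBodies msg) {
        return msg.getBodyHTML() != null;
    }
}
```

Keeping the two values separate is what lets a caller tell a real HTML email from one converted from RTF, which is the question raised earlier in the thread.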

bbottema · 2020-01-24T12:09:06Z

Released in 1.7.1

derrohrbach changed the title ~~Bugfix: __nameid_ directory should not be parsed~~ Bug: __nameid_ directory should not be parsed Jan 22, 2020

derrohrbach mentioned this issue Jan 22, 2020

Bugfix: __nameid_ directory should not be parsed #22

Merged

bbottema added help wanted need user input labels Jan 24, 2020

derrohrbach mentioned this issue Jan 24, 2020

Bugfix: __nameid_ directory should not be parsed #24

Merged

Faelean mentioned this issue Jan 24, 2020

OutlookMessage.getClientSubmitTime produces NullPointerException bbottema/simple-java-mail#243

Closed

bbottema closed this as completed Jan 24, 2020

bbottema mentioned this issue Jan 24, 2020

NPE on ClientSubmitTime when original message has not been sent yet #25

Closed

bbottema added this to the 1.7.1 milestone Jan 24, 2020

bbottema changed the title ~~Bug: __nameid_ directory should not be parsed~~ Bug: __nameid_ directory should not be parsed (also causing invalid HTML body) Jan 24, 2020

bbottema changed the title ~~Bug: __nameid_ directory should not be parsed (also causing invalid HTML body)~~ Bug: __nameid_ directory should not be parsed (and causing invalid HTML body) Jan 24, 2020
