Bug: __nameid_ directory should not be parsed (it causes an invalid HTML body) #23
Comments
The fix in #22 could not be merged because the tests were failing. I cannot reproduce this, though, because my tests always fail with this error:
|
Ahh, those are classes generated by a maven plugin when you run |
Well, it seems that for me those are not being generated, because I run using |
Is there any news about these errors? I would really like to run the tests and fix the bugs ;) |
I really don't have a clue why |
|
I got something running by copying the class files from target/test-classes to target/classes... Now trying to look into the tests |
That's just bizarre that you have to copy anything; I have no explanation for it. |
Well a second run of the tests does not work either and fails with an error about something being already instrumented. My current procedure for testing:
I was able to verify though that it's an error in the tests and not in the fixed code. It always fails at getBodyHTML().isNotEmpty(), because before you had the invalid body content in there. Now it is null, which is correct, because there is no HTML body in these four mails. To verify this just look at the supposed HTML content which the current code produces for the following mails:
You will see that the content is just garbled data. That is essentially the bug, which the patch fixed. |
Thanks for the research, much appreciated. Before I can merge the change, I would like to investigate why body HTML remains empty even if there is an RTF body. I'll be able to check that this weekend. |
Well, that's a different issue though, isn't it? The behaviour with this bugfix is better than before, because you can at least see that there is no HTML body, instead of getting invalid data. Also, how would you detect whether it is a real HTML email or one that was converted from RTF, if the HTML attribute is always set? |
Indeed and I wasn't looking properly (it's been some time since I touched that part of the code): the converted HTML is there, but in |
Released in 1.7.1 |
As discussed in PR #22, see also nameid-fix branch:
In a MSG file the _nameid... directory represents named properties, which are as of right now not supported by this library. It tries to parse the entries in this directory using the normal property parser though, which creates conflicts and is wrong. You can also refer to the Microsoft documentation for this: [MS-OXMSG].
I had several emails where an id in this directory collided with the property id for the HTML content (10130102). This caused RTF emails to incorrectly report the invalid content as the HTML body, which in turn prevented the RTF conversion from being read.
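The idea behind the fix can be sketched roughly as follows. This is a minimal illustration and not the code from PR #22: the class `NameIdFilter` and its methods are hypothetical, and it assumes the parser sees the top-level entry names of the MSG compound file as plain strings. The storage name `__nameid_version1.0` is the one [MS-OXMSG] uses for the named property mapping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the library: decides which top-level
// entries of a MSG compound file should go to the normal property parser.
class NameIdFilter {

    // Assumption: the named property mapping storage starts with this prefix
    // ([MS-OXMSG] names it "__nameid_version1.0").
    static final String NAMEID_PREFIX = "__nameid_";

    // True if the entry is the named property mapping storage, whose streams
    // must not be interpreted as regular message properties.
    static boolean isNamedPropertyDirectory(String entryName) {
        return entryName != null && entryName.startsWith(NAMEID_PREFIX);
    }

    // Returns only the entries that are safe to parse as regular properties.
    static List<String> entriesToParse(List<String> topLevelEntryNames) {
        List<String> result = new ArrayList<>();
        for (String name : topLevelEntryNames) {
            if (!isNamedPropertyDirectory(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With such a filter in place, a stream inside the `__nameid_` directory whose id happens to be 10130102 never reaches the property parser, so it cannot be mistaken for the HTML body, and the RTF body is left to be read as usual.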