Is your feature request related to a problem? Please describe.
Recently, more of our clients have wanted to use .msg files in their RAG pipelines. The .msg format is a Microsoft email format and is not trivial to convert without the help of an external library.
Describe the solution you'd like
It would be great if we could add an MSGToDocument converter to Haystack.
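To make the request more concrete, here is a rough sketch of the shape such a converter could take. Everything in it is an assumption for illustration: `ParsedEmail`, `parse_fn`, and the local `Document` stand-in are hypothetical names, and the actual .msg parsing is left to whatever callable is passed in (in practice this could wrap one of the libraries listed under Additional context). The `run(sources)` method returning a `documents` list is modeled on how other Haystack converters are shaped, not on an existing implementation.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, List, Union


@dataclass
class ParsedEmail:
    """Hypothetical container for the fields a .msg parser would return."""
    subject: str = ""
    sender: str = ""
    body: str = ""


@dataclass
class Document:
    """Local stand-in for a Haystack Document (text content plus metadata)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


class MSGToDocument:
    """Hypothetical converter; the .msg parsing itself is delegated to parse_fn."""

    def __init__(self, parse_fn: Callable[[Path], ParsedEmail]):
        # parse_fn is an assumption: any callable that turns a path into a ParsedEmail.
        self.parse_fn = parse_fn

    def run(self, sources: List[Union[str, Path]]) -> Dict[str, List[Document]]:
        documents = []
        for source in sources:
            path = Path(source)
            email = self.parse_fn(path)
            # The email body becomes the document content; headers go into meta.
            documents.append(
                Document(
                    content=email.body,
                    meta={
                        "file_path": str(path),
                        "subject": email.subject,
                        "sender": email.sender,
                    },
                )
            )
        return {"documents": documents}
```

Keeping the parser behind a callable means the converter logic can be tested without a real .msg file, and the choice of parsing library (including its license) stays a separate decision.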
Additional context
Some libraries I researched that could help with this are:
python-oxmsg (comes from the same dev we use for our PPTXToDocument converter)
python-oxmsg as used by Unstructured: https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/partition/msg.py
msg-extractor (actively maintained but has a GPL-3.0 license)