Add support for converting .msg files to Documents #8777

Open
sjrl opened this issue Jan 28, 2025 · 0 comments
Labels
P2 Medium priority, add to the next sprint if no P1 available type:feature New feature or request

Comments

Contributor

sjrl commented Jan 28, 2025

Is your feature request related to a problem? Please describe.
Recently, more of our clients have wanted to use .msg files in their RAG pipelines. The .msg format is a Microsoft Outlook email format and is not trivial to convert without the help of an external library.

Describe the solution you'd like
It would be great if we could add an MSGToDocument converter to Haystack.
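To make the request more concrete, here is a rough sketch of the shape such a converter might take. It is not an existing Haystack component. The `Document` dataclass is a stand-in for Haystack's own class, the `ParsedMsg` protocol and the injected `loader` callable are hypothetical, and the `run(sources)` method returning `{"documents": [...]}` is assumed to mirror the other converters. The actual .msg parsing is left to whichever external library ends up being chosen.

```python
import logging
from dataclasses import dataclass, field
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Callable, Dict, Iterable, List, Protocol, Union

logger = logging.getLogger(__name__)


class ParsedMsg(Protocol):
    """Hypothetical minimal view of a parsed .msg file (not a real library type)."""

    subject: str
    sender: str
    body: str


@dataclass
class Document:
    """Stand-in for Haystack's Document, reduced to the two fields used here."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


class MSGToDocument:
    """Sketch of a converter; `loader` wraps whichever .msg parsing library is chosen."""

    def __init__(self, loader: Callable[[Path], ParsedMsg]):
        self.loader = loader

    def run(self, sources: Iterable[Union[str, Path]]) -> Dict[str, List[Document]]:
        documents: List[Document] = []
        for source in sources:
            path = Path(source)
            try:
                msg = self.loader(path)
            except Exception as exc:
                # Skip files that cannot be parsed instead of failing the whole batch.
                logger.warning("Could not read %s, skipping it: %s", path, exc)
                continue
            meta = {"file_path": str(path), "subject": msg.subject, "sender": msg.sender}
            documents.append(Document(content=msg.body or "", meta=meta))
        return {"documents": documents}


def fake_loader(path: Path) -> ParsedMsg:
    """Fake parser used only to exercise the sketch; a real one would read the file."""
    if path.name == "broken.msg":
        raise ValueError("not a valid .msg file")
    return SimpleNamespace(subject="Hello", sender="alice@example.com", body="Hi there")


result = MSGToDocument(loader=fake_loader).run(["hello.msg", "broken.msg"])
```

Injecting the loader keeps the converter independent of the parsing library, so the library choice (and its license) can be settled separately from the component's interface.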

Additional context
Some libraries I researched that could help with this are:

python-oxmsg (by the same developer whose library we use for our PPTXToDocument converter)

msg-extractor (actively maintained but has a GPL-3.0 license)

@sjrl sjrl added the type:feature New feature or request label Jan 29, 2025
@julian-risch julian-risch added the P2 Medium priority, add to the next sprint if no P1 available label Jan 31, 2025