Support a document loader using the MarkItDown library #28958
Simon-Stone
started this conversation in
Ideas
Replies: 1 comment
-
I've started working on PR #28960 but I'm running into the challenge that Is there any precedent for how to deal with this? Or am I just SOL until the powers that be decide to drop support for 3.9?
-
Feature request
Implement a document loader class that uses the MarkItDown library to convert files into Markdown. From the MarkItDown README:
Motivation
The formatting and layout of a document can convey important context to a human reader. Text-based markup languages like Markdown help preserve this information in the text-only representation passed to Large Language Models. The MarkItDown library is a versatile tool for converting different kinds of files into a Markdown representation.
Proposal
I propose a simple `MarkitdownLoader` class derived from `BaseLoader`, using `markitdown.MarkItDown` under the hood (composition pattern). I will try my hand at this and offer a PR soon.
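To make the proposal concrete, here is a rough sketch of what such a loader could look like. It is not the code from the PR. The constructor signature, the `converter` parameter, and the `source` metadata key are my own assumptions for illustration. The `try`/`except` stand-ins are only there so the snippet runs without LangChain installed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterator, Optional

try:
    from langchain_core.document_loaders import BaseLoader
    from langchain_core.documents import Document
except ImportError:
    # Minimal stand-ins (assumption: only for running this sketch in isolation).
    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)

    class BaseLoader:
        def lazy_load(self) -> Iterator[Document]:
            raise NotImplementedError

        def load(self) -> list[Document]:
            return list(self.lazy_load())


class MarkitdownLoader(BaseLoader):
    """Load a single file as one Markdown Document (hypothetical sketch)."""

    def __init__(self, file_path: str, converter: Optional[Any] = None) -> None:
        self.file_path = str(file_path)
        if converter is None:
            # Imported lazily so markitdown stays an optional dependency.
            from markitdown import MarkItDown

            converter = MarkItDown()
        # Composition: the loader holds a converter rather than subclassing it.
        self._converter = converter

    def lazy_load(self) -> Iterator[Document]:
        # convert() returns a result object whose text_content is the Markdown.
        result = self._converter.convert(self.file_path)
        yield Document(
            page_content=result.text_content,
            metadata={"source": self.file_path},
        )
```

With `markitdown` installed, usage would be along the lines of `docs = MarkitdownLoader("report.docx").load()`, where the file name is just a placeholder. Accepting the converter as an argument keeps the composition explicit and makes the loader easy to test with a fake converter.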