Get your documents ready for gen AI
Updated Dec 19, 2024 - Python
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Parse files for optimal RAG
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Eden AI: simplifies the use and deployment of AI technologies by providing a single API that connects to the best possible AI engines.
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, Semantic Scholar, or PDF with GROBID and LangChain, and listen to them as podcasts. Customize your own pipelines.
A Unified Toolkit for Deep Learning-Based Table Extraction
Applicant Tracking System (ATS): A powerful platform leveraging generative AI and soft-match algorithms to analyze resumes against job descriptions. Built with React and Node.js, it streamlines hiring insights. Future plans include expanding to investor pitches and other structured documents.
Tool for converting First National Bank (FNB) bank statement PDFs into useful structured data
PhraseSpeaker: Effortlessly dictate specific sections of text files with macOS's text-to-speech. Perfect for navigating and audibly extracting key content from large documents!
Parse documents into a single data type (TypeScript port of Docling)
Repository for testing and demonstrating the capabilities of Docling for document conversion.
AI-powered Financial Report Analysis Engine
An interactive Streamlit app that translates English text and documents to French, featuring Google Translate API integration and text-to-speech functionality. Includes PDF and Word document translation.
Data structure and class to ease parsing of complex documents.