Searching #14

Open
seang96 opened this issue Nov 30, 2024 · 2 comments
Labels
enhancement New feature or request

Comments


seang96 commented Nov 30, 2024

I see this as two parts. First, OCR will be needed for documents that require it.

For searching, I'd suggest two options. The easier one, though less featured to my knowledge, is Postgres full-text search. The other, which would likely go against the philosophy of keeping it simple, is OpenSearch.
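For illustration, a rough sketch of what the Postgres full-text search option could look like. The table `pdf_text` and its columns `id` and `body` are hypothetical names that do not come from the project; only the SQL functions `to_tsvector`, `plainto_tsquery`, `ts_rank` and the `@@` match operator are standard Postgres. The helper only builds the parameterized query, so running it against a database is left to whatever driver is in use.

```python
def build_search_query(term: str, limit: int = 10) -> tuple[str, tuple]:
    """Build a parameterized Postgres full-text search query.

    `pdf_text`, `id` and `body` are hypothetical names used for illustration.
    """
    sql = (
        "SELECT id, "
        "ts_rank(to_tsvector('english', body), plainto_tsquery('english', %s)) AS rank "
        "FROM pdf_text "
        "WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %s) "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    # The search term appears twice: once for ranking, once for matching.
    return sql, (term, term, limit)
```

The returned pair can be passed to a DB-API cursor, for example `cursor.execute(sql, params)`. For larger collections, an index on the `tsvector` expression would normally be added as well.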

mrmn2 added the enhancement label Dec 1, 2024
Owner

mrmn2 commented Dec 5, 2024

I have given this issue some thought. I don't think OCR is in the scope of this project. As it is quite resource-intensive, it clashes with the philosophy of keeping things simple and minimal. I think paperless-ngx would be a better fit for this use case.

However, I can see myself working on that feature for PDFs that don't require OCR, as pypdf, which is already used in this project, can extract text from such files.
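As a minimal sketch of that idea, assuming the PDF already has a text layer: `PdfReader`, `reader.pages` and `page.extract_text()` are pypdf's API, while the helper names `extract_page_texts` and `find_pages` are made up for illustration and are not part of PdfDing.

```python
from typing import Iterable


def extract_page_texts(pdf_path: str) -> list[str]:
    """Return the text layer of each page (empty string if a page has none)."""
    # Imported lazily so the search helper below works without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def find_pages(page_texts: Iterable[str], query: str) -> list[int]:
    """Return 1-based page numbers whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]
```

Typical usage would be `find_pages(extract_page_texts("some.pdf"), "invoice")`. Scanned PDFs without a text layer would simply yield empty strings and no matches, which is consistent with leaving OCR out of scope.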

PS: If you like PdfDing, I would be really happy about a star. Thanks!

Author

seang96 commented Dec 5, 2024

That is acceptable. OCR can be done outside the scope of the project as well, so this request could be narrowed to just searching documents that already contain text.

seang96 changed the title from "OCR / Searching" to "Searching" Dec 18, 2024