BUG: When running OCR inference on a multi-page PDF with the page_num parameter set, only the first page is recognized #10259
Comments
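Found the problem; a PR is on the way, just a moment. PR link: #10290
Changes and test location:
This is probably caused by the wrong PyMuPDF version; try switching to version 1.18.14.
page_num is fixed when a PaddleOCR instance is initialized: on every later call to ocr.ocr, page_num stays at the value determined by the first PDF that was passed in. Could you re-initialize a PaddleOCR instance for each call?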
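To make the behaviour described in that comment concrete, here is a minimal toy model. It does not use PaddleOCR at all: `ToyOCR` and its `ocr` method are hypothetical stand-ins, and the assumption (not confirmed by this thread) is that the page limit is stored on the instance and overwritten the first time a document is processed.

```python
class ToyOCR:
    """Hypothetical stand-in for an OCR engine; not PaddleOCR's real code."""

    def __init__(self, page_num=0):
        # Assumption: 0 means "process every page of the document".
        self.page_num = page_num

    def ocr(self, pages):
        # Suspected bug pattern: the per-document limit is written back
        # onto the instance, so it leaks into every later call.
        if self.page_num == 0 or self.page_num > len(pages):
            self.page_num = len(pages)
        return pages[:self.page_num]


engine = ToyOCR(page_num=0)
first = engine.ocr(["a1"])               # single-page document
second = engine.ocr(["b1", "b2", "b3"])  # multi-page document
print(len(first), len(second))           # prints: 1 1
```

In this model the single-page document pins `page_num` to 1, so the three-page document that follows is cut down to its first page, which matches the symptom reported in this issue.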
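Re-initializing an instance on every call is very time-consuming. Creating the instance takes longer than the recognition itself, so how is that usable?
If page_num on every ocr.ocr call is determined by the first PDF passed in, then what is the point of the page_num parameter at initialization? I suggest changing this behavior.
I suggest trying the PR. I just checked and it resolves the problem; it has now been merged.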
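For readers who cannot upgrade right away, this is a rough sketch of the kind of change that avoids the problem. It is not the code from the PR: `ToyOCRFixed` is a hypothetical stand-in, and it assumes that a `page_num` of 0 means "all pages". The idea is to compute the effective limit in a local variable on each call and never write it back to the instance.

```python
class ToyOCRFixed:
    """Hypothetical stand-in; illustrates the idea, not PaddleOCR's code."""

    def __init__(self, page_num=0):
        # Assumption: 0 means "process every page of the document".
        self.page_num = page_num

    def ocr(self, pages):
        # Derive the limit per call and leave self.page_num untouched.
        limit = self.page_num
        if limit == 0 or limit > len(pages):
            limit = len(pages)
        return pages[:limit]


fixed = ToyOCRFixed(page_num=0)
fixed.ocr(["a1"])                        # single-page document first
result = fixed.ocr(["b1", "b2", "b3"])   # then a multi-page document
print(len(result))                       # prints: 3
```

Because the configured value is never mutated, one instance can be reused across documents of different lengths, which avoids the cost of re-creating the engine for every file.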
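The replies above have fully answered the question. If you have new questions, feel free to open an issue at any time, or keep replying under this issue.
I reproduced this problem too: after initializing PaddleOCR and feeding in a PDF file several times, sometimes only a limited number of pages are recognized.
I ran into this problem too: when multi-page PDFs are recognized one after another, only the first page is recognized.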
Please provide the following information to quickly locate the problem
Partial code:
Bug reproduction: first recognize a single-page PDF, then recognize a multi-page PDF. The multi-page PDF then has only its first page recognized.