I deployed the project following the README, but a JSON decoder error seems to occur. How can I fix it? #311

Closed
Simonabcddcba opened this issue Aug 3, 2024 · 2 comments
Labels
help wanted Extra attention is needed

Comments

@Simonabcddcba

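Description of the bug

magic-pdf, version 0.6.2b1. I downloaded the correct model files and added their path to the JSON file, and I also copied the JSON file to my user directory and renamed it. I don't know what is causing this error or how to fix it.

How to reproduce the bug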

(MinerU) E:\UsefulTool\MinerU-magic_pdf-0.6.2b1-released>magic-pdf pdf-command --pdf "D:\1.pdf" --inside_model true
2024-08-03 09:57:55.458 | ERROR | magic_pdf.cli.magicpdf:parse_doc:338 - Invalid control character at: line 6 column 29 (char 156)
Traceback (most recent call last):

File "C:\Users\Beluga.conda\envs\MinerU\lib\runpy.py", line 196, in _run_module_as_main
return run_code(code, main_globals, None,
│ │ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "C:\Users\Beluga.conda\envs\M...
│ └ <code object at 0x000001C184D37EC0, file "C:\Users\Beluga.conda\envs\MinerU\Scripts\magic-pdf.exe_main.py", lin...
└ <function _run_code at 0x000001C184D21480>

File "C:\Users\Beluga.conda\envs\MinerU\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
│ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "C:\Users\Beluga.conda\envs\M...
└ <code object at 0x000001C184D37EC0, file "C:\Users\Beluga.conda\envs\MinerU\Scripts\magic-pdf.exe_main.py", lin...

File "C:\Users\Beluga.conda\envs\MinerU\Scripts\magic-pdf.exe_main_.py", line 7, in

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\click\core.py", line 1157, in call
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function BaseCommand.main at 0x000001C185186680>

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\click\core.py", line 1078, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x000001C184D88FA0>
│ └ <function MultiCommand.invoke at 0x000001C185187640>

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\click\core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
│ │ │ │ └ <click.core.Context object at 0x000001C1DBAAB2E0>
│ │ │ └ <function Command.invoke at 0x000001C185187130>
│ │ └
│ └ <click.core.Context object at 0x000001C1DBAAB2E0>
└ <function MultiCommand.invoke.._process_result at 0x000001C184AA7D00>

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\click\core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'pdf': 'D:\1.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model_mode': 'full'}
│ │ │ │ └ <click.core.Context object at 0x000001C1DBAAB2E0>
│ │ │ └ <function pdf_command at 0x000001C1DBAD4B80>
│ │ └
│ └ <function Context.invoke at 0x000001C185185EA0>
└ <click.core.Context object at 0x000001C1DBAAB2E0>

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\click\core.py", line 783, in invoke
return __callback(*args, **kwargs)
│ └ {'pdf': 'D:\1.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model_mode': 'full'}
└ ()

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 352, in pdf_command
parse_doc(pdf)
│ └ 'D:\1.pdf'
└ <function pdf_command..parse_doc at 0x000001C1DBAD48B0>

File "C:\Users\Beluga.conda\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 328, in parse_doc
jso = json_parse.loads(get_model_json(model, doc_path))
│ │ │ │ └ 'D:\1.pdf'
│ │ │ └ None
│ │ └ <function pdf_command..get_model_json at 0x000001C1DBAD4670>
│ └ <function loads at 0x000001C184D9C820>
└ <module 'json' from 'C:\Users\Beluga.conda\envs\MinerU\lib\json\init.py'>

File "C:\Users\Beluga.conda\envs\MinerU\lib\json_init_.py", line 346, in loads
return _default_decoder.decode(s)
│ │ └ '{\n "bucket_info":{\n "bucket-name-1":["ak", "sk", "endpoint"],\n "bucket-name-2":["ak", "sk", "endpoint"]...
│ └ <function JSONDecoder.decode at 0x000001C184D9C0D0>
└ <json.decoder.JSONDecoder object at 0x000001C184D89000>

File "C:\Users\Beluga.conda\envs\MinerU\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
│ │ │ │ └ '{\n "bucket_info":{\n "bucket-name-1":["ak", "sk", "endpoint"],\n "bucket-name-2":["ak", "sk", "endpoint"]...
│ │ │ └ <built-in method match of re.Pattern object at 0x000001C184963510>
│ │ └ '{\n "bucket_info":{\n "bucket-name-1":["ak", "sk", "endpoint"],\n "bucket-name-2":["ak", "sk", "endpoint"]...
│ └ <function JSONDecoder.raw_decode at 0x000001C184D9C160>
└ <json.decoder.JSONDecoder object at 0x000001C184D89000>

File "C:\Users\Beluga.conda\envs\MinerU\lib\json\decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
│ │ │ └ 0
│ │ └ '{\n "bucket_info":{\n "bucket-name-1":["ak", "sk", "endpoint"],\n "bucket-name-2":["ak", "sk", "endpoint"]...
│ └ <_json.Scanner object at 0x000001C184D760E0>
└ <json.decoder.JSONDecoder object at 0x000001C184D89000>

json.decoder.JSONDecodeError: Invalid control character at: line 6 column 29 (char 156)
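
The line, column, and character offset in the last line of the traceback are also available as attributes on `json.JSONDecodeError`, which can help locate the offending character in a config file. The following is a minimal sketch, not part of magic-pdf: the helper name `locate_json_error` and the sample config text are hypothetical, and the sample only assumes that a literal control character (here a tab) ended up inside a string value.

```python
import json


def locate_json_error(text, context=20):
    """Return None if `text` is valid JSON, else a dict describing where parsing failed."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - context, 0)
        return {
            "message": err.msg,     # e.g. "Invalid control character at"
            "line": err.lineno,     # 1-based line number
            "column": err.colno,    # 1-based column number
            "char": err.pos,        # 0-based offset into the text
            "snippet": text[start:err.pos + context],
        }
    return None


# Hypothetical config text: the "\t" below is a real tab character inside the
# JSON string, which the strict decoder rejects as a control character.
bad_config = '{\n  "model-dir": "D:\tmodels"\n}'

info = locate_json_error(bad_config)
if info is not None:
    print(f'{info["message"]} line {info["line"]} column {info["column"]}')
    print(repr(info["snippet"]))
```

Printing the snippet with `repr` makes invisible characters such as tabs show up as `\t`, which is usually enough to see what went wrong at the reported position.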

Operating system

Windows

Python version

3.10

Software version (magic-pdf --version)

0.6.x

Device mode

cuda

@Simonabcddcba Simonabcddcba added the bug Something isn't working label Aug 3, 2024
@myhloli
Collaborator

myhloli commented Aug 3, 2024

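On Windows, this path must include the drive letter, and every "\" in the path must be replaced with "/". Otherwise the backslashes are treated as escape characters and the JSON file becomes syntactically invalid.

For example, if the models are stored in a models directory at the root of drive D, the value of model-dir should be "D:/models".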
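
The effect of backslashes can be checked with Python's standard `json` module. This is a small illustrative sketch, not code from magic-pdf: the key name `model-dir` is taken from the comment above, and the paths `D:\models` and `D:\tools` are hypothetical. It shows three cases: a raw Windows path that the decoder rejects, a raw path that parses but is silently altered because `\t` is a valid JSON escape, and two safe ways to write the value.

```python
import json
from pathlib import PureWindowsPath

windows_path = r"D:\models"  # hypothetical model location

# Case 1: backslashes pasted straight into the JSON text.
# "\m" is not a valid JSON escape, so the decoder raises JSONDecodeError.
broken_text = '{"model-dir": "' + windows_path + '"}'
try:
    json.loads(broken_text)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

# Case 2: some backslash sequences ARE valid escapes, so the file parses
# but the path changes: "\t" becomes a tab character.
altered = json.loads(r'{"model-dir": "D:\tools"}')["model-dir"]

# Fix A: use forward slashes, as suggested in the comment above.
posix_style = PureWindowsPath(windows_path).as_posix()
config = json.loads('{"model-dir": "' + posix_style + '"}')

# Fix B: let json.dumps write the file content, so backslashes are doubled.
escaped_text = json.dumps({"model-dir": windows_path})
round_trip = json.loads(escaped_text)

print(parsed_ok)                 # False
print(repr(altered))             # 'D:\tools' where \t is a tab character
print(config["model-dir"])       # D:/models
print(escaped_text)              # {"model-dir": "D:\\models"}
```

Forward slashes are the simpler choice when editing the file by hand, because nothing needs escaping. Generating the file with `json.dumps` is an alternative when the path comes from another program.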

@Simonabcddcba
Author

Thanks, that solved it.

@myhloli myhloli added help wanted Extra attention is needed and removed bug Something isn't working labels Aug 3, 2024
@myhloli myhloli closed this as completed Aug 3, 2024