Solve JSONDecodeError #799

Merged
merged 8 commits into from
Jan 28, 2024

@HuZixia (Contributor) commented Jan 26, 2024

Fixes the JSONDecodeError reported in issue #749.

To avoid JSONDecodeError:

Remove comments from the JSON string returned by the LLM. A comment follows a JSON value and may start with # or with //; it is treated as a comment only when it is not inside a string value.

Additionally, if you do not want JSONDecodeError to occur, you can add 'Delete comments in json' after FORMAT_CONSTRAINT in action_node.py

Features

The JSON returned by an LLM may contain comments that start with either # or //, and which one appears is unpredictable. These comments cause JSON parsing errors downstream that break the rest of the run.
The error occurs with both gpt-4-1106-preview and GLM-4.
These code changes are intended to fix this problem.
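
To make the failure mode concrete, here is a minimal sketch using only the standard library. The sample strings are invented for illustration and are not taken from issue #749; they only show that Python's json module rejects both comment styles.

```python
import json

# Hypothetical LLM outputs: valid JSON except for a trailing comment.
samples = {
    "hash": '{"key": "value"  # explanation\n}',
    "slashes": '{"key": "value"  // explanation\n}',
}

errors = {}
for name, text in samples.items():
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # Standard JSON has no comment syntax, so both samples end up here.
        errors[name] = exc.msg

print(sorted(errors))
```

Both samples fail to parse, which is why the comments have to be removed before the text reaches json.loads.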

Feature Docs

Influence

These code changes are intended to fix JSONDecodeError.

Result

See issue #749 for details.

JSON comments may start with #:
image

JSON comments may start with //:
image

So we need to fix repair_llm_raw_output.py.
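
As a rough illustration of the idea (not the code merged in this PR), the sketch below drops a trailing # or // comment from each line while leaving those characters alone inside string values. The names strip_line_comment and strip_json_comments are hypothetical, and the sketch assumes comments never span multiple lines.

```python
import json


def strip_line_comment(line: str) -> str:
    """Cut a trailing # or // comment that starts outside a JSON string."""
    in_string = False
    escaped = False
    for i, ch in enumerate(line):
        if in_string:
            if escaped:
                escaped = False          # this character was escaped; skip it
            elif ch == "\\":
                escaped = True           # next character is escaped
            elif ch == '"':
                in_string = False        # closing quote of the string value
            continue
        if ch == '"':
            in_string = True             # opening quote of a string
        elif ch == "#" or line.startswith("//", i):
            return line[:i].rstrip()     # comment starts here; drop the rest
    return line


def strip_json_comments(text: str) -> str:
    """Apply strip_line_comment to every line of the text."""
    return "\n".join(strip_line_comment(line) for line in text.split("\n"))


raw = '''{
    "name": "demo",  # trailing hash comment
    "url": "https://example.com/#anchor",  // trailing slash comment
    "count": 3
}'''

data = json.loads(strip_json_comments(raw))
print(data["url"])
```

The URL keeps both its // and its # because the scanner only treats those characters as comment starts when it is outside a quoted string.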

A second way to fix the JSONDecodeError:
Append the prompt "Delete comments in json" to FORMAT_CONSTRAINT in action_node.py, as follows:

JSONDecodeError raised without the prompt:
image

After adding "Delete comments in json", the problem is solved:
image

@@ -23,7 +23,10 @@
TAG = "CONTENT"

LANGUAGE_CONSTRAINT = "Language: Please use the same language as Human INPUT."
-FORMAT_CONSTRAINT = f"Format: output wrapped inside [{TAG}][/{TAG}] like format example, nothing else."
+FORMAT_CONSTRAINT = (f"Format: output wrapped inside [{TAG}][/{TAG}] like format example, nothing else. "
+f"Delete comments in json")
Collaborator

@better629 better629 Jan 27, 2024

Because ActionNode can generate either a JSON or a Markdown format example, we suggest not adding an explicit json keyword to the template string. Does the newly added code in repair_llm_xx solve the problem? If so, there may be no need to add it here.

Contributor Author

Well, I agree with you.

+FORMAT_CONSTRAINT = (f"Format: output wrapped inside [{TAG}][/{TAG}] like format example, nothing else. "
+f"Delete comments in json")
+# Delete comments in json
+# If you don't want JSONDecodeError to occur, you can add Delete comments in json after FORMAT_CONSTRAINT
Collaborator

There seems to be no need to add this explanation as extra lines.

@@ -198,6 +214,12 @@ def repair_invalid_json(output: str, error: str) -> str:
new_line = line.replace("}", "")
elif line.endswith("},") and output.endswith("},"):
new_line = line[:-1]
+# remove comments in output json str, after json value content, maybe start with #, maybe start with //
+elif rline[col_no] == "#" or rline[col_no] == "/":
Collaborator

Since you already remove comments in the repair pipeline in repair_json_format, there is no need to do it again here.

@@ -105,6 +105,23 @@ def judge_potential_json(routput: str, left_key: str) -> Union[str, None]:
return output


def remove_comments_from_line(line):
Collaborator

Please add a unit test case for // comments, similar to https://github.com/geekan/MetaGPT/blob/main/tests/metagpt/utils/test_repair_llm_raw_output.py#L131-L141

Contributor Author

OK
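
For reference, a test along the lines requested above might look like the sketch below. It is self-contained and does not import MetaGPT: remove_comments_from_line_sketch is a hypothetical regex-based stand-in, and the real remove_comments_from_line in repair_llm_raw_output.py may behave differently, so the cases would need adapting to the actual function.

```python
import re

# Match a complete double-quoted string first, otherwise a comment start.
# Only the comment alternative is captured in group 1.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|(#|//)')


def remove_comments_from_line_sketch(line: str) -> str:
    """Hypothetical stand-in: drop a trailing # or // comment outside strings."""
    for match in _TOKEN.finditer(line):
        if match.group(1):
            return line[: match.start()].rstrip()
    return line


def test_remove_double_slash_comments():
    cases = [
        ('"a": 1, // note', '"a": 1,'),
        ('"a": "http://x.y/z", // note', '"a": "http://x.y/z",'),
        ('"a": "no comment"', '"a": "no comment"'),
        ('"a": 2 # note', '"a": 2'),
    ]
    for line, expected in cases:
        assert remove_comments_from_line_sketch(line) == expected


test_remove_double_slash_comments()
```

The second case is the important one for //: a URL inside a string value must survive untouched while the trailing comment is removed.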

Contributor Author

I modified action_node.py and repair_llm_raw_output.py based on the feedback, and added tests in test_repair_llm_raw_output.py. Please have a look, thanks.

…tput.py, add code in test_repair_llm_raw_output.py
@better629
Collaborator

LGTM

Owner

@geekan geekan left a comment

LGTM

@geekan geekan merged commit ee0801a into geekan:main Jan 28, 2024
1 check failed
3 participants