
[Bug]: structured_predict returns error string during pydantic validation fail #17649

Closed
alexander-azizi-martin opened this issue Jan 27, 2025 · 6 comments · Fixed by #17663
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)


@alexander-azizi-martin
Contributor

Bug Description

When a pydantic validation error occurs, structured_predict simply returns the error as a string rather than raising it. The following example is for illustrative purposes: if the response from OpenAI does not contain exactly two elements in the people list, structured_predict returns a string rather than a PersonList object.

```python
from pydantic import BaseModel, Field
from llama_index.core import PromptTemplate
from llama_index.llms.openai import OpenAI


class PersonList(BaseModel):
    people: list[str] = Field(
        description="A list of names.",
        min_length=2,
        max_length=2,
    )


llm = OpenAI()
llm.structured_predict(
    PersonList,
    PromptTemplate("Extract the name of the two individuals listed: alexander, martin"),
)
```

This happens because the FunctionCallingProgram class casts the AgentChatResponse to the pydantic model without checking whether an error is associated with the ToolOutput:

```python
agent_response = self._llm.predict_and_call(
    [tool],
    chat_history=messages,
    verbose=self._verbose,
    allow_parallel_tool_calls=self._allow_parallel_tool_calls,
    **llm_kwargs,
)
return _parse_tool_outputs(
    agent_response,
    allow_parallel_tool_calls=self._allow_parallel_tool_calls,
)  # type: ignore
```

```python
def _parse_tool_outputs(
    agent_response: AgentChatResponse,
    allow_parallel_tool_calls: bool = False,
) -> Union[BaseModel, List[BaseModel]]:
    """Parse tool outputs."""
    outputs = [cast(BaseModel, s.raw_output) for s in agent_response.sources]
    if allow_parallel_tool_calls:
        return outputs
    else:
        if len(outputs) > 1:
            _logger.warning(
                "Multiple outputs found, returning first one. "
                "If you want to return all outputs, set output_multiple=True."
            )
        return outputs[0]
```

When the tool is called and an error occurs, the error is serialized and is_error is set to True on the ToolOutput, but that flag is never used to raise the error afterwards.

```python
def call_tool(tool: BaseTool, arguments: dict) -> ToolOutput:
    """Call a tool with arguments."""
    try:
        if (
            len(tool.metadata.get_parameters_dict()["properties"]) == 1
            and len(arguments) == 1
        ):
            try:
                single_arg = arguments[next(iter(arguments))]
                return tool(single_arg)
            except Exception:
                # some tools will REQUIRE kwargs, so try it
                return tool(**arguments)
        else:
            return tool(**arguments)
    except Exception as e:
        return ToolOutput(
            content="Encountered error: " + str(e),
            tool_name=tool.metadata.name,
            raw_input=arguments,
            raw_output=str(e),
            is_error=True,
        )
```
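The swallowing is easy to reproduce with a minimal stand-in (a simplified ToolOutput and a deliberately failing tool; these are not the real llama_index classes):

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolOutput:
    """Simplified stand-in for llama_index's ToolOutput."""
    content: str
    raw_output: Any
    is_error: bool = False


def failing_tool(**kwargs) -> Any:
    # Stands in for a tool whose pydantic validation fails.
    raise ValueError("people: List should have at least 2 items")


def call_tool(tool, arguments: dict) -> ToolOutput:
    try:
        return ToolOutput(content="ok", raw_output=tool(**arguments))
    except Exception as e:
        # The error is captured and stringified, never re-raised.
        return ToolOutput(
            content="Encountered error: " + str(e),
            raw_output=str(e),
            is_error=True,
        )


out = call_tool(failing_tool, {"people": ["alexander"]})
# out.raw_output is now a plain string, and nothing raises.
```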

This is unexpected behaviour. I'd assume that a pydantic validation error would be raised. Instead, for every output I have to check whether the returned value is a string or an instance of the pydantic class to know whether an error occurred.

I have faced this issue more with 4o-mini than with other OpenAI models, as it is not very good at consistently following specific instructions. So I'd often hit pydantic validation errors and be surprised when the error I get is something along the lines of `'str' object has no attribute 'people'`.
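Concretely, the caller-side workaround ends up looking something like this (PersonList here is a plain stand-in for the pydantic model, and ensure_parsed is a hypothetical helper name):

```python
class PersonList:
    """Plain stand-in for the pydantic model above."""

    def __init__(self, people):
        self.people = people


def ensure_parsed(result):
    # structured_predict may hand back the error string instead of the
    # model, so the caller has to type-check every result.
    if isinstance(result, str):
        raise ValueError(f"structured prediction failed: {result}")
    return result


ok = ensure_parsed(PersonList(["alexander", "martin"]))

try:
    ensure_parsed("Encountered error: 1 validation error for PersonList")
    raised = False
except ValueError:
    raised = True
```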

Further guidance on this would be appreciated. I can possibly fix this if it is something which needs fixing.

Version

0.12.12

Steps to Reproduce

Described above.

Relevant Logs/Tracebacks

@alexander-azizi-martin alexander-azizi-martin added bug Something isn't working triage Issue needs to be triaged/prioritized labels Jan 27, 2025

dosubot bot commented Jan 27, 2025

Hey @alexander-azizi-martin! I'm here to help you out with any bugs, questions, or contributions you have!

The issue you're encountering with the structured_predict method in the FunctionCallingProgram class is due to the way errors are handled in the ToolOutput class. When a Pydantic validation error occurs, the error is serialized and the is_error flag is set to True, but the error is not raised as an exception. Instead, the error is returned as a string, which requires you to manually check the type of the returned value to determine if an error occurred.

To address this, you could modify the error handling logic to raise exceptions for Pydantic validation errors instead of returning them as strings. This would involve checking the is_error flag in the ToolOutput and raising the appropriate exception when it is set to True.

This behavior is particularly problematic with the 4o-mini model, which may not consistently follow instructions, leading to frequent validation errors. Improving the error handling to raise exceptions would provide a more consistent and expected behavior when using this model or any other model that might produce validation errors.

If you're considering fixing this, you would need to adjust the code to ensure that validation errors are properly propagated as exceptions. This would likely involve changes in the FunctionCallingProgram class to check for the is_error flag and raise the error accordingly.
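A minimal sketch of that idea, using simplified stand-ins rather than the real llama_index classes:

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToolOutput:
    """Simplified stand-in for llama_index's ToolOutput."""
    raw_output: Any
    is_error: bool = False


def parse_tool_outputs(sources: List[ToolOutput]) -> List[Any]:
    # Possible fix: surface the first tool error instead of silently
    # returning its string form to the caller.
    for s in sources:
        if s.is_error:
            raise ValueError(str(s.raw_output))
    return [s.raw_output for s in sources]


good = parse_tool_outputs([ToolOutput(raw_output={"people": ["a", "b"]})])

try:
    parse_tool_outputs([ToolOutput(raw_output="validation error", is_error=True)])
    raised = False
except ValueError:
    raised = True
```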


@logan-markewich
Collaborator

I think the intention is that you (the user) check the output

The reason it is this way right now is that it makes it easier for outer components to handle (putting the output directly in some refine logic in a query engine or chat history in an agent)

@alexander-azizi-martin
Contributor Author

alexander-azizi-martin commented Jan 27, 2025

@logan-markewich Could a flag be added to the call in order to throw on error?

@logan-markewich
Collaborator

@alexander-azizi-martin that's definitely an option!
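A flag-based version might look roughly like this (raise_on_error is a hypothetical parameter name, and the classes are simplified stand-ins, not the real llama_index API):

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToolOutput:
    """Simplified stand-in for llama_index's ToolOutput."""
    raw_output: Any
    is_error: bool = False


def parse_tool_outputs(
    sources: List[ToolOutput],
    raise_on_error: bool = False,
) -> List[Any]:
    # Opt-in raising keeps the current behaviour as the default,
    # so existing callers are unaffected.
    if raise_on_error:
        for s in sources:
            if s.is_error:
                raise ValueError(str(s.raw_output))
    return [s.raw_output for s in sources]


bad = [ToolOutput(raw_output="validation error", is_error=True)]

# Default: the error string passes through unchanged.
default_result = parse_tool_outputs(bad)

# With the flag: the error is raised.
try:
    parse_tool_outputs(bad, raise_on_error=True)
    raised = False
except ValueError:
    raised = True
```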

@alexander-azizi-martin
Contributor Author

Ok will work on a PR for this then.
