Not returning all the required entities defined in the schema #113
Comments
What is it returning?
Instead of returning three entities, it returns two, for instance, or one.
Do you mean entity or attribute? Usually one would refer to the ExtractedValues as an entity, and its attributes (e.g., syndication_agent) as an attribute. Are you observing that you're getting instances of extracted values with some attributes missing? (e.g., for syndication_agent?)
Yes, some attributes (the way you define them), such as syndication_agent, are sometimes missing from the output. I'll put an example here shortly.
For instance, this is the ExtractedValues class:
class ExtractedValues(BaseModel):
This is print(ExtractedValues.schema()):
{'properties': {'agreement_date': {'description': 'What is the agreement_date?', 'title': 'agreement_date', 'type': 'string'},
                'agreement_name': {'description': 'What is the agreement_name?', 'title': 'agreement_name', 'type': 'string'},
                'governing_law': {'description': 'What is the governing_law?', 'title': 'governing_law', 'type': 'string'},
                'effective_date': {'description': 'What is the effective_date?', 'title': 'effective_date', 'type': 'string'},
                'termination_date': {'description': 'What is the termination_date ?', 'title': 'termination_date', 'type': 'string'}},
 'required': ['agreement_date', 'agreement_name', 'governing_law', 'effective_date', 'termination_date'],
 'title': 'ExtractedValues',
 'type': 'object'}
This is response = runnable.invoke({"text": text, "schema": ExtractedValues.schema()}):
{'data': [{'agreement_date': 'September 30, 2020', 'agreement_name': 'Restated Credit Facility Agreement', 'governing_law': 'not specified'}]}
Missing effective_date and termination_date in the response.
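The gap in the response above can be detected mechanically by comparing each returned record with the schema's 'required' list. The following is a minimal sketch using only the standard library; the helper name missing_required is hypothetical and not part of langchain-extract, and the schema is trimmed to the 'required' key for brevity.

```python
from typing import Any, Dict, List


def missing_required(schema: Dict[str, Any], record: Dict[str, Any]) -> List[str]:
    """Return the required schema keys that are absent from one extracted record."""
    return [key for key in schema.get("required", []) if key not in record]


# Trimmed copy of the schema and the response quoted in this comment.
schema = {
    "required": [
        "agreement_date",
        "agreement_name",
        "governing_law",
        "effective_date",
        "termination_date",
    ],
}
response = {
    "data": [
        {
            "agreement_date": "September 30, 2020",
            "agreement_name": "Restated Credit Facility Agreement",
            "governing_law": "not specified",
        }
    ]
}

for index, record in enumerate(response["data"]):
    # Prints: 0 ['effective_date', 'termination_date']
    print(index, missing_required(schema, record))
```

This only checks that keys are present; it does not check types or catch placeholder values such as 'not specified'.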
Hi @shima-khoshraftar, thanks for flagging this. I'm having trouble reproducing the case where the attributes are not present at all. Do you think you could create a minimal example? FWIW I am finding that OpenAI will occasionally return
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
openai_function_schema = {
    "type": "function",
    "function": {
        "name": "extractor",
        "description": "Extract information matching the given schema.",
        "parameters": {
            "type": "object",
            "properties": {
                "data": {
                    "type": "array",
                    "items": {
                        "title": "Person",
                        "type": "object",
                        "properties": {
                            "age": {
                                "title": "Age",
                                "description": "The age of the person in years.",
                                "type": "integer"
                            },
                            "name": {
                                "title": "Name",
                                "description": "The name of the person.",
                                "type": "string"
                            }
                        },
                        "required": [
                            "age",
                            "name"
                        ]
                    }
                }
            },
            "required": [
                "data"
            ]
        }
    }
}
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Please extract."),
        ("human", "Extract from the following text: {text}"),
    ]
)
model = ChatOpenAI(temperature=0, model_kwargs={"tools": [openai_function_schema]})
(prompt | model).invoke({"text": "My name is Chester."}).additional_kwargs["tool_calls"][0]["function"]["arguments"]
Hi @ccurme. Thanks for looking into this. The issue is that it does not happen every time, as is usual with LLMs. In fact, most of the time it works correctly. But because it can happen, the output needs some checking (maybe I can do that myself as a post-processing step). I'm not sure you would hit it even if I sent an example, but I will send one. Thanks.
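Because the failure is intermittent, one possible post-processing step is to re-invoke the extractor when a record comes back incomplete. The sketch below is an assumption about how that could look, not something langchain-extract provides: invoke_until_complete is a hypothetical helper, and it takes any zero-argument callable so it can wrap a call such as runnable.invoke(...). Retrying only helps if the model output varies between calls.

```python
from typing import Any, Callable, Dict, List


def invoke_until_complete(
    invoke: Callable[[], Dict[str, Any]],
    required: List[str],
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Call `invoke` until every record has all required keys, or attempts run out."""
    response: Dict[str, Any] = {}
    for _ in range(max_attempts):
        response = invoke()
        records = response.get("data", [])
        # An empty list counts as complete: there is nothing to check.
        if all(key in record for record in records for key in required):
            return response
    # Still incomplete after max_attempts; the caller decides what to do next.
    return response


# Hypothetical usage against the running service:
# response = invoke_until_complete(
#     lambda: runnable.invoke({"text": text, "schema": ExtractedValues.schema()}),
#     required=ExtractedValues.schema()["required"],
# )
```

Returning the last response instead of raising keeps the partial data available, at the cost of requiring the caller to check it again.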
You can definitely add a post-processing step on the client side using the pydantic schema that you defined! (Check the pydantic docs, but it shouldn't be too hard to validate client side.) You can likely mitigate some of the issues simply by providing examples. Examples tend to help a lot in improving performance!
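A minimal sketch of that client-side validation, assuming the five-field ExtractedValues model whose schema was printed earlier in the thread (the field definitions are reconstructed from that schema output, and validate_records is a hypothetical helper name). Constructing the model from each returned record raises ValidationError when a required attribute is missing, and exc.errors() reports which fields failed.

```python
from typing import Any, Dict, List, Tuple

from pydantic import BaseModel, Field, ValidationError


class ExtractedValues(BaseModel):
    # Reconstructed from the printed schema; all fields are required.
    agreement_date: str = Field(description="What is the agreement_date?")
    agreement_name: str = Field(description="What is the agreement_name?")
    governing_law: str = Field(description="What is the governing_law?")
    effective_date: str = Field(description="What is the effective_date?")
    termination_date: str = Field(description="What is the termination_date?")


def validate_records(
    records: List[Dict[str, Any]],
) -> Tuple[List[ExtractedValues], List[Tuple[int, List[str]]]]:
    """Split records into validated models and (index, failing field names) pairs."""
    valid: List[ExtractedValues] = []
    problems: List[Tuple[int, List[str]]] = []
    for index, record in enumerate(records):
        try:
            valid.append(ExtractedValues(**record))
        except ValidationError as exc:
            # Each error dict carries a "loc" tuple naming the offending field.
            failing = [str(err["loc"][0]) for err in exc.errors()]
            problems.append((index, failing))
    return valid, problems


# The response quoted earlier in the thread.
response = {
    "data": [
        {
            "agreement_date": "September 30, 2020",
            "agreement_name": "Restated Credit Facility Agreement",
            "governing_law": "not specified",
        }
    ]
}
valid, problems = validate_records(response["data"])
print(problems)
```

Unlike a plain key check, this also rejects values of the wrong type, so it covers more than missing attributes.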
Hi,
I faced an issue with langchain-extract. I defined a schema with some required entities (not optional), for instance:
from pydantic import BaseModel, Field
class ExtractedValues(BaseModel):
    syndication_agent: str = Field(description='who is the syndication agent?')
    agreement_date: str = Field(description='what is the agreement date?')
    administrative_agent: str = Field(description='what is the administrative agent?')
Then I ran the following lines (after launching the app):
from langserve import RemoteRunnable
runnable = RemoteRunnable("http://localhost:8000/extract_text/")
response = runnable.invoke({"text": text, "schema": ExtractedValues.schema()})
However, the response does not contain all the entities defined in ExtractedValues. Have you ever faced this issue? I am wondering if you can help me with that. Thanks.