v2.7.0b1 pydantic_core.from_json(..., allow_partial=True) truncates partial strings completely
#82
Comments
String parsing in JSON is extremely complex. We might be able to recover from some end of string situations, but it will require a lot of work. PR welcome to try, but I think this is an optimisation that isn't required for 2.7. I'm not even sure you'd always want partial strings to be included, for example:
Same for me:
    >>> from_json('{"message": "The **Chihuly Garden and Glass** is open at different times throughout the week. Here are', allow_partial=True)
    {}
Hmm, I see what you're saying, but ultimately I don't expect the parsed partial string to be "correct" until it's done streaming, and I want the partial string during streaming so that I can display the chunks as they stream. In particular, this leads to poor real-time behavior when working with LLM streaming, especially for really long values. For example, let's say I have the following model:

    from pydantic import BaseModel, Field

    class MyModel(BaseModel):
        description: str = Field(
            ...,
            description="A really long, verbose description about pydantic"
        )

If I try to stream this output, I will have to wait for the entire output to be completed before I actually get anything parsed, which defeats the purpose of partial parsing / streaming in this case.

If we detect an open string, can we not just parse it as a string as-is by closing it? Does this make it less complex? Or is this too simplified a way of handling things?

I can't really think of a case where I would want to allow partial parsing outside of streaming where the desired behavior would be as originally described; however, we could always expose an additional boolean to let the user decide whether they want the more or less strict partial parsing.

I'd be interested in working on a PR for this once we're aligned on the desired behavior.
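To make the "just close the open string" idea concrete, here is a minimal pure-Python sketch using only the standard library. It is not how jiter or pydantic-core implement partial parsing; `parse_partial` and `iter_partial` are hypothetical helper names introduced for illustration. It also shows why the simple approach is incomplete: a truncated `\uXXXX` escape or a key with no value still fails to parse, and the sketch simply skips those states.

```python
import json


def parse_partial(text: str):
    """Best-effort parse of truncated JSON (illustrative sketch only).

    Closes a string left open at the end of `text`, then closes any
    containers that are still open, and hands the result to json.loads.
    """
    stack = []          # closing brackets for containers still open
    in_string = False
    escaped = False     # True when the previous char was an unconsumed backslash
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()

    repaired = text
    if in_string:
        if escaped:
            repaired = repaired[:-1]  # drop a dangling backslash
        repaired += '"'
    repaired += "".join(reversed(stack))
    # Raises json.JSONDecodeError (a ValueError) for states this sketch
    # cannot repair, e.g. a key without a value or a cut-off \u escape.
    return json.loads(repaired)


def iter_partial(chunks):
    """Yield a parsed snapshot after each chunk, skipping unparseable states."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        try:
            yield parse_partial(buffer)
        except ValueError:
            continue
```

Feeding streamed chunks through `iter_partial` yields a growing `description` value after each chunk rather than nothing until the closing quote arrives, which is the behavior being asked for here.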
This should be fixed in #101. @willbakst, can you try it out and let me know if it solves your problem?
@samuelcolvin just took a look, the I'm assuming this will be a similar API change in
Yes, we'll probably avoid changing the name of the kwarg in pydantic-core to avoid breaking changes. In retrospect maybe it was a mistake therefore to change them in jiter.
Got it, makes sense. Mostly unaffected by this since I'll just update to use jiter directly
Yup, I think
Initial Checks
Description
In the below example, I would expect the author field to still populate into the model as it's being streamed. For example, consider a description field where it might be on the longer side. This will only give me the description once it's fully formed and closed, which somewhat defeats the purpose of it being partial here for streaming.
Example Code
Python, Pydantic & OS Version