Add support for arrays, enums and primitive types #5522

kzu · 2024-10-15T19:44:27Z

Structured outputs in OpenAI require the root schema to be of type `object`. By detecting this situation and always wrapping the result type in a `Payload<T>(T Data)` record, we significantly improve the developer experience by making the API transparent to that limitation (which might even be a temporary one?).

The approach works for both OpenAI and Azure Inference without native structured outputs. To signal a wrapped result to `ChatCompletion<T>`, we use the `AdditionalProperties` dictionary with a non-standard `$wrapped` property; the `$` prefix is a typical convention for JSON properties that are not intended for end-user consumption (like `$schema`).
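As a rough illustration of the unwrapping side, here is a minimal sketch that is not taken from the PR's diff. It assumes the model returns `{"data": <value>}` for wrapped types (the `data` property name follows the discussion later in this thread) and that the caller already knows whether the result was wrapped, for example from the `$wrapped` entry. `WrappedResult` and `Read` are hypothetical names.

```csharp
#nullable enable
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical helper for illustration only; not the PR's implementation.
public static class WrappedResult
{
    public static T? Read<T>(string json, bool wrapped)
    {
        if (!wrapped)
        {
            // Object-shaped types need no wrapper and deserialize directly.
            return JsonSerializer.Deserialize<T>(json);
        }

        // Wrapped payloads are assumed to look like {"data": <value>}.
        // The inner node is extracted with the low-level JsonNode API,
        // so no generic wrapper type has to be known to the serializer.
        JsonNode? inner = JsonNode.Parse(json)?["data"];
        return inner is null ? default : inner.Deserialize<T>();
    }
}
```

With this shape, `WrappedResult.Read<int>("{\"data\": 42}", wrapped: true)` yields the bare `int`, so the caller never sees the wrapper.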

NOTE: there will be significant conflicts with #5513, so I'll adjust after that merge, if this is deemed a useful addition 🙏

Fixes #5521



src/Libraries/Microsoft.Extensions.AI/ChatCompletion/ChatCompletion{T}.cs

eiriktsarpalis · 2024-10-16T09:43:05Z

src/Libraries/Microsoft.Extensions.AI/ChatCompletion/ChatClientStructuredOutputExtensions.cs

@@ -40,8 +40,7 @@ public static Task<ChatCompletion<T>> CompleteAsync<T>(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        bool? useNativeJsonSchema = null,
-        CancellationToken cancellationToken = default)
-        where T : class =>


Regardless of whether we take this change, I'd be in favor of removing this constraint since it isn't a good predictor of the JSON shape of the type. We can instead just fail at runtime depending on the value of the corresponding JsonTypeInfo.Kind property.

Rather than relying on the type system, since a source-generated serializer options would not be able to deal with it.
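The runtime check suggested above could look roughly like the sketch below. `StructuredOutputGuard`, `EnsureObjectShape` and `Person` are hypothetical names, and the exception type and message are assumptions; only `JsonTypeInfo.Kind` comes from the comment. `JsonSerializerOptions.GetTypeInfo` and `JsonTypeInfoKind` are System.Text.Json APIs available from .NET 7.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Serialization.Metadata;

// Hypothetical example type; serializes as a JSON object.
public sealed record Person(string Name);

// Hypothetical guard for illustration only.
public static class StructuredOutputGuard
{
    public static void EnsureObjectShape(Type type, JsonSerializerOptions options)
    {
        // Ask the serializer how it would treat the type, rather than
        // inferring the JSON shape from class/struct-ness.
        JsonTypeInfo typeInfo = options.GetTypeInfo(type);

        if (typeInfo.Kind != JsonTypeInfoKind.Object)
        {
            throw new NotSupportedException(
                $"Type '{type}' serializes as JSON kind '{typeInfo.Kind}', not as a JSON object. " +
                "Wrap it in a class or record with a named property.");
        }
    }
}
```

This shows why the `where T : class` constraint is a poor predictor: `string` and `int[]` both satisfy it, yet their kinds are `None` and `Enumerable` respectively, so both fail this check, while `Person` passes.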

kzu · 2024-10-16T17:36:39Z

Addressed feedback and improved code a bit.

SteveSandersonMS · 2024-10-16T18:00:28Z

Thanks for your work on this, @kzu!

I'm not yet certain whether or not we'd add this feature, as per the comment at #5521 (comment). However, I do appreciate you've been able to do this without expanding the built-in prompt. And in many ways the end result is similar to what happens if the developer supplies a wrapper class manually.

One of the key questions that would impact whether we want to build this in is how the reliability would compare to the developer providing their own wrapper class. A wrapper class has the advantage that its property is named semantically to reinforce what value is desired, whereas the auto-wrapper always calls the property `data`, which is less clear.
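To make the comparison concrete, a hand-written wrapper might look like the following sketch. `Sentiment` and `SentimentResult` are hypothetical names chosen for illustration; nothing here is taken from the library.

```csharp
#nullable enable
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical enum; serialized by name so the schema and output stay readable.
[JsonConverter(typeof(JsonStringEnumConverter))]
public enum Sentiment { Positive, Neutral, Negative }

// Hand-written wrapper: the property name itself tells the model which
// value is wanted, e.g. {"Sentiment":"Positive"}.
// An auto-generated wrapper would instead always produce {"data": ...}.
public sealed record SentimentResult(Sentiment Sentiment);
```

A caller would then request `SentimentResult` instead of the bare enum and read `.Sentiment` from the result.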

It might be that it works fine, but like you say in #5521, we're looking for a "pit of success" here and so if the best results come from having a semantically-named property, we'd want to lead developers to do that even if it means a few more lines of code.

When it comes to reliability, I'm not too worried about OpenAI models (GPT 3.5T and later) since they seem to behave pretty solidly with almost any structured-output case, even the models that don't support native structured output. It's much more relevant to benchmark the reliability of small models, such as the 7B/8B parameter Llama 2/3, Mistral, or phi3:medium. I found it pretty hard to get those to give valid structured output at all [1], since JSON schema is not a good way of describing the output format (examples are much better). Providing them with a JSON schema that's more abstract could add to the challenge.

If you have any quantification about the reliability of smaller models on tasks like "return a single enum value" or "return a single number" (which developers will often want to do) and how manual wrapping compares with autogenerating a wrapper, that could help with making a decision here. In the extreme, we might conclude that smaller models just aren't reliable for this in any case, which would lead to some other strategy around generating a JSON example instead of a JSON schema, and at that time loosen the rules about what value types you can request.

[1] Sidenote: it turns out that changing the prompt augmentation to use User messages instead of System messages makes the Ollama-based models much more compliant about structured output, so we should do that. But it's still hard for them.

kzu · 2024-10-16T18:05:45Z

Fair enough. May I request then that instead of an opaque result ("couldn't convert to schema"), the API instead detects this situation and errors with a meaningful error, stating that a wrapper IS REQUIRED otherwise things won't work? It's a super easy "bug" to hit and the current behavior isn't great.

kzu · 2024-10-16T18:09:28Z

On the semantic help you get when defining your own wrapper, I'd suggest you add this by default in the transform callback, since the description can add a lot of context for the model:

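          TransformSchemaNode = (context, node) =>
          {
              // Look up [Description] on the property being exported, if any.
              var description = context.PropertyInfo?.AttributeProvider?.GetCustomAttributes(typeof(DescriptionAttribute), false)
                  .OfType<DescriptionAttribute>()
                  .FirstOrDefault()?.Description;

              // Boolean schemas (true/false) are not JSON objects and cannot carry a description.
              if (description != null && node is JsonObject schema)
                  schema["description"] = description;

              return node;
          },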

Otherwise, you're just relying on the property names alone. Should I report that as a separate issue?

SteveSandersonMS · 2024-10-17T07:56:06Z

> Otherwise, you're just relying on the property names alone. Should I report that as a separate issue?

That would be great. Thanks!

stephentoub · 2024-10-22T14:08:17Z

@SteveSandersonMS, what would you like to do with this PR?

kzu · 2024-10-22T18:50:27Z

> > Otherwise, you're just relying on the property names alone. Should I report that as a separate issue?
>
> That would be great. Thanks!

I see this is already being done after the recent merge:

extensions/src/Libraries/Microsoft.Extensions.AI/Utilities/AIJsonUtilities.Schema.cs

Lines 241 to 245 in 424e974

    
    Type descAttrType = typeof(DescriptionAttribute);
    var descriptionAttribute =
        GetAttrs(descAttrType, ctx.PropertyInfo?.AttributeProvider)?.FirstOrDefault() ??
        GetAttrs(descAttrType, ctx.PropertyInfo?.AssociatedParameter?.AttributeProvider)?.FirstOrDefault() ??
        GetAttrs(descAttrType, ctx.TypeInfo.Type)?.FirstOrDefault();

💯

SteveSandersonMS · 2024-10-23T18:03:04Z

Closing so we can continue in #5560

kzu requested a review from a team as a code owner October 15, 2024 19:44

dotnet-policy-service bot assigned kzu Oct 15, 2024

stephentoub added the area-AI label Oct 15, 2024

eiriktsarpalis reviewed Oct 16, 2024


src/Libraries/Microsoft.Extensions.AI/ChatCompletion/ChatCompletion{T}.cs (outdated, resolved)

eiriktsarpalis reviewed Oct 16, 2024


kzu force-pushed the dev/alltypes branch from 3c37476 to 7871d4f Compare October 16, 2024 17:26

Use low level JSON API to manipulate the wrapper node

4db07a5

Rather than relying on the type system, since a source-generated serializer options would not be able to deal with it.

kzu force-pushed the dev/alltypes branch from 7871d4f to 4db07a5 Compare October 16, 2024 17:26

Simplify and centralize how we read/set the $wrapped value

90631df

kzu force-pushed the dev/alltypes branch from 2d0e853 to 90631df Compare October 16, 2024 17:35

SteveSandersonMS mentioned this pull request Oct 23, 2024

Structured output improvements (continuation of PR 5522) #5560

Merged

SteveSandersonMS closed this Oct 23, 2024

github-actions bot locked and limited conversation to collaborators Nov 23, 2024
