Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple Content-Type body contents are not handled #3377

Closed
andreaTP opened this issue Sep 26, 2023 · 8 comments · Fixed by #3453
Closed

Multiple Content-Type body contents are not handled #3377

andreaTP opened this issue Sep 26, 2023 · 8 comments · Fixed by #3453
Assignees
Labels
enhancement New feature or request generator Issues or improvements relater to generation capabilities. WIP
Milestone

Comments

@andreaTP
Copy link
Contributor

Description

When an OpenAPI description defines endpoints with > 1 content types for the body content, only the first application/jsonish entry is handled by the SDK.

Repro

Given a minimal OpenAPI such as:

openapi: 3.0.0
info:
  title: Test
  version: 1.0.0
  description: something
paths:
  /api:
    post:
      requestBody:
        content:
          "*/*":
            schema:
              type: number
          "application/create.extended+json":
            schema:
              type: array
              items:
                type: integer
          "application/json":
            schema:
              type: array
              items:
                type: number
          "application/xml":
            schema:
              type: string
          "text/plain":
            schema:
              type: array
              items:
                type: string
      responses:
        "200":
          description: "something"
          content:
            application/json: {}

We can observe that the only generated PostRequestInformation contains this line (in Java but similar in other languages):

requestInfo.setContentFromScalarCollection(requestAdapter, "application/create.extended+json", body.toArray(new Integer[0]));

Questions

Is it a deliberate choice? I'm failing at finding references in the docs.
Would it be possible to generate multiple toPostRequestInformation methods to cover all of the Content-Types handled by the server? Cannot find another issue referring to this limitation.
Do you have an idiomatic way to work around the current behavior other than fixing the RequestInformation object before sending it through the adapter?
Should the(current) implementation give precedence to application/json and, only if not found, apply cleaning?

Thoughts?

@EricWittmann
Copy link

EricWittmann commented Sep 26, 2023

In our specific use-case (Apicurio Registry REST API) there may be two separate issues:

  1. Operations with multiple media types
  2. Operations with */* as the media type

In our case we have a single operation with both of these things! Here are three examples in order of (1) and then (2) above, and finally a combination.

Multiple Media Types

openapi: 3.0.2
info:
    title: Multiple Input Types API
    version: 1.0.0
paths:
    /widgets:
        post:
            requestBody:
                content:
                    application/json:
                        schema:
                            $ref: '#/components/schemas/Widget'
                    application/vnd.widget.extended+json:
                        schema:
                            $ref: '#/components/schemas/ExtendedWidget'
                required: true
            responses:
                '201':
                    description: Widget successfully created.
            operationId: createWidget
            summary: Create a widget
            description: Creates a widget (either normal or extended).
components:
    schemas:
        Widget:
            type: object
            properties:
                id:
                    format: int32
                    type: integer
                name:
                    type: string
        ExtendedWidget:
            type: object
            properties:
                id:
                    type: string
                name:
                    type: string
                description:
                    type: string
                keywords:
                    type: array
                    items:
                        type: string

File Upload : */* Media Type

openapi: 3.0.2
info:
    title: File Upload API
    version: 1.0.0
paths:
    /files:
        post:
            requestBody:
                content:
                    '*/*':
                        schema:
                            $ref: '#/components/schemas/FileContent'
                required: true
            responses:
                '201':
                    description: Returned when a file is successfully uploaded.
            operationId: uploadFile
            summary: Upload a file
components:
    schemas:
        FileMetaData:
            type: object
            properties:
                uploadedOn:
                    format: date-time
                    type: string
                uploadedBy:
                    type: string
                size:
                    format: int32
                    type: integer
                contentType:
                    type: string
        FileContent:
            format: binary
            description: ''
            type: string

Putting Them Together

openapi: 3.0.2
info:
    title: All Together API
    version: 1.0.0
paths:
    /files:
        post:
            requestBody:
                content:
                    '*/*':
                        schema:
                            $ref: '#/components/schemas/FileContent'
                    application/vnd.file.extended+json:
                        schema:
                            $ref: '#/components/schemas/FileUpload'
                required: true
            responses:
                '201':
                    description: File successfully uploaded.
            operationId: uploadFile
            summary: Upload a file
            description: Upload a file.
components:
    schemas:
        FileContent:
            format: binary
            description: ''
            type: string
        FileUpload:
            type: object
            properties:
                name:
                    type: string
                content:
                    type: string

@baywet baywet self-assigned this Sep 26, 2023
@baywet baywet added enhancement New feature or request generator Issues or improvements relater to generation capabilities. Needs: Author Feedback labels Sep 26, 2023
@baywet baywet added this to the Kiota v3.0 milestone Sep 26, 2023
@baywet
Copy link
Member

baywet commented Sep 26, 2023

Thanks for bringing this up.
This is kind of related to #2317 for context.

But focusing on the content types here, I think there are multiple questions:

  • */* will be considered the same as application/octet-stream.
  • structured (xml/text/json/...) content types will be prioritized over non structured ones (stream, png, etc...) when both are present.
  • I don't think we have a prioritization rule for vendor vs non vendor schematized types today, it might actually depend on the order they appear in the document, or some random sorting.

With that context in mind the challenges we're trying to solve for in #2317 are:

  • How big the matrix can be (if you start having multiple success codes with different schema, different response content types and different request content types, you could end up with a dozen execution methods per operation quite easily, and that's not great).
  • Having sane naming conventions (inline type definitions, method names, etc...) than are not going to lead to collisions or breaking changes if an API suddenly supports another content types/status code.

We don't have immediate plans to solve for this now as this is limited to a small APIs subset as far as we've observed. But we're eager to read what you might propose.

@andreaTP
Copy link
Contributor Author

This is definitely related to #2317 , thanks for pointing it out, and it will be nice to hear from @darrelmiller as well.

I understand that those changes are going to impact the generated code size, and I don't have a good way to quantify it, we should probably experiment a bit here.

As per the encoding, I think that we should keep the current direct style on the "happy path" as it is today, one possibility would be to generate one single additional method that takes an extra parameter, being it the specific content type in a String form.
The selected content type can be looked up in a Dictionary to correctly invoke the appropriate "RequestInformation" builder function.
With an encoding like this, we can open up possibilities for optimizations by running some code deduplication procedure (note for later on of course).

But this is just "on top of my head", let's see what @EricWittmann comes up with 🙂

@darrelmiller
Copy link
Member

Request Body with */* as a content type

If the API Description declares an operation with a request body as */* then you should set the content-type in the request configuration otherwise we will send it as application/octet-stream. This is dependent on the fix that we discussed in issue #3378

Multiple Response media types

For scenarios where

  • an operation supports multiple response media types,
  • the schemas match
    we only want to send an accept header with media types that the client has serializers for and be able to express a preference of the client. We can identify the preferences by adding support for weights in the supportedMediaTypes input parameters. e.g.
  "structuredMimeTypes": [
    "application/json;q=0.9",
    "application/cbor;q=1"
  ],

If the requestConfiguration is not populated with an Accept header then the generated code should create an Accept header that is the intersection of the the client supported media types and the server declared media types for that operation. If there are no members of the intersection we should send the list of server declared content types for that operation in the accept header and we will treat the response as stream.

Multiple Request Media Types

If the operation:

  • declares multiple media types for the request body
  • the schema is the same

we need to identify the intersection of client supported media types and server declared media types for that operation. Once we have the intersection we should select the preferred media type and set the content-type header if it was not already set during request configuration.

This process is almost identical to what we do with setting the accept header, except we need to identify the most preferred media type as the value for the content-type.

Currently we don't identify a different set of structuredMediaTypes for request and response and therefore we currently don't have a way of expressing a different preference for requests vs responses.

If the client does not have any serializers that support the declared content types of the request body, then we have two scenarios to handle:

  • If the API operation declares just one content type, and the content-type header was not set in the requestConfiguration, then we will set the request content-Type to the value declared by the API operation and pass a stream body.
  • If the API operation declares two different media types, then we cannot default the Content-Type header. We can either generate a request executor method that has a required contentType parameter, or we can generate multiple requestExecutor methods, one for each requestMediaType. With either solution, a change to the API description that adds support for a new request body media type, would be a build time breaking change.

@andreaTP
Copy link
Contributor Author

I like @darrelmiller proposal very much:

  • calculate the intersection
  • pick the header from a weighted dictionary

What is still missing is how this will look from the user/code perspective.
Do we want to provide the media types to be selected explicitly?
Do we want to make an opinionated decision and only generate one encoding?
Having multiple functions overload with multiple possible content types? (I'm not sure how this will work for all of the langs)

@baywet
Copy link
Member

baywet commented Sep 28, 2023

@andreaTP fox context this is the outcome of a brainstorm we were having yesterday. We still have cases that are not described here: when the schemas are not the same. And we still have to solve how to detect schemas are the same or different. But I believe this first set of cases could be a good first stop as an improvement over the current implementation.

During our implementation we could assume schemas are the same, knowing some edge cases wouldn't work just yet, and solve for those later on.

To answer your questions:

Do we want to provide the media types to be selected explicitly?

Only in the case the intersection between structured types and request body described types yields an empty set, and we have multiple described types. In that case, the executor and the generator methods would have an additional parameter for the content type. In the similar case but for the response, the generated code will send application/octet-stream, people can override it (c.f. the other issue). Requesting people to provide it all the time would be additional friction for cases we have enough information for. The structured types is set by default in the appsettings of kiota, and can be changed through the relevant command line argument. (and we still have to find a place for that in the vscode extension)

Do we want to make an opinionated decision and only generate one encoding?

I'm not sure how encoding is relevant in this discussion? Did you mean content type? if so in which case?

Having multiple functions overload with multiple possible content types? (I'm not sure how this will work for all of the langs)

With the assumption of this scenario (the schema matches for all content types, or it's an un-schematized payload), we don't need to. For the Accept, the generated code can just set the intersection with the preference weight (as RFC 9110 plans for). For the Content-Type, same logic applies, only the generator select the most preferred one. The troubles come when the schemas diverge for different content types (we still have to solve for that). Also all the scenarios above make the assumption that if we have both a schematized content type and an unschematized content type for the same operation, the schematized type will take precedence and the unschematized one will be ignored.

@baywet
Copy link
Member

baywet commented Sep 28, 2023

Moved to 1.8 for now so we consider it during our next planning session. We'll probably create another issue for the less trivial cases where the schemas don't match once we have a design for that.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Kiota Oct 19, 2023
@andreaTP
Copy link
Contributor Author

Thanks a lot for taking this one on you @baywet ! Much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request generator Issues or improvements relater to generation capabilities. WIP
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants