More a query really: How big is too big for files in the SDK? #7934

Closed
lawrencegripper opened this issue Mar 19, 2020 · 4 comments
Labels
Data Factory Mgmt This issue is related to a management-plane library.

Comments

@lawrencegripper
Contributor

lawrencegripper commented Mar 19, 2020

Bug Report

Hey, this is a bit of a weird one and I'm not sure how to raise it or what, if anything, needs to be done. It felt a bit wrong to me so I wanted to check.

The models.go file for DataFactory is 8 MB and around 200k lines long.

https://github.com/Azure/azure-sdk-for-go/blob/090dc0ee4d8d2d60e2a9525774d967a4111a2b0c/services/datafactory/mgmt/2018-06-01/datafactory/models.go

  • What happened?

I went to the file trying to work out what properties I could use with the model. I couldn't practically find the thing I wanted to look at in such a large file. Granted, cloning the repo and using IDE features to navigate to the type I needed got me there, so it's not a world ender, more of a shock.

One thing I wanted to check is that this isn't some kind of bug in the code gen causing the file to bulk out. For example, there are 178 identical AsAzureTableSource blocks which all return nil, false, just on different structs. This pattern is repeated for a lot of As* funcs.

  • What did you expect or want to happen?

I'm not too sure, but I wasn't expecting an 8 MB, 200,000-line models.go file.

  • How can we reproduce it?

Open the file: https://github.com/Azure/azure-sdk-for-go/blob/090dc0ee4d8d2d60e2a9525774d967a4111a2b0c/services/datafactory/mgmt/2018-06-01/datafactory/models.go

  • Anything we should know about your environment.

Nope

@ArcturusZhang
Member

ArcturusZhang commented Mar 20, 2020

Thanks for the issue, @lawrencegripper!
Here are some points that may help answer your questions.

  1. The Go SDK (more specifically, the services folder) is generated from azure-rest-api-specs using the autorest tool, and the Go code is generated by its plugin, autorest.go.
  2. The models.go file stores the definitions of every model in the service's REST API specs. When a service grows larger, the models.go file grows along with it, so for large services the file is necessarily large. This is by design: the code generator puts all the model definitions in one file. You can take a look at the swagger that generates this gigantic monstrosity here; remember that a swagger file can reference others, so those files also count as part of this swagger.
  3. The odd part you spotted in models.go comes from the discriminators in the corresponding specs. The REST API specs use a discriminator to represent polymorphism, so that a given field can be returned as different types. Because Go does not support inheritance, the code generator first generates an interface (for instance BasicTabularSource) for each such polymorphic object, with several methods that act as type assertions, and then generates the structs that implement that interface. As it happens, TabularSource has 56 sub-types, and the Go code generator generates 57 type assertion functions for each of them, so we have 57 * 57 * 4 lines used for this interface. Other services usually do not use so many discriminators.
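
To make point 3 concrete, here is a minimal sketch of the pattern, cut down to three types. It is not copied from the generated models.go: the `Query` field and the `assertionLines` helper are hypothetical, and the real structs carry many more fields and methods. It only shows why every struct ends up with a block of `return nil, false` methods, and how the 57 * 57 * 4 estimate arises.

```go
package main

import "fmt"

// BasicTabularSource imitates the generated interface: one As* method per
// concrete type in the hierarchy. Only three types are shown here.
type BasicTabularSource interface {
	AsAzureTableSource() (*AzureTableSource, bool)
	AsSQLSource() (*SQLSource, bool)
	AsTabularSource() (*TabularSource, bool)
}

// Simplified stand-ins for the generated structs; the Query field is hypothetical.
type TabularSource struct{ Query string }
type AzureTableSource struct{ Query string }
type SQLSource struct{ Query string }

// Every struct must implement every As* method. Only the method matching
// the receiver's own type returns a value; all others return nil, false.
func (s TabularSource) AsAzureTableSource() (*AzureTableSource, bool) { return nil, false }
func (s TabularSource) AsSQLSource() (*SQLSource, bool)               { return nil, false }
func (s TabularSource) AsTabularSource() (*TabularSource, bool)       { return &s, true }

func (s AzureTableSource) AsAzureTableSource() (*AzureTableSource, bool) { return &s, true }
func (s AzureTableSource) AsSQLSource() (*SQLSource, bool)               { return nil, false }
func (s AzureTableSource) AsTabularSource() (*TabularSource, bool)       { return nil, false }

func (s SQLSource) AsAzureTableSource() (*AzureTableSource, bool) { return nil, false }
func (s SQLSource) AsSQLSource() (*SQLSource, bool)               { return &s, true }
func (s SQLSource) AsTabularSource() (*TabularSource, bool)       { return nil, false }

// assertionLines estimates the lines spent on As* methods: each of the
// (subTypes + 1) structs implements (subTypes + 1) methods.
func assertionLines(subTypes, linesPerMethod int) int {
	n := subTypes + 1
	return n * n * linesPerMethod
}

func main() {
	var src BasicTabularSource = AzureTableSource{Query: "PartitionKey eq 'a'"}
	if ats, ok := src.AsAzureTableSource(); ok {
		fmt.Println("AzureTableSource query:", ats.Query)
	}
	_, ok := src.AsSQLSource()
	fmt.Println("is SQLSource:", ok)
	fmt.Println("estimated lines:", assertionLines(56, 4))
}
```

With 56 sub-types and 4 lines per method, `assertionLines(56, 4)` comes to 12996 lines for this one interface, and almost all of them are the `return nil, false` form.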

Hi @jhendrixMSFT, do you have more information on this? I hope the track 2 generator can avoid this problem; the models.go file cannot even be opened on GitHub because it is too big.

@ArcturusZhang ArcturusZhang added the Mgmt This issue is related to a management-plane library. label Mar 20, 2020
@lawrencegripper
Contributor Author

@ArcturusZhang Thanks for taking a look at this one, that makes lots of sense to me and explains all the repeated blocks.

Fingers crossed the newer generator can avoid this (tho I'm not sure there is an easy answer to polymorphism in golang).

@jhendrixMSFT
Member

Agreed the current implementation for polymorphism is sub-optimal. We have a better design for handling polymorphic types in our track 2 code generator which will eliminate the bulk of this code. It's a pretty significant redesign, so no plans to back-port it to this version as it's a big breaking change.
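
The thread does not describe the track 2 design, so the following is not a rendering of it. Purely as an illustration of how Go can handle polymorphism without one As* method per type, here is a sketch using a marker interface and a type switch. Every name in it (`TabularSource` as an interface, `isTabularSource`, `describe`, the `Query` field) is hypothetical.

```go
package main

import "fmt"

// TabularSource is a hypothetical marker interface. A single unexported
// method replaces the full set of As* methods on every struct.
type TabularSource interface{ isTabularSource() }

// Simplified stand-ins for concrete source types; Query is hypothetical.
type AzureTableSource struct{ Query string }
type SQLSource struct{ Query string }

func (AzureTableSource) isTabularSource() {}
func (SQLSource) isTabularSource()        {}

// describe recovers the concrete type with a type switch, which the
// language provides without any generated helper methods.
func describe(src TabularSource) string {
	switch s := src.(type) {
	case AzureTableSource:
		return "azure table: " + s.Query
	case SQLSource:
		return "sql: " + s.Query
	default:
		return "unknown source"
	}
}

func main() {
	fmt.Println(describe(SQLSource{Query: "SELECT 1"}))
}
```

In this shape each concrete type adds one method, so the cost grows linearly with the number of sub-types, where the As* approach grows quadratically.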

@ArcturusZhang
Member

If we all agree, I will go ahead and close this issue. Please feel free to comment or reopen. Thanks!
