Feature | Split Azure dependent functionality in a separate NuGet Package #1108
Comments
Found my answer in #1010, so I will close this. It still seems very weird to me that a SQL client should have a dependency on a cloud infrastructure library. For hypothetical comparison, would it be OK to also take dependencies on Google and AWS libraries? |
@SimonCropp Microsoft.Data.SqlClient also provides connectivity to Azure cloud services (Azure SQL DB, Azure Synapse Analytics, etc) that are very similar to SQL Server on-premise. The Azure.Identity library provides authentication functionality for those Azure services. |
I was also wondering if there is a way to separate functionality into multiple DLLs, i.e.
With .Net 5, a project which uses this library and doesn't use Azure gets at least 3 new unused DLLs in the bin directory: |
i had the same expectation as @virzak |
@SimonCropp reopen then? :) |
@virzak I think I have raised my concerns. If the SQL client team think they should be addressed, then they can re-open the issue. |
We are considering this internally, timeline not certain yet. But we can reopen for sure! |
Agree with others. This library is supposed to be a replacement for the standard library already in the framework. Azure support is fine but shouldn't be auto-included in a "core" package. This would be equivalent to having the core framework depend upon Azure as well. If my app is using AWS or GCP then having an Azure dependency looks wrong. |
Agree, too. Mostly because of the dependencies Azure.Identity brings in. There are really many packages needed for Microsoft.Data.SqlClient |
FWIW, I also just wasted a bunch of time tracking down why these Azure assemblies were being included in my build output. I'm not using anything Azure related and so it made me worried some malicious package had snuck into my dependency graph. Took me a while to figure out that MDS was the culprit. It was annoying enough to me that I decided to switch back to SDS, which is probably not something y'all want to encourage. |
Agree on this as well. Azure support should be opt-in and a separate thing. Wasted a decent amount of time today again in trying to figure out why my solution that has nothing to do with azure was referencing azure packages. |
Just noticed this because of some versions conflicts on packages that seemed completely unrelated to SqlClient (OpenIdConnect). |
Agreeing on this. Comparing the dependencies of Microsoft.Data.SqlClient with those of the classic System.Data.SqlClient, the new library adds several more besides Azure that should not be needed in at least some scenarios:
The following also seem to be meaningful only for Windows:
I do not know how heavy those libraries are and whether their presence might stem from simple refactorings, but I guess their presence in a core package should be evaluated. |
As someone who does not know the internals of this project at all, I tried downloading the code and simply looking for those package names via CTRL+F in *.cs files. Here are some results:
Overall the "offending" packages seem quite well-encapsulated. |
I think I might see if I can make a local clone of sqlclient and see what I can do to open a PR to move all Azure dependencies into their own package where one must append This is because I also face this issue where I do not use Azure stuff at all (however I do use EF Core). As for win32.Registry, that package is supposed to come from the ref pack that is included by default for the default runtime that ships with the .NET SDK. Same for System.Security.Principal.Windows. I have a pull request that removes the packages for .NET Core/5 and 6+ applications; however, it is blocked for now until they can upgrade the CI to install the .NET 6 SDK everywhere so that the build of that pull request will pass. If I do split them, it would help eliminate even more dependencies that cause me pain, which would be good for all of us. Also note: when I do the split I will also upgrade it to the latest version of the Azure SDK because they recommend always keeping up with the latest stable release. As for .NET Framework, I am not sure if I should make any changes to it since I do not really care for .NET Framework at this point. |
Bad news, the Azure Enclave Provider is used in EnclaveDelegate.Crypto.cs which I do not know for sure if that file is only for Azure or not. This has made it a lot harder to separate it. Luckily it looks like TDS is some sort of remote specific stuff? Perhaps Azure specific? |
It looks like Azure cannot be split out because it's too deeply ingrained, needlessly, in the dependencies of basically everything in SqlClient. At least for now, it just does not seem possible. |
I tried taking a quick look at this and |
Yeah, I tried to split that into |
Alternatively I could rig it up to where it could look for enclaves providers registered with DependencyInjection however that would add another dependency and obtain the enclaves providers registered there everywhere they are needed (however I already use Dependency Injection significantly so it does not bother me). However it would bother someone who is not using Dependency Injection but is using SqlClient. |
Would the unused elements of this get stripped at compile time, with trimming options? |
The recent deprecation of System.Data.SqlClient (#2778) has re-raised this issue, and I think it makes it more important. If an organisation needed to be able to connect to SQL Server, they'd previously be able to use either S.D.S or Microsoft.Data.SqlClient. Although this was the preferred library, any organisation using vulnerability scanners would have discovered CVEs in the Azure libraries, found that this was pulled into the dependency chain via M.D.S, and had the option to fall back to S.D.S as another supported library. That choice will no longer exist - those organisations will have to choose between using a now-unsupported library, or facing an extra background hum of vulnerabilities which need to be accounted for. Even in a "happy path" of something connected to a local SQL Server via Windows authentication, there's still overhead of having to account for these CVEs, and we're relying on security/compliance teams being willing to accept the developers' word that it's not in use and being kept up-to-date if requirements change... In the absence of trim compatibility, Chisel is a good way to fix this - but it's got its own restrictions because it has no way to know the dependency chain of a class library's consumers. This can be a problem in applications which use Clean Architecture as a point of reference. It's always going to be difficult to split the packages now that they're in wider use, but the experience for developers migrating to M.D.S as a result of the S.D.S deprecation seems like it's going to be pretty poor too. The consensus in the final comments of #2247 seemed to be to have Microsoft.Data.SqlClient and Microsoft.Data.SqlClient.Azure. Could we start to release these packages to NuGet as part of the v6 release? I'm not sure whether we'd want the M.D.S.Azure and M.D.S packages to be identical, or whether it'd be better for M.D.S.Azure to start life as nothing more than a shell which references M.D.S. 
The goal would be to allow us to start recommending that migrators from System.Data.SqlClient which use Azure functionality should use Microsoft.Data.SqlClient.Azure, and migrators which don't should use Microsoft.Data.SqlClient. Developers would hopefully also start shifting to the appropriate package organically, making this slightly easier to tackle when we get to it in the future. On a tentative side point: if both packages were published and M.D.S.Azure had a different application name in the connection string, perhaps the Azure SQL team might be able to review logins with a matching application name and the appropriate authentication method, then notify the tenant owners of an upcoming breaking change... |
@edwardneal Very interesting proposal, that should pave the path for a removal of Azure dependencies in version 7! So what you are proposing is essentially a new package initially with the exact same binaries as M.D.S. today.
Curious, how would you implement this with a NuGet package? |
@ErikEJ That's correct, yes. I'm not completely sure whether the extra package should contain the same binaries as M.D.S, but the fact that the package exists means that the S.D.S migration guidance can be issued, developers can start to be informed about the need to switch packages, etc.
I'm thinking of a NuGet metapackage - so the package wouldn't contain any files, but would define a dependency on M.D.S (and possibly Azure.Identity.) One example of this type of package is SpecFlow.NUnit.Runners. We might not want it to consist strictly of dependencies though - I could see some value in having a public interface (perhaps a static method with an empty body) which developers are expected to call in order to register the Azure authentication/AlwaysEncrypted components. Later revisions of v6 could start to implement this as any required refactoring inside M.D.S is completed. Essentially, v6 would contain the public-facing interfaces needed to start asking developers to migrate and v7 could move the dependency itself. |
I would assume that adopting the .Azure package would not require any code changes for existing users. (Only a package switch) |
That's a fair assumption, and in that case I think it'd be pretty simple to set up the metapackage and then to populate it with the Azure-referencing binaries as a later release. My point of reference comes from packages like Npgsql.OpenTelemetry, which reference the underlying package and only directly contain the OTel-specific shims. I can see why we'd want to treat M.D.S differently because of its history though. |
Adds a new M.D.S.Azure package that Azure SQL Database users can be guided to use going forward. This package will in the future enable removal of Azure.Identity from the MDS package.
Just want to note that I was imagining Microsoft.Data.SqlClient.Azure as a plugin into SqlClient, so that users would reference both packages; this is in contrast to this plan, where M.D.S.Azure would be a copy-paste of M.D.S, with the Azure stuff added in. The problem with the "copy-paste" model is that it doesn't scale well; if there's some need for some additional functionality which also requires some other external dependency, you'd not be able to do this trick again. I'd recommend trying to think about exposing an authentication plugin API, which external packages (such as M.D.S.Azure) would be able to hook into in order to do their work. In the ideal design, this extensibility would be exposed on SqlDataSourceBuilder - and M.D.S.Azure would provide a simple extension method over SqlDataSourceBuilder to do the appropriate configuration - but some global/static way to register the plugin would also be necessary for anyone not using SqlDataSourceBuilder. Some might object that a plugin model is more of a breaking change: current users would have to change their program to add the opt-in. But ultimately you'll be asking users to change their package references anyway (from M.D.S to M.D.S.Azure) so I'm not sure that's a meaningful difference. |
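To make the two registration paths in the comment above concrete, here is a minimal, language-neutral sketch in Python. It is not SqlClient's API: `DataSourceBuilder`, `use_provider`, `register_global_provider` and `use_azure_auth` are all hypothetical names, and the "token" is a placeholder string. It only illustrates a builder-level opt-in alongside a global/static fallback.

```python
from typing import Callable, Dict, Optional

# Hypothetical: a provider takes a resource name and returns a token string.
TokenProvider = Callable[[str], str]

# Global/static registry, for callers that never touch a builder.
_global_providers: Dict[str, TokenProvider] = {}


def register_global_provider(method: str, provider: TokenProvider) -> None:
    """Register a provider process-wide (assumed fallback path)."""
    _global_providers[method] = provider


class DataSourceBuilder:
    """Stand-in for a builder such as SqlDataSourceBuilder (shape is assumed)."""

    def __init__(self, connection_string: str) -> None:
        self.connection_string = connection_string
        self._providers: Dict[str, TokenProvider] = {}

    def use_provider(self, method: str, provider: TokenProvider) -> "DataSourceBuilder":
        """Register a provider on this builder only; returns self for chaining."""
        self._providers[method] = provider
        return self

    def resolve(self, method: str) -> Optional[TokenProvider]:
        """Builder-level registrations take precedence over global ones."""
        return self._providers.get(method) or _global_providers.get(method)


def use_azure_auth(builder: DataSourceBuilder) -> DataSourceBuilder:
    """What a plugin package might expose: a single opt-in call (hypothetical)."""
    return builder.use_provider(
        "ActiveDirectoryDefault",
        lambda resource: f"token-for-{resource}",  # placeholder, not a real token
    )
```

In this shape the core package never imports anything Azure-specific; only the plugin's `use_azure_auth` would, which is the property the split is after.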
@roji I understand your thoughts, but I think the switch to .Azure should only require a package change. New external dependencies are implemented as plug ins today, like the KeyVault provider. |
I haven't thought this all the way through but it seems like the copy-paste approach is going to cause problems as well. Here's the scenario I wonder about. There is a package A that relies on MDSv1 today. A new MDSv2 package is created containing a copy of the binaries (not a metapackage). A project depends on package A and also, either directly or indirectly, takes a dependency on MDSv2. Now a project has packages that contain the same binary names but potentially different versions. Which one actually gets copied to the output and therefore loaded at runtime? AFAIK package ordering is undefined so there is no way to control which package wins. Since NuGet doesn't look at the assemblies within a package then it wouldn't see a conflict either. The obvious solution is to update package A to rely on MDSv2 but that isn't reliable since a package may not be updated anymore, might be a transitive dependency from another package you can't control, etc. Not sure if this is an actual issue and how to work around it. The alternative of a metapackage seems to solve that issue but we, where I work, don't recommend metapackages. They introduce more problems than they solve. Transitive security vulnerabilities are a common problem and metapackages that aren't updated cause warnings. The only workaround is to take a direct dependency on the package which defeats the purpose of metapackages. I think I heard there is some option you can set in your |
@mtaylorfsmb the metapackage is a point in time thing, it will eventually have its own binaries. |
@ErikEJ Correct me if I'm wrong but there will still come a day when you're back to the first scenario I mentioned. Code that relies on MDS.Azure (metapackage) which relies on MDSv1 will be updated to MDS.Azure.vNext (shipping the binaries directly). But projects that rely on MDSv1 transitively (for whatever reason) will suddenly have a conflict when they update MDS.Azure.vNext because 2 different packages now have the same binaries. It is just kicking the problem down the road, or maybe I don't understand how this is going to work ultimately. |
@roji a lot of this work has technically already been done. There's a public I thought a little about what that'd require, whether M.D.S should treat M.D.S.Azure uniquely because it was embedded inside the core library, etc. It's a bit of a moot point though - I didn't think it'd be ready in time for an early enough preview of M.D.S v6, which is where the idea around a metapackage and possibly an empty piece of public API surface comes from. This does all depend on whether we can make transitive dependencies work as they need to though. From checking NuGet's dependency resolution rules @mtaylorfsmb, I think there'd only be one problem to consider? We can safely expect that M.D.S.Azure wouldn't always be a metapackage, that during its time as a metapackage its version would be kept in lockstep with M.D.S, and it would explicitly pin its reference to exactly the same version of M.D.S. This would mean that we're not going to run into transitive security vulnerabilities from the metapackage - it'd be published from the same repository at the same time as M.D.S. The scenarios I have in mind for the dependencies are:
I'm not pleased with scenario 2. Even in that scenario though, if a developer migrates to M.D.S.Azure, sees this problem and adds another extra reference to M.D.S, they've still put themselves into a non-breaking position for v7.0. We'd need to document that, so they don't immediately revert back to referencing M.D.S exclusively. I could also see it being frustrating for downstream libraries referencing M.D.S. The really "interesting" point comes when we switch from a metapackage to something with real content. In scenario 2, a plugin route's actually got a pretty simple exit ramp: we'd bump M.D.S and M.D.S.Azure to v7, loosen M.D.S.Azure's dependencies to reference M.D.S >= v7, and the scenario becomes irrelevant. If there's a second copy of the M.D.S binaries in the M.D.S.Azure package, I'm not sure which binaries would be used, or what might happen in more complicated package dependency trees with multiple versions of M.D.S and M.D.S.Azure. If that made it through the compiler, I wouldn't be too surprised if we found that the application code was using M.D.S.Azure, but that the libraries referencing M.D.S were using M.D.S. While not a strong preference, I'd personally prefer the plugin approach in the long run because I think it'd make stepping M.D.S.Azure from a metapackage to a real package simpler. |
I also think a plugin approach is the correct approach in the long term but understand the timing pressures. Is there a way that we could perhaps implement the plugin method so that it only sets a variable? Then, when we pass in an Azure connection string, a warning is logged if that variable is not set. |
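The idea in the comment above (a registration call that only sets a flag, plus a warning when Azure authentication is requested without it) could look roughly like this. This is a language-neutral Python sketch, not SqlClient code: `register_azure_plugin`, `uses_azure_auth` and `open_connection` are hypothetical, and the connection-string check is deliberately crude (a real parser also handles quoting and keyword synonyms).

```python
import logging

logger = logging.getLogger("sqlclient.sketch")

# Hypothetical flag: set by the (for now otherwise empty) registration call.
_azure_plugin_registered = False


def register_azure_plugin() -> None:
    """Interim opt-in: records that the caller has adopted the new gesture."""
    global _azure_plugin_registered
    _azure_plugin_registered = True


def uses_azure_auth(connection_string: str) -> bool:
    """Crude check: does the Authentication keyword name an Active Directory method?"""
    for part in connection_string.split(";"):
        key, _, value = part.partition("=")
        if key.strip().lower() == "authentication":
            return value.strip().lower().startswith("active directory")
    return False


def open_connection(connection_string: str) -> bool:
    """Pretend to open a connection; returns True if a migration warning was logged."""
    if uses_azure_auth(connection_string) and not _azure_plugin_registered:
        logger.warning(
            "Azure authentication requested but the Azure plugin is not registered; "
            "this will become an error in a future major version."
        )
        return True
    return False
```

The behaviour stays non-breaking during the transition: existing applications keep working and only see a warning until they add the registration call.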
@ErikEJ @edwardneal to make sure we're talking about the same things, my comments below are about the proposal of (eventually) having a copy-paste, i.e. both M.D.S and M.D.S.Azure containing e.g. a SqlConnection type, and without M.D.S.Azure referencing M.D.S. In that scenario, what would e.g. the EF SQL Server provider reference? Either:
This is related to the difficulties @mtaylorfsmb pointed out above... People would very easily find themselves with both SqlClient packages somewhere in the dependency graph, and at that point you have two versions of SqlConnection and things simply become unmanageable. In contrast, if M.D.S.Azure is simply a plugin which only contains Azure-related auth stuff, and references M.D.S, then EF (and any other library) would simply reference M.D.S, and the user would decide whether to also reference M.D.S.Azure.
I understand and appreciate the desire to make this transition as painless as possible... But note that this whole issue (splitting Azure.Identity out of the base M.D.S) has been held back up to now because of (legitimate) backwards-compat concerns. Now that we have a general willingness to explore a breaking change here, I really hope we don't produce a problematic, sub-standard solution again for the same reasons... If we're already breaking (via the package reference), IMHO might as well do things really right even if it means requiring users to paste another line of code somewhere... |
@roji Got it! And then provide an actionable error message with a link to good docs when attempting to use Azure auth with the "base" package... |
That sounds sensible to me too. I've seen #2823 and it looks good to me, thanks for opening that. If the SqlClient team are happy with the design, it'd be good to publish it as part of the next preview of v6.0 so we can start the split.
Yes, exactly! As long as you provide crystal clear fail-fast behavior with an informative exception pointing to a good page explaining what to do (switch to M.D.S.Azure, add this in your code), I really think it's a very reasonable situation... When this is possible, it's the "best kind" of breaking change (compared to breaking changes where the behavior silently changes or whatever). Users will still complain (some will complain no matter what), but the majority will simply do the change quickly and move on. |
BTW if you do go with the plugin approach, I'm not sure there's value in actually publishing an M.D.S.Azure package (as in #2823) before it's fully implemented; that wouldn't be very useful, given that users will have to add the code gesture anyway when the work is fully done. So rather than telling users to switch to M.D.S.Azure today, and then add the code gesture tomorrow, you might as well just publish M.D.S.Azure tomorrow and tell them to do both at the same time? |
@roji The plugin approach makes most sense to me. Also, I would like to stay away from the re-branding of any SqlClient to SqlClient for Azure. This means that there is only one client with extensions/plugins to help light up the added scenarios in Az SQL DB/On prem, but apps / ORMs like EF only care about one package i.e. MDS, which remains the gateway into SQL Server. Of course, all this while solving the core problem at hand, i.e. Azure related dependencies should be optional for those who don't need them. This effort should look at
I propose that MDS vNext should remove the auth providers for at least Entra ID auth (AAD auth) out of the code. I had a chat with @David-Engel and he mentioned that there are dependencies on MSALException for retry of token fetching, and there are a few other capabilities for which the driver may have taken a dependency on the internals of the Auth providers. Hence we will have MDS and M.D.S.Azure.Authentication. MDS.Az.Auth depends on MDS (for the provider base classes) and Azure libraries. vNext Experience:
For those who don't care about Azure-related capabilities: just upgrade to MDS vNext. At this point, we still need to make MDS aware of the MDS.Az.Auth providers
And yes, we need to look into the behavior with SqlDataSource as well. This is a classic case of leveraging the data source builder, but I first want to focus on the declarative way of making this work, as well as on NetFx users. |
Though I have tagged @roji, I would like to invite comments from @edwardneal and @ErikEJ who have gone into the depths of this implementation, about the feasibility of the proposal. |
Thanks for this @saurabh500. I hadn't actually spotted the extra dependency on MSALException, but that's correct. For everyone's reference, this is largely in In terms of the vNext (and the v6) experience: looking solely at logistics, I'm a little concerned that we could end up telling developers to migrate from SDS to MDS, then to add another reference to MDS.Az.Auth six months later. My main reason for asking for the front-loading of the extra package was to be able to simplify that and immediately tell migrators that they should use MDS if they use on-premise servers and MDS.Az if they use Azure SQL (or other Azure components.) To lift that same motive to a plugin model: would there be enough time to provide a v6 MDS.Az.Auth package consisting of nothing but an empty API surface? This'd provide migrators from SDS with a single choice to make, and would give existing users of MDS a full major version to bring their dependencies into line. Splitting into two packages in time for vNext definitely sounds good to me. I think there are a few pieces of work I can help with, pending any comments. |
I think most developers at this point already use MDS, actually |
I don't see why all of this could not be done for vNext given there are resources to implement it. |
It seems feasible to me. Also I would like to eventually convert this issue to a discussion and open a new issue with the exact proposal in the description and link that to PR(s). We can divide and conquer this. I will reach out for help with some of the items. |
@ErikEJ I can't object to that statement but would like to qualify that this likely means developers who do in-house development instead of ISVs. For us, MDS is currently a no-go due to the dependencies and we cannot migrate our products from SDS to MDS until the dependency problem has been solved. |
@MichaelKetting +1 from my side. This dependency in combination with the necessary security updates is a pain, preventing us from migrating to MDS as well. Having software installed on-prem (not having a single SAAS instance) requires stability that is lacking there. |
why does Microsoft.Data.SqlClient v3 reference Azure.Identity?