Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Internal.Utilities project #11227

Closed
wants to merge 5 commits into from
Closed

add Internal.Utilities project #11227

wants to merge 5 commits into from

Conversation

oleksandr-didyk
Copy link
Contributor

Related issue: dotnet/source-build#3040

Adding an internal utilities library for re-used code

Copied in VersionDetailsParser from Darc and FileSystem from Arcade Services + will replace existing implementation with this library after the PR is accepted


public string? GetExtension(string? path) => Path.GetExtension(path);

public string PathCombine(string path1, string path2) => Path.Combine(path1, path2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these path operations really don't need to be abstracted. I'm not sure I see the point of adding a layer that's calling a pure logic method underneath. Sure, the things that actually DO file operations, yeah, but anything on "Path" should just be used directily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. We could remove all methods without file system side effects. Only risk is that it might cause failures in unit tests when run on Windows vs Linux.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we then remove them is the risk of test failure is not worth it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd check how many are if there are many and fix them right away, if it's not too many?

Copy link
Contributor Author

@oleksandr-didyk oleksandr-didyk Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 it seems, which should not be too many. Furthermore, these changes only copy in the code from Arcade Services, so in this particular PR we don't need to change a lot

Once we use the code in other places like Darc and Arcade it might be a bit more

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There won't be much in arcade-services. I only copied the class recently and am the only user there so it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@premun so I can remove all of the Path operations?

Copy link
Member

@premun premun Oct 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So.. the issue is that many tests just mock PathCombine to $"{first}/{second}" and the whole testing can than expect that we use UNIX style paths.

If we remove these, the tests might start working different way on Windows / Linux. We already have a tests in arcade-services and arcade (and I think some other repos) that don't run well on both Windows / Linux for this exact reason.
Example: https://github.com/premun/arcade-services/blob/prvysoky/testable-vmr/src/Microsoft.DotNet.Darc/tests/Microsoft.DotNet.DarcLib.Tests/VirtualMonoRepo/VmrUpdaterTests.cs

I would not say that it's problematic to keep these but I also think it's not the end of the world to fix the tests. However, we're talking about what - 4 lines of code here? Furthermore, I think it's nice if you can fully remove the System.IO using from classes as it means there's no real IO operations left.

But I don't have a strong opinion on this.

@ChadNedzlek
Copy link
Member

"Utilities" is a really vague name for an assembly. "SourceBuildInfo" and "DependnecyDetail" don't feel like utilities to me. Most of this code (other than the FileSystem stuff), feels like it's really source build stuff, so should probably be in an assembly dedicated to source build.

@premun
Copy link
Member

premun commented Oct 12, 2022

Most of this code (other than the FileSystem stuff), feels like it's really source build stuff

It's actually Maestro stuff around Version.Details.xml files. Code used by both source-build and DarcLib. That's roughly a TLDR why we need this library.

The naming should be a subject to discussion on this PR so it's good you're bringing this up. Utilities doesn't sound too off though for classes that are helpers around some Arcade/Maestro concepts.

@oleksandr-didyk
Copy link
Contributor Author

@ChadNedzlek would you have any examples of what you think the library should be named based on @premun 's answer?

@@ -15,6 +15,10 @@
<PackageReference Include="Newtonsoft.Json" Version="$(NewtonsoftJsonVersion)" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\..\Microsoft.DotNet.Internal.Utilities\Microsoft.DotNet.Internal.Utilities.csproj" />
Copy link
Member

@ChadNedzlek ChadNedzlek Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like every single project probably doesn't need a reference to VersionDetailsParser or MsBuildPropsFile. Magic dependencies can be confusing to track down.

@ChadNedzlek
Copy link
Member

It seems like if it's about virtual mono repo stuff, that should probably be reflected in the name. The FileSystem stuff could probably be in it's own assembly with something like "Microsoft.Internal.DotNet.FileSystem.Abstractions" or something like that. It feels distinctly different than the other stuff, in that's it's pretty much just a file system abstraction, without any real product identity/logic in it. The dependency stuff feels different enough to me that it should be in something else. Maybe Microsoft.Internal.DotNet.VirtualMonoRepo or something?


public interface IVersionDetailsParser
{
IList<DependencyDetail> ParseVersionDetailsXml(string fileContents, bool includePinned = true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this one be an extension method? It seems like everyone will always implement it the same way (load an XmlDocument, parse it, and then call the other overload)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods were extracted from internals of DarcLib which uses both of these. We will need to mock these eventually when testing the code there and that won't be possible with an extension method.

We can implement it on the interface but I would personally stay away from that feature.

Regardless, I don't think we will have more implementations than this one as this one is specifically here for easier re-use and the interface for testability. So I don't think it's a big issue because there won't be any other implementation doing the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you specifically need to mock the string one with something other than "make an XML document out of it"? You could easily mock the other method and the extension method would call it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But can you explain why? Why would you split logic belonging to this single class into 2 different classes?
Somehow you mind that there's two methods that people would have to implement (even though we don't expect more implementations) but adding a new static class (that's very hard to test) would be fine?

And testing this? You'd have to always provide a full XML document? (see my other comment explaining why I think that's not a good idea)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to provide a full document either way, the only implementation you'd be testing requires it. And if you are mocking stuff, you can just mock the single method that's real, rather than having to mock both (since you wouldn't know which the stuff was calling, or you'd have to change the tests a bunch if you wanted to use the other inside).

You could use a default interface method if you really were worried about having multiple classes, but to me (and to most of runtime) having "interface IThing" that has some helper methods on "static class Thing" is a pretty normal pattern, and reduces the burden of having to have every implementation implement more methods. It also reduces testing if we ever do create a second copy of the interface, because there would only be a single implementation of the extension method that needs to be tested, rather than two for every implementer.

Why is a static class any harder to test than the instance implementation of it? It would be easy, you'd make a Mock, and then call the extension method on it and make sure it produced the correct XML document. Right now it's MUCH harder to test the overload, since you can't actually isolate the "string" version of the method, so you have to duplicate all the tests for the XML tests to the string overload, and try to ensure they have the same behaviors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to provide a full document either way, the only implementation you'd be testing requires it.

Yes, but only once when testing this one class responsible for XML (so it's kind of understandable there will have to be some XML somewhere) not every time we test any class that depends on this!

And if you are mocking stuff, you can just mock the single method that's real, rather than having to mock both (since you wouldn't know which the stuff was calling, or you'd have to change the tests a bunch if you wanted to use the other inside).

Why would you need to mock both? The class that you're testing usually just calls one of these so you mock that one? So having the code here or in an extension + having 1 mock is the same amount of code?

Also if you check the code, what you propose means you would have to do then really crazy things instead of just mocking a very simple method? A simple example:
I have a method calling several times into this logic, I want to be able to return different sets of DependencyDetail.
I could easily setup a mock that does things based on the contents of fileContents rather than xpathing the XmlDocument to see what's inside..

You could use a default interface method

Yes, this is what I am proposing above as a different solution but it's a very nasty thing once inheritance comes into play as child class can't call into this method easily since it's not on the parent but on the iface. I heard even Roslyn guys say that this feature went sideways. Personally, I use it for pure functions only and this case is not a pure function.

Why is a static class any harder to test than the instance implementation of it?

Not the method itself but code dependent on this..

Right now it's MUCH harder to test the overload, since you can't actually isolate the "string" version of the method, so you have to duplicate all the tests for the XML tests to the string overload, and try to ensure they have the same behaviors.

The piece of code we're talking about is not going away with moving it into the extension. Having it here is not "MUCH harder" as it would also not go away. It's just MUCH harder to test code dependent on this.


if (dependency.Attributes is null)
{
throw new ParserException($"Dependencies cannot be top-level and must belong to a group such as {ProductDependencyElementName}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message doesn't seem right.

#nullable enable
namespace Microsoft.DotNet.Internal.Utilities;

public interface IVersionDetailsParser
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what other implementations of this class might exist. It's almost pure logic with nothing that would every need to be mocked or use some other weird dependency, and the XMLness is baked into the interface, so it's not like we could have a "JsonVersionDetailsParser" or anythign. Do we actually need an interface for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making DarcLib testable is a goal too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand how this interface increases testability. Any test would be easier to just pass a well formed XML document and use the real implementation, rather than have to mock the behavior of this interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are couple of reasons why I don't understand why you propose this:

  1. This class would have to be made testable so that it's not using real file system (and that would be quite hard) OR you would have to keep writing XMLs onto the file system during tests and that is a very error prone, slow and quite lengthy operation.
  2. I think it's much easier to mock strong-typed declaration of the DependencyDetail class rather than adding files / interpolated XML multi-line strings in your tests.
  3. This class abstracts away the work with the XMLs (one of principles of OOP - encapsulation). If the XML changes (attribute names, how we organize source-build tags...), or we start using props files instead or whatever... you won't have to go and change manually XMLs in all tests. You will keep using the DependencyDetail classes and not worry about implementation details.
  4. We already use DependencyDetail mocks in many arcade-services tests and we have factories for it. E.g. https://github.com/dotnet/arcade-services/blob/3a8b32b60e6030e921756dd7ae8865d72fbc8898/src/Microsoft.DotNet.Darc/tests/Microsoft.DotNet.Darc.Tests/DependencyCoherencyTests.cs#L1445-L1460 You can also see how much XML you would have to prepare for a test like this. You would end up creating a factory for the XML to which you would have to provide everything that you provide to the DependencyDetail class.

I really don't understand where you're coming from but I guess you must have an angle that I don't see. Can you help me understand why you propose this? (same with the extension method).

@premun
Copy link
Member

premun commented Oct 27, 2022

It seems like if it's about virtual mono repo stuff

It's not. This class is used by:

  • Maestro (DarcLib) when working with repo dependencies
  • Source-Build (which will be in Arcade)
  • dotnet/installer (tarball generation)
  • VMR (in installer and DarcLib)

I'd say it's pretty generic and that makes it a good candidate as we need to share this between 3 repos.

The FileSystem stuff could probably be in it's own assembly with something like "Microsoft.Internal.DotNet.FileSystem.Abstractions" or something like that. It feels distinctly different than the other stuff, in that's it's pretty much just a file system abstraction, without any real product identity/logic in it.

Sure, but I guess there will be more stuff like this that we already share between repos (I remember @missymessa and @michellemcdaniel might have some tips for things that they mentioned they copy around).

@ChadNedzlek
Copy link
Member

Sure, but I guess there will be more stuff like this that we already share between repos

We shouldn't create one giant "all the random stuff we want to share assembly". If we end up with more stuff related to this versiony/dependency flowy stuff, it can go in whatever assembly this is. If it's some other code, unrelated to that, then it should go in its own assembly.

My understanding was that all four of those things is VMR, so maybe my understanding is wrong. Maybe "DependencyManagement" or something else, but it seems like some sort of concept should be able to be wrapped around it.

@premun
Copy link
Member

premun commented Oct 31, 2022

I like Microsoft.DotNet.DependencyManagement or Microsoft.DotNet.DependencyManagement.Utilities

@oleksandr-didyk
Copy link
Contributor Author

Hey @ChadNedzlek and @premun, just to outline a few things so that this PR can get merged:

  • This PR is just moving code from Arcade Services to Arcade, so I would be strongly against introducing any new changes to the existing logic. I would like to merge this PR, replace existing usage of moved code in Arcade Services with this library and afterwards we can return to discussions started here
  • I will of course comply with all style changes requested by Chad
  • The library name can be changed to the proposed Microsoft.DotNet.DependencyManagement if its A-OK for everyone

Please let me know if this is acceptable to you. Thanks!

@missymessa
Copy link
Member

Is there a reason this code is being moved to Arcade? Is something in Arcade going to be consuming this code now?

@oleksandr-didyk
Copy link
Contributor Author

I think Arcade was suggested as the best place to put the code in question so it can be re-used throughout multiple repos, including arcade services (as the repos in question would get updates through the dependency flow)

@missymessa
Copy link
Member

Ah, I see this PR is related to the thread that was started last month.

Here's my opinion (and it's just an opinion, so take it or leave it :D): Functionality in Arcade should be used by the product repos. If that's the case here (and based on the list Premek provided, that looks like it may be the case), if there are things in Arcade that will consume this code, can we also include that refactor in this PR? If the things that will consume the code changes are in consuming repos, can we make sure that we have issues to handle that refactor?

My concerns with this work are 1) we may be moving functionality that is only used by dnceng into Arcade, which is shared infra for product repos (as noted by Premek, it sounds like there are product repos that would use this functionality, so this may be a moot concern), and 2) this code will be orphaned if nothing consumes it (just want to make sure that once this code is migrated that it will be used by repos that aren't owned by dnceng).

@premun
Copy link
Member

premun commented Nov 2, 2022

I think Arcade was suggested as the best place to put the code in question so it can be re-used throughout multiple repos, including arcade services (as the repos in question would get updates through the dependency flow)

@oleksandr-didyk isn't also one of the goals to use this logic directly in Arcade for the PvP stuff? This would be used then in every repo through the SourceBuild leg?

@missymessa additionally, this code will be used by dotnet/arcade-services and potentially dotnet/installer.

@missymessa
Copy link
Member

arcade-services is one of our team's projects which is where this code is originating from, so the fact that this code will once again be used by arcade-services isn't a selling point, in my opinion, to move it to another repo, especially once that was created to support common infra for the product repos.

@premun
Copy link
Member

premun commented Nov 2, 2022

@missymessa as noted above - I think it will be used for SourceBuild legs which will be running Arcade YAML templates in every product repo (and the SB tooling is in arcade) so arcade needs this and so the other repos. I think this is the reason we started all of this (Alex needs these classes from DarcLib in Arcade).

I think it's better to have this code in arcade and flow it to arcade-services as opposed the other way because arcade-services is .NET 6.0 so it would have to do something special just to build 7.0/8.0 things. It's better to multi-target 6/8 from Arcade.

@oleksandr-didyk
Copy link
Contributor Author

Closing due to the PR being quite outdated + new re-org

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants