Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up loading of source code #323

Merged
merged 1 commit into from
Aug 13, 2020
Merged

Conversation

Helcaraxan
Copy link
Contributor

@Helcaraxan Helcaraxan commented Aug 5, 2020

The current approach of loading the source code of targeted packages has some inefficiencies that may induce considerable slowdowns and unnecessary lengthy runs when using mockery on larger codebases. Most notably it always defaults to loading individual files, even when specifying an entire package as target.

This PR speeds up the processing by rewiring the Parse and Walk methods to accept generic target patterns instead of only directory and file paths. This removes the need for the CLI tool to invoke package.Load() to determine the underlying files when given a package pattern and also means that all the source code is loaded in a single call to package.Load() instead of one call per source file in the targeted package(s).

A secondary advantage of this rewiring is that the Walker struct and it's API now also accept a package pattern in its BaseDir field making it more user friendly for users invoking the internals of mockery themselves instead of using the command-line tool as they benefit from the same speed-up (I am one of those).

There are no breaking changes in either the API or the semantics of the CLI tool and its implementation packages.

The PR also contains a few drive-by fixes:

  • Walker.doWalk now properly reports errors.
  • The removal of an unused test utility function.
  • The fixing of tests that were silently failing due to Walker.doWalk not properly reporting errors.
  • The deprecation of the Walker.LimitOne field as it has no longer any speed-up benefit when set internally by the CLI side.

Some anecdotal testing on a large private codebase that generates 100s of mocks from ~150 packages by calling Walker.Walk via internal tooling shows a runtime reduction of 65+% from 180 seconds down to 50-60.

@LandonTClipp
Copy link
Collaborator

Fantastic work, I will review this more in-depth over the coming days when I get some free time. Let's get this merged.

pkg/parse.go Outdated Show resolved Hide resolved
// A file should be passed directly to the package loader as a 'file' query.
if filepath.Ext(pattern) != ".go" {
return fmt.Errorf("specified file %q cannot be parsed as it is not a source file", pattern)
} else if strings.HasPrefix(info.Name(), ".") || strings.HasPrefix(info.Name(), "_") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not different behavior? I may be wrong but I dont' see anywhere where mockery would have skipped .* or _* files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some existing logic that I took from Walker and moved here. I am not partial to keeping or removing it, I just moved to keep the existing semantics. 🙂

Technically it would be a breaking change to remove this.

pkg/parse.go Outdated Show resolved Hide resolved
pkg/parse.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@LandonTClipp LandonTClipp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but made a few requests for change.

Copy link
Contributor Author

@Helcaraxan Helcaraxan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick review. I addressed / responded to your comments. 🙂

As a side-note, and not something for this specific PR: with the changes I made there's no real use-case any more as far as I'm seeing for allowing people to specify targets as directories in addition to package patterns. Package patterns are strictly more expressive than directories (in the constraint of referencing Go source code), without missing out on any edge-cases that are not unsafe by default.

To explicit that last statement: one could showcase the situation of two Go modules in the same repository, one at the root (A) and one inside it (A/B) and the desire when running at the root to generate a mock in A/mock within module A, for an interface in A/B. It would simply be a case of invoking mockery -dir=./B [...]. However this would be allowing someone to shoot themselves in the foot as the source code in that directory is not necessarily the one used by A at all. Instead A might depend on an entirely different version of A/B with different interfaces instead of using a relative replace statement in it's go.mod.

It's IMO better to require people to only specify package patterns (which may still use relative references inside the same module like ./C/... from the root of A for A/C/...) as it will guarantee that Go's native dependency resolution is respected.

Secondary point: it would considerably simplify the logic that I just rewired. 😉

This would obviously be a breaking change from both an API perspective as well as the CLI point-of view so would need to wait for a v3.

// A file should be passed directly to the package loader as a 'file' query.
if filepath.Ext(pattern) != ".go" {
return fmt.Errorf("specified file %q cannot be parsed as it is not a source file", pattern)
} else if strings.HasPrefix(info.Name(), ".") || strings.HasPrefix(info.Name(), "_") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some existing logic that I took from Walker and moved here. I am not partial to keeping or removing it, I just moved to keep the existing semantics. 🙂

Technically it would be a breaking change to remove this.

@Helcaraxan
Copy link
Contributor Author

Quick and friendly ping on this @LandonTClipp. No worries if you are too busy as I very much understand that other things may take priority. Just did not want it to slip of your radar. 🙂

@LandonTClipp
Copy link
Collaborator

Thanks for the ping, I will not let it slip! Life got in the way (car transmission got totaled, big fun). Also extra busy this week but I will be sure to come back.

@LandonTClipp
Copy link
Collaborator

There have been other places where people suggested moving towards a package-specify model instead of file-specify model, and I'm in total agreement. It's much better to leverage the code already given in package.Load. You are right, however, that it would be a breaking change, which is why we need to wait for v3.

https://github.com/vektra/mockery/projects/3

I'm still in the process of collecting requirements for v3, and also want to finish/incorporate my other project into v3 https://github.com/chigopher/pathlib.

@LandonTClipp LandonTClipp merged commit 6d9ebaa into vektra:master Aug 13, 2020
@LandonTClipp
Copy link
Collaborator

LandonTClipp commented Aug 13, 2020

Unfortunately it looks like this has a major bug in it:

12 Aug 20 22:07 CDT DBG Loading files in directory ".git/objects/0f". dry-run=false version=0.0.0-dev
12 Aug 20 22:07 CDT DBG parsing dry-run=false version=0.0.0-dev
^C
[ltclipp@landon-virtualbox pathlib]$ 

It wants to iterate over dot-dirs, so we have a regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants