Significant performance degradation #197

Closed
spenserblack opened this issue Sep 26, 2023 · 1 comment · Fixed by #200
spenserblack commented Sep 26, 2023

Turns out gengo is once again much slower. Given the changes in #191, the most likely culprit is the Git source. Originally, blobs were both read and analyzed in parallel. Now they're read sequentially and only analyzed in parallel.

Edit: The biggest time-consumer seems to be calling file_source.overrides 🤔

Ideas

  • Right now a FileSource returns an iterator over tuples of filenames and file contents. Perhaps the iterator should only yield filenames, and the source should provide a get_contents method that takes a filename. That could be easier to parallelize and could boost performance.
  • Similar to the above, but with filename() and contents() methods that take Self::Iter::Item and return the filename and contents, respectively.
  • Go with the other idea for implementing multiple sources, where each source would receive the analyze_blob function and return the results.
    • Implement an overrides method
    • analyze_blob calls self.file_source.overrides
    • analyze passes analyze_blob to self.file_sources.handle
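
The first idea can be sketched roughly as follows. This is only an illustration, not gengo's actual API: the trait shape, MemorySource, the byte-counting analyze_blob, and the workers parameter are all assumptions made up for the example, and it uses std::thread::scope rather than whatever parallelism gengo really uses. The point is that listing filenames stays cheap and sequential, while both reading and analyzing happen inside the worker threads.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical trait: list filenames cheaply, load contents on demand.
/// `Sync` lets worker threads share a reference to the source.
trait FileSource: Sync {
    fn filenames(&self) -> Vec<String>;
    fn get_contents(&self, filename: &str) -> Option<Vec<u8>>;
}

/// Toy in-memory source standing in for a Git-backed one.
struct MemorySource {
    files: HashMap<String, Vec<u8>>,
}

impl FileSource for MemorySource {
    fn filenames(&self) -> Vec<String> {
        let mut names: Vec<String> = self.files.keys().cloned().collect();
        names.sort();
        names
    }

    fn get_contents(&self, filename: &str) -> Option<Vec<u8>> {
        self.files.get(filename).cloned()
    }
}

/// Placeholder analysis: just reports the blob size in bytes.
fn analyze_blob(contents: &[u8]) -> usize {
    contents.len()
}

/// Splits the filenames into chunks and reads + analyzes each chunk
/// on its own scoped thread.
fn analyze<S: FileSource>(source: &S, workers: usize) -> HashMap<String, usize> {
    let names = source.filenames();
    let workers = workers.max(1);
    // Ceiling division, and never 0 because `chunks(0)` panics.
    let chunk_size = ((names.len() + workers - 1) / workers).max(1);
    let mut results = HashMap::new();

    thread::scope(|scope| {
        let handles: Vec<_> = names
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|name| {
                            // The read happens here, inside the worker.
                            let contents = source.get_contents(name)?;
                            Some((name.clone(), analyze_blob(&contents)))
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        for handle in handles {
            results.extend(handle.join().expect("worker thread panicked"));
        }
    });

    results
}

fn main() {
    let mut files = HashMap::new();
    files.insert("src/main.rs".to_string(), b"fn main() {}".to_vec());
    files.insert("README.md".to_string(), b"# gengo".to_vec());

    let source = MemorySource { files };
    let sizes = analyze(&source, 2);
    println!("{sizes:?}");
}
```

Whether this actually helps depends on how expensive get_contents is for the real Git source and whether the underlying repository handle can be shared across threads, which this sketch does not address.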
@github-project-automation github-project-automation bot moved this to Triage in Gengo Sep 26, 2023
@spenserblack spenserblack moved this from Triage to Todo in Gengo Sep 26, 2023
@spenserblack spenserblack self-assigned this Sep 26, 2023
@spenserblack spenserblack moved this from Todo to In Progress in Gengo Sep 26, 2023
@spenserblack spenserblack added this to the v1 milestone Sep 26, 2023
@spenserblack

Bisect (predictably) points to e84c746

spenserblack added a commit that referenced this issue Sep 27, 2023
This drastically improves performance by no longer cloning the index
state, which only needs to be set once.

Resolves #197
spenserblack added a commit that referenced this issue Sep 27, 2023
This drastically improves performance by no longer cloning the entire
repository index state.

Resolves #197
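
For readers unfamiliar with this kind of regression, here is a minimal sketch of the general pattern the commit message describes: deep-copying a large piece of state once per blob versus borrowing it. It is not the code from the fix. IndexState, is_overridden, and the CLONES counter are hypothetical names invented for the illustration, and the real index state in gengo comes from its Git backend.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts deep copies so the difference is observable (illustrative only).
static CLONES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a repository's index state.
struct IndexState {
    entries: Vec<String>,
}

impl Clone for IndexState {
    fn clone(&self) -> Self {
        CLONES.fetch_add(1, Ordering::Relaxed);
        IndexState {
            entries: self.entries.clone(),
        }
    }
}

/// Hypothetical override lookup that only needs read access to the state.
fn is_overridden(state: &IndexState, path: &str) -> bool {
    state.entries.iter().any(|entry| entry.as_str() == path)
}

/// Slow pattern: one deep copy of the whole state per path.
fn count_overrides_cloning(state: &IndexState, paths: &[&str]) -> usize {
    paths
        .iter()
        .filter(|path| {
            let local = state.clone();
            is_overridden(&local, path)
        })
        .count()
}

/// Fast pattern: the state is set up once and only borrowed afterwards.
fn count_overrides_shared(state: &IndexState, paths: &[&str]) -> usize {
    paths.iter().filter(|path| is_overridden(state, path)).count()
}

fn main() {
    let state = IndexState {
        entries: vec!["vendor/lib.js".to_string(), "docs/index.md".to_string()],
    };
    let paths = ["vendor/lib.js", "src/main.rs", "docs/index.md"];

    let cloning = count_overrides_cloning(&state, &paths);
    let clones_after_cloning = CLONES.load(Ordering::Relaxed);
    let shared = count_overrides_shared(&state, &paths);
    let clones_after_shared = CLONES.load(Ordering::Relaxed);

    println!("cloning: {cloning} overrides, {clones_after_cloning} deep copies");
    println!(
        "shared: {shared} overrides, {} additional deep copies",
        clones_after_shared - clones_after_cloning
    );
}
```

Both functions return the same answer; the cloning variant simply pays for a full copy of the state on every lookup, which adds up quickly when the state is large and there are many blobs.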
@github-project-automation github-project-automation bot moved this from In Progress to Done in Gengo Sep 27, 2023