Significant performance degradation #197

spenserblack · 2023-09-26T17:41:09Z

Turns out gengo is once again much slower. Given the changes in #191, the most likely culprit is the Git source. Originally blobs were read and analyzed in parallel. Now they're only analyzed in parallel.

Edit: The biggest time-consumer seems to be calling file_source.overrides 🤔

Ideas

- Right now a FileSource returns an iterator over tuples of filenames and file contents. Perhaps the iterator should only yield filenames, and the source should provide a get_contents method that takes a filename. That could be easier to parallelize and might boost performance.
- Similar to the above, but with filename() and contents() methods that take Self::Iter::Item and return the filename and the contents, respectively.
- Go with the other idea for implementing multiple sources, where each source would receive the analyze_blob function and return the results.
  - Implement an overrides method
  - analyze_blob calls self.file_source.overrides
  - analyze passes analyze_blob to self.file_sources.handle
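
A rough sketch of the first idea, for discussion only. The trait shape, the `filenames` method, the `MemorySource` type, and `analyze_sizes` are all hypothetical and are not gengo's actual API; "analysis" here is just measuring the byte length. The point is that when the iterator yields only filenames, both the read and the analysis can happen inside the worker threads.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical variant of the FileSource idea: list names first,
/// fetch contents on demand. Not gengo's real trait.
trait FileSource: Sync {
    /// Yields only the filenames, in sorted order.
    fn filenames(&self) -> Vec<String>;
    /// Reads the contents for one filename, or `None` if it is missing.
    fn get_contents(&self, filename: &str) -> Option<Vec<u8>>;
}

/// In-memory stand-in for a real source such as a Git tree.
struct MemorySource {
    files: HashMap<String, Vec<u8>>,
}

impl FileSource for MemorySource {
    fn filenames(&self) -> Vec<String> {
        let mut names: Vec<String> = self.files.keys().cloned().collect();
        names.sort();
        names
    }

    fn get_contents(&self, filename: &str) -> Option<Vec<u8>> {
        self.files.get(filename).cloned()
    }
}

/// Reads and "analyzes" files in worker threads. The analysis is a
/// placeholder that only records each file's size in bytes.
fn analyze_sizes<S: FileSource>(source: &S, workers: usize) -> Vec<(String, usize)> {
    let names = source.filenames();
    let workers = workers.max(1);
    // Split the names into roughly equal chunks, one per worker.
    let chunk_len = ((names.len() + workers - 1) / workers).max(1);

    let mut results: Vec<(String, usize)> = thread::scope(|scope| {
        let handles: Vec<_> = names
            .chunks(chunk_len)
            .map(|chunk| {
                // Each worker both reads and analyzes its own chunk.
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|name| {
                            source
                                .get_contents(name)
                                .map(|bytes| (name.clone(), bytes.len()))
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    });

    // Sort so the output does not depend on thread scheduling.
    results.sort();
    results
}
```

Whether this actually helps would depend on how expensive `get_contents` is for a given source and whether the source can be shared across threads, so it would need to be benchmarked.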


spenserblack · 2023-09-27T14:23:07Z

Bisect (predictably) points to e84c746


github-project-automation bot added this to Gengo Sep 26, 2023

github-project-automation bot moved this to Triage in Gengo Sep 26, 2023

spenserblack moved this from Triage to Todo in Gengo Sep 26, 2023

spenserblack self-assigned this Sep 26, 2023

spenserblack moved this from Todo to In Progress in Gengo Sep 26, 2023

spenserblack mentioned this issue Sep 26, 2023

Improve performance #198

Closed

spenserblack added this to the v1 milestone Sep 26, 2023

spenserblack mentioned this issue Sep 27, 2023

Change Gengo to take a generic FileSource #191

Merged

3 tasks

spenserblack added a commit that referenced this issue Sep 27, 2023

Stop cloning the index state

1ba5444

This drastically improves performance by no longer cloning the index state, which only needs to be set once. Resolves #197

spenserblack mentioned this issue Sep 27, 2023

Stop cloning the index state #200

Merged

spenserblack added a commit that referenced this issue Sep 27, 2023

Stop cloning the index state

1f4af1b

This drastically improves performance by no longer cloning the index state, which only needs to be set once. Resolves #197

spenserblack closed this as completed in #200 Sep 27, 2023

spenserblack added a commit that referenced this issue Sep 27, 2023

Stop cloning the index state (#200)

6be20ff

This drastically improves performance by no longer cloning the entire repository index state. Resolves #197
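
To illustrate the general shape of that kind of fix (not the actual gengo change), here is a minimal sketch. `IndexState`, `is_tracked`, and both counting functions are hypothetical names, and the `CLONES` counter exists only to make the extra copies visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts deep clones so the difference between the two shapes is observable.
static CLONES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a repository's index state.
struct IndexState {
    entries: Vec<String>,
}

impl Clone for IndexState {
    fn clone(&self) -> Self {
        CLONES.fetch_add(1, Ordering::Relaxed);
        IndexState {
            entries: self.entries.clone(),
        }
    }
}

/// Placeholder lookup that only needs read access to the state.
fn is_tracked(state: &IndexState, path: &str) -> bool {
    state.entries.iter().any(|entry| entry == path)
}

/// Slow shape: every lookup clones the whole state first.
fn count_tracked_cloning(state: &IndexState, paths: &[&str]) -> usize {
    paths
        .iter()
        .filter(|&&path| {
            let local = state.clone();
            is_tracked(&local, path)
        })
        .count()
}

/// Faster shape: the state is set up once and shared by reference.
fn count_tracked_shared(state: &IndexState, paths: &[&str]) -> usize {
    paths
        .iter()
        .filter(|&&path| is_tracked(state, path))
        .count()
}
```

Both functions return the same count; the cloning version performs one deep copy per path, which adds up quickly on a large index.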

github-project-automation bot moved this from In Progress to Done in Gengo Sep 27, 2023
