50Gb repo #687

blackliner · 2022-02-04T03:55:11Z

My org currently uses Jenkins, and our monorepo is huge (above 50 GB at HEAD). On Jenkins, we use Git's reference repository feature to limit bandwidth usage. We are now evaluating different SaaS CI providers, and this seems to be one of the big show-stoppers for most of them. So:

What is the canonical approach on GitHub Actions for such extreme cases?
Do (custom) runners support local references? (Enterprise use-case)
Alternatives?

References

#22 is one issue I found, but I think that was only about fetch depth. It does not help, because even with a depth of 1 we would have to clone 50 GB.
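For readers unfamiliar with the reference feature mentioned above, here is a minimal sketch of how a clone against a local reference repository could be driven from a script. It only uses `git clone` options that exist in stock Git (`--reference-if-able`, `--dissociate`). The helper names, the URL, and the paths are hypothetical, and this is not how Jenkins or actions/checkout implement it.

```python
import subprocess
from typing import List, Optional


def build_clone_command(
    url: str,
    dest: str,
    reference: Optional[str] = None,
    dissociate: bool = False,
) -> List[str]:
    """Build the argv for a git clone that borrows objects from a local mirror.

    `reference` is the path to an existing local clone or mirror of the same
    repository (hypothetical path, kept up to date by some other job).
    """
    cmd = ["git", "clone"]
    if reference is not None:
        # --reference-if-able borrows objects from the local repository and
        # falls back to a normal clone (with a warning) if it is unusable.
        cmd += ["--reference-if-able", reference]
        if dissociate:
            # --dissociate copies the borrowed objects after cloning, so the
            # new clone no longer depends on the reference repository.
            cmd.append("--dissociate")
    cmd += [url, dest]
    return cmd


def clone(url: str, dest: str, reference: Optional[str] = None) -> None:
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(build_clone_command(url, dest, reference), check=True)
```

With a warm reference repository on the runner, only objects missing from it need to travel over the network, which is the bandwidth saving described above.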

blackliner · 2022-02-05T06:13:54Z

Something like https://circleci.com/docs/2.0/caching/#source-caching maybe?

ingomueller-net · 2023-06-12T12:22:19Z

I have implemented caching of the .git folder for a similar case and described it here. Note that in my case the repository was small but a submodule was large. I suppose that this is an easier case, since the revision of the submodule changes only rarely. In contrast, in your case pretty much every CI run will require a different revision to be checked out, and I suspect that this doesn't combine well with depth-one checkouts. Also, the 50 GB you mention may be above the 10 GB limit on cache size.
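As a rough way to check the size concern, the sketch below measures a directory such as `.git` on disk and compares it with the 10 GB figure quoted above. The function names are hypothetical, the limit is assumed to mean 10 GiB, and cache archives may be compressed, so the on-disk size is only an estimate of what would be stored.

```python
import os

# 10 GB limit as quoted in the comment above; interpreting it as GiB is an assumption.
CACHE_LIMIT_BYTES = 10 * 1024**3


def directory_size(path: str) -> int:
    """Return the total size in bytes of regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so link targets are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_in_cache(path: str, limit: int = CACHE_LIMIT_BYTES) -> bool:
    """True if the uncompressed directory size is within `limit` bytes."""
    return directory_size(path) <= limit
```

Calling `fits_in_cache(".git")` in a checkout gives a quick indication of whether caching the folder is worth trying at all.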

ddompe mentioned this issue Jul 3, 2023

Add support for reference repository parameter #1400

Open
