Rewrite the git history to be small #28840

Closed
adeebshihadeh opened this issue Jul 7, 2023 · 3 comments
Labels: bounty, development (related to the openpilot development experience), enhancement, PC (issues related to running openpilot on PC)

Comments

@adeebshihadeh
Contributor

adeebshihadeh commented Jul 7, 2023

openpilot is currently a >3GB git clone because LFS wasn't used at the start of the history. master should now be in a good place, with tests in place to make sure we're not introducing new large files.

We want to rewrite the git history to minimize the repo and clone size. The large files are mostly driving models, Android APKs for the comma two, and third party shared libraries. We want to move all these large files to LFS.
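To decide which files to move to LFS, it helps to first list the largest blobs anywhere in the history. The following is a minimal sketch, not the script from this issue: it pipes `git rev-list --objects --all` into `git cat-file --batch-check` and sorts the blobs by size. The function names and the `top` parameter are assumptions made for illustration.

```python
import subprocess


def parse_batch_check(output: str) -> list[tuple[int, str]]:
    """Parse '<type> <size> <path>' lines and return blobs, largest first."""
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 2)
        # Commits and trees are skipped; only file contents (blobs) matter here.
        if len(parts) == 3 and parts[0] == "blob":
            blobs.append((int(parts[1]), parts[2]))
    return sorted(blobs, reverse=True)


def largest_blobs(repo: str, top: int = 20) -> list[tuple[int, str]]:
    """Return the `top` largest blobs reachable from any ref in `repo`."""
    # Each output line is '<sha>' or '<sha> <path>'.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # With %(rest) in the format, cat-file splits each input line at the first
    # whitespace and echoes the remainder (the path) back.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(sizes)[:top]
```

Calling `largest_blobs("path/to/openpilot")` on a local clone would return `(size_in_bytes, path)` pairs, which can then be turned into LFS include patterns.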

While we're rewriting the repo, we might as well also clean up the master history such that it contains all the release history, starting at 0.1. For context, see this blog post. The full devel history has been truncated to reduce the clone size, but it can be found here https://github.com/commaai/openpilot-release-archive. The first master commit seems to be on v0.7.2, so we'll want to prepend v0.1-v0.7.1 to the front of master.

The deliverable is a script that takes in a copy of the openpilot repo and outputs the new one. The new one should also be pushed to GitHub as a demo. Requirements:

  • repo size is minimal
  • pre-master devel history is prepended, and master now starts at openpilot 0.1
  • new git tags for each openpilot version based on the new commits on master

Open questions

  • how can we minimize the impact?
    • does GitHub handle this somehow?
    • preserve the old commit hash in the message?
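One way to approach the last question is to append the pre-rewrite hash to each commit message as a trailer, so old links and references can still be resolved by searching the log. This is only a sketch of that idea: the trailer key `old-commit-hash` and the function name are hypothetical, and the function would need to be wired into whatever message hook the rewriting tool provides.

```python
def with_original_hash(message: str, old_sha: str) -> str:
    """Append the pre-rewrite commit hash to a commit message as a trailer."""
    trailer = f"old-commit-hash: {old_sha}"  # hypothetical trailer key
    body = message.rstrip("\n")
    # Stay idempotent so running the rewrite twice does not stack trailers.
    if trailer in body.splitlines():
        return body + "\n"
    return f"{body}\n\n{trailer}\n"
```

After the rewrite, something like `git log --grep` with the old hash would find the corresponding new commit.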

See my start at it in #30824

@adeebshihadeh added the enhancement and PC labels Jul 7, 2023
@ajRiverav

ajRiverav commented Jul 23, 2023

consider git lfs

@adeebshihadeh
Contributor Author

We've been good about it and other best practices for the last couple of years, but there are a lot of objects in the history from before then. Also, our release branches (devel, release3, etc.) don't use LFS. This issue is for rewriting the history and cleaning up the bloat from releases.

@adeebshihadeh changed the title from "Minimize git repo size" to "Rewrite the git history to be small" Jan 30, 2024
@adeebshihadeh changed the title from "Rewrite the git history to be small" to "[$500 bounty] Rewrite the git history to be small" Jan 30, 2024
This was referenced Feb 23, 2024
@adeebshihadeh added the development label Jun 30, 2024
@adeebshihadeh changed the title from "[$500 bounty] Rewrite the git history to be small" to "Rewrite the git history to be small" Jul 7, 2024
@andiradulescu
Contributor

andiradulescu commented Jul 10, 2024

This issue is now completed with #31562 merged and with #32955 soon to be merged.

A couple of things to note:

  • the script doesn't touch branches unrelated to master's history (master-ci, nightly, devel, release3, release3-staging, dashcam3, release2) - so objects in these branches don't get LFS-ied
  • branches related to master's history get rewritten with master's history
  • if the branch fails automatic rewrite, it gets deleted, so the one responsible for the branch can fix the conflicts manually
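The three rules above amount to a small decision table per branch. Here is a rough sketch of that policy, not the merged script itself. It assumes that "related to master's history" means sharing a common ancestor with master, checked with `git merge-base`; the helper names and the callback-based design are illustrative only.

```python
import subprocess
from typing import Callable, Iterable


def shares_history(repo: str, branch: str, base: str = "master") -> bool:
    """Assumed relatedness check: does `branch` share an ancestor with `base`?"""
    # git merge-base exits with a non-zero status when no common ancestor exists.
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", base, branch],
        capture_output=True, text=True,
    )
    return result.returncode == 0


def plan_branches(
    branches: Iterable[str],
    is_related: Callable[[str], bool],
    rewrite_ok: Callable[[str], bool],
) -> dict[str, str]:
    """Map each branch to 'untouched', 'rewritten', or 'deleted'."""
    plan = {}
    for branch in branches:
        if not is_related(branch):
            plan[branch] = "untouched"   # e.g. devel, release3: left as-is
        elif rewrite_ok(branch):
            plan[branch] = "rewritten"   # rebuilt on top of the new master
        else:
            plan[branch] = "deleted"     # owner resolves conflicts manually
    return plan
```

Passing the checks in as callables keeps the policy testable without a real repository; in practice `is_related` could wrap `shares_history`.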

Keeping this in mind, I did some clone tests with --single-branch on both repos, cloning just one branch to check the size:

| Branch             | Original Size    | Rewritten Size  |
|--------------------|------------------|-----------------|
| master             | 419.81 MB + LFS* | 60.13 MB + LFS* |
| master-ci          | 331.80 MB        | 331.80 MB       |
| nightly            | 157.96 MB        | 157.96 MB       |
| devel              | 286.07 MB        | 286.07 MB       |
| release3           | 169.18 MB        | 169.18 MB       |
| dashcam3           | 142.35 MiB       | 142.35 MiB      |
| release2           | 1.61 MiB + LFS   | 1.61 MiB + LFS  |
| total (summed)     | 1508.78 MB       | 1149.10 MB      |
| total (full clone) | 2500 MB**        | 1700 MB**       |

* LFS for master is 203.97 MB, same as before
** I'm still investigating why the difference between a full clone and the summed value is that big
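As a sanity check on the table, the summed totals and the relative savings can be recomputed from the per-branch numbers. The sketch below uses the figures exactly as reported (treating the MiB and MB entries as the same unit, as the summed row does); the variable and function names are my own.

```python
# (original, rewritten) sizes per branch, as reported in the table above
sizes = {
    "master": (419.81, 60.13),
    "master-ci": (331.80, 331.80),
    "nightly": (157.96, 157.96),
    "devel": (286.07, 286.07),
    "release3": (169.18, 169.18),
    "dashcam3": (142.35, 142.35),
    "release2": (1.61, 1.61),
}


def totals(table: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the original and rewritten columns, rounded to two decimals."""
    original = round(sum(o for o, _ in table.values()), 2)
    rewritten = round(sum(r for _, r in table.values()), 2)
    return original, rewritten


def reduction(before: float, after: float) -> float:
    """Fraction of the original size that was saved."""
    return 1 - after / before


original_total, rewritten_total = totals(sizes)
master_saving = reduction(*sizes["master"])
full_clone_saving = reduction(2500, 1700)
```

The totals match the "total (summed)" row. master alone shrinks by roughly 86%, while the summed and full-clone figures drop by only about 24% and 32%, because every other branch is unchanged.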

The final output repo I tested with can be found here: https://github.com/andi-radulescu-contributor/openpilot

As a next step, I see two options to reduce the size coming from master-ci, nightly, devel, release3, release3-staging, dashcam3, release2:

  • import big files to LFS for these branches as well. The catch is that whenever something is force-pushed to one of these branches, the previous files need to be removed from LFS, and every push to these branches needs to use LFS
  • move more of these branches (master-ci, nightly, devel, release3, release3-staging, dashcam3, release2) to another repo (e.g. https://github.com/commaai/openpilot-release-archive)
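For the first option, `git lfs migrate import` can rewrite selected branches so that matching files become LFS pointers. The helper below only builds the command line and is a sketch under assumptions: the file patterns are hypothetical placeholders, and the real set would come from inspecting what is large on each branch.

```python
def lfs_migrate_command(branches: list[str], patterns: list[str]) -> list[str]:
    """Build a `git lfs migrate import` invocation limited to given branches."""
    cmd = ["git", "lfs", "migrate", "import"]
    # --include takes a comma-separated list of path patterns to convert.
    cmd.append("--include=" + ",".join(patterns))
    # --include-ref may be repeated to restrict the rewrite to specific refs.
    for branch in branches:
        cmd.append(f"--include-ref=refs/heads/{branch}")
    return cmd


# Hypothetical patterns; branch names are taken from the list above.
command = lfs_migrate_command(["devel", "release3"], ["*.apk", "*.so"])
```

The resulting list can be passed to `subprocess.run(command, cwd=repo, check=True)`. Since this rewrites the affected branches, they would need to be force-pushed afterwards.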

Fortunately this change can be made completely separate from the master rewrite.

Also linking #30028 here since it has some valid points.
