Reduce size of gh-pages branch #8059

Open
matschaffer opened this issue Mar 5, 2022 · 4 comments

Comments

matschaffer commented Mar 5, 2022

Expected Behavior

A fully functional git clone without a multi-GB download.

Actual Behavior

Full repo clone takes ~2GB

Steps to Reproduce

git clone git@github.com:LLK/scratch-gui.git

There are a couple issues that cover this:

The typical answer is to clone with --depth 1 but this (to my understanding) would leave the clone unable to be used as a development/PR workspace.

(from #5140 (comment))

I was curious so I tried this https://stackoverflow.com/a/42544963/69002

It seems that a lot of the space is taken up by dependencies being committed to the gh-pages branch, as in d33ef36 for example.

Those lib.min.js files seem to be 15MB-20MB each and get committed a few times a day in a few different subdirectories.

We could eliminate about half of the current repo size by pushing a fresh gh-pages branch with every build.

Something like:

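# Start a branch with no parent commits; the index keeps the built files staged
git checkout --orphan gh-pages-${BUILD_NUMBER}
git commit -am 'Rebuild gh-pages'
# Overwrite the remote gh-pages branch with this single commit
git push --force origin gh-pages-${BUILD_NUMBER}:gh-pages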

Clients will see something like this on their next pull:

 + 03a60aa...8e48d06 gh-pages   -> origin/gh-pages  (forced update)

But this doesn't seem to require intervention. I was even able to commit to gh-pages and the next pull rebased successfully.
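The squash-and-force-push idea above can be tried safely against a throwaway local remote before touching the real repository. The following is only a sketch under stated assumptions: the sandbox paths, the `CI` identity, the fake `lib.min.js` contents and the build number 42 are all made up for illustration, and a real deploy job would copy in the actual build output instead.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
git init --quiet --bare "$work/origin.git"
git init --quiet "$work/site"
cd "$work/site"
git config user.email "ci@example.invalid"
git config user.name "CI"
git remote add origin "$work/origin.git"

# Simulate several deploys that each commit a new build artifact.
git checkout --quiet -b gh-pages
for n in 1 2 3; do
  printf 'build %s\n' "$n" > lib.min.js
  git add lib.min.js
  git commit --quiet -m "Deploy $n"
done
git push --quiet origin gh-pages
before=$(git rev-list --count origin/gh-pages)

# Rebuild the branch as a single parentless commit and force-push it.
BUILD_NUMBER=42
git checkout --quiet --orphan "gh-pages-$BUILD_NUMBER"
git commit --quiet -am 'Rebuild gh-pages'
git push --quiet --force origin "gh-pages-$BUILD_NUMBER:gh-pages"
git fetch --quiet origin
after=$(git rev-list --count origin/gh-pages)

echo "commits on origin/gh-pages: before=$before after=$after"
```

After the force push, the remote branch holds only the single rebuild commit, so a fresh clone no longer has to download the earlier deploys from that branch.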

Doing this would make the repository easier to clone, and would probably also eliminate a good portion of the 1.5-minute clone time currently seen on CircleCI.

Alternatively we could move gh-pages to a separate repo, but this would require a bit more coordination for anything using the deployed gh-pages site (possibly https://scratch.mit.edu itself? not sure if it's using gh-pages directly or not).

matschaffer commented Mar 10, 2022

As a test I pushed the develop branch and a squashed gh-pages to https://github.com/matschaffer/scratch-gui-squashed-gh-pages

It's not tiny, but the clone size seems to have been cut by nearly 80% (~1.9GB to ~400MB)

❯ git clone git@github.com:matschaffer/scratch-gui-squashed-gh-pages.git scratch-gui-squashed-clone
Cloning into 'scratch-gui-squashed-clone'...
remote: Enumerating objects: 42971, done.
remote: Counting objects: 100% (1177/1177), done.
remote: Compressing objects: 100% (415/415), done.
remote: Total 42971 (delta 758), reused 1171 (delta 753), pack-reused 41794
Receiving objects: 100% (42971/42971), 404.64 MiB | 4.44 MiB/s, done.
Resolving deltas: 100% (28426/28426), done.

Another option could be moving gh-pages to another repo that would just be used for publishing rather than development.

matschaffer commented Mar 10, 2022

The gh-pages assets still definitely make up the larger blobs of the repo.

# in ~/code/LLK/scratch-gui-squashed-clone, on branch develop
❯ git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  sed -n 's/^blob //p' |
  sort --numeric-sort --key=2 |
  cut -c 1-12,41- |
  $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest | tail -n10
d4188146aeac   13MiB hotfix/totally-normal-2021/lib.min.js
17529bb75d07   14MiB develop/lib.min.js
f01d795250c1   14MiB scratch-desktop/lib.min.js
92adc14f8add   14MiB native/lib.min.js
3abbb626ea8e   14MiB stretchy-paint/lib.min.js
5a51340ed329   15MiB color-swatches/lib.min.js
25deb7569f99   18MiB boost/lib.min.js.map
adf5693bed0f   19MiB boost/lib.min.js
7200d49d2105   19MiB centerCrosshair/lib.min.js.map
13af32bdd340   20MiB centerCrosshair/lib.min.js

There are some larger animated gifs in the source, but they seem to be mostly 1-2MB, whereas the sourcemaps and minified JS files are in the 10-20MB range and there are many more of those in the published site.
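To compare file types as a whole rather than only the ten largest blobs, the same `rev-list` / `cat-file` pipeline can be summed per file extension. This is a rough sketch, not something that was run against scratch-gui: `size_by_extension` is a hypothetical helper name, and the demo repository with its three tiny files exists only to make the script self-contained.

```shell
#!/bin/sh
set -eu

# size_by_extension REPO: total bytes of all reachable blobs in REPO,
# grouped by the text after the last dot of each file name.
# (Hypothetical helper; paths containing spaces are grouped by the last
# space-separated piece of the path.)
size_by_extension() {
  git -C "$1" rev-list --objects --all |
    git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" {
           name = $NF
           sub(/.*\//, "", name)          # drop the directory part
           n = split(name, parts, ".")
           ext = (n > 1) ? parts[n] : "(none)"
           total[ext] += $2
         }
         END { for (e in total) print total[e], e }' |
    sort -rn
}

# Demo on a throwaway repository with made-up files.
repo=$(mktemp -d)
git init --quiet "$repo"
git -C "$repo" config user.email "dev@example.invalid"
git -C "$repo" config user.name "Dev"
printf '0123456789' > "$repo/a.js"
printf '01234' > "$repo/b.js"
printf 'abc' > "$repo/c.gif"
git -C "$repo" add .
git -C "$repo" commit --quiet -m 'Sample files'

size_by_extension "$repo"
```

The output is one line per extension, in bytes, largest total first. Pointing the function at a real clone instead of the demo repository would show how much of the history is `js` and `map` files versus `gif` files.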

matschaffer commented

To test the "alternate repo" idea I pushed just develop to another fork and cloned that:

❯ git clone git@github.com:matschaffer/scratch-gui-develop.git --branch develop
Cloning into 'scratch-gui-develop'...
remote: Enumerating objects: 41794, done.
remote: Total 41794 (delta 0), reused 0 (delta 0), pack-reused 41794
Receiving objects: 100% (41794/41794), 306.56 MiB | 270.00 KiB/s, done.
Resolving deltas: 100% (27691/27691), done.

And confirmed that now it's mainly the gif blobs that make up the space:

❯ git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  sed -n 's/^blob //p' |
  sort --numeric-sort --key=2 |
  cut -c 1-12,41- |
  $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest | tail -n10
1a6fab408778  2.9MiB src/lib/libraries/decks/steps/chase-game-move-randomly.es.gif
641a44ec604d  5.3MiB src/lib/libraries/decks/txt/09_hoc-spin.gif
6370bf50054c  7.5MiB src/lib/libraries/decks/steps/video-pet.es.gif
af14625dd9af  8.6MiB src/lib/libraries/decks/cartoonnetwork/09_cn-level-up-say-something.gif
526d2f10cbf6  9.2MiB src/lib/libraries/decks/cartoonnetwork/06_cn-keep-score.gif
cf67f5a89b9c  9.9MiB src/lib/libraries/decks/steps/video-animate.es.gif
5ea3d171bcef   10MiB src/lib/libraries/decks/cartoonnetwork/07_cn-level-up.gif
e008255c5b80   11MiB src/lib/libraries/decks/steps/video-pop.es.gif
e02d200429b6   11MiB src/lib/libraries/decks/cartoonnetwork/03_cn-glide-around.gif
474c5790d124   12MiB src/lib/libraries/decks/cartoonnetwork/04_cn-collect.gif

matschaffer commented

I also came across this post on partial and shallow clones, in case we want to try other options: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/
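One client-side option that needs no change to the repository is a single-branch clone, which only fetches the history reachable from the requested branch. The sketch below illustrates the effect on a throwaway local repository. The branch contents, the `Dev` identity and the `count_objects` helper are assumptions made for the demo, and the numbers say nothing about scratch-gui itself.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init --quiet "$work/src"
cd "$work/src"
git config user.email "dev@example.invalid"
git config user.name "Dev"

# A small source branch...
git checkout --quiet -b develop
echo 'console.log("hi")' > index.js
git add index.js
git commit --quiet -m 'Source'

# ...and a publishing branch with unrelated history and repeated deploys.
git checkout --quiet --orphan gh-pages
git rm --quiet -rf .
for n in 1 2 3; do
  echo "bundle $n" > lib.min.js
  git add lib.min.js
  git commit --quiet -m "Deploy $n"
done
git checkout --quiet develop

# Clone everything, then clone only develop, over the file:// transport.
cd "$work"
git clone --quiet "file://$work/src" full
git clone --quiet --branch develop --single-branch "file://$work/src" dev-only

# Hypothetical helper: number of objects reachable from any ref.
count_objects() { git -C "$1" rev-list --objects --all | wc -l | tr -d ' '; }
full_count=$(count_objects full)
dev_count=$(count_objects dev-only)

echo "objects: full=$full_count single-branch=$dev_count"
```

Unlike `--depth 1`, a single-branch clone keeps the full history of the branch it fetches. The linked post also covers blobless partial clones (`git clone --filter=blob:none <url>`), which keep all commits and fetch file contents on demand; whether either fits the PR workflow here would need to be tested.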
