
Make continuous integration artifacts always have full debuginfo #73240

Merged: 2 commits, Apr 28, 2024

Conversation

hexagonrecursion
Contributor

@hexagonrecursion hexagonrecursion commented Apr 24, 2024

Summary

None

Purpose of change

  1. When a continuous integration run fails, it uploads an artifact. I have noticed that those artifacts only have partial debug info (-g1): they contain enough debug info to generate a backtrace, but not enough to view the contents of variables in gdb. Having full debug info would be very helpful when debugging test failures.
  2. Some builds in matrix.yml had "release: 0", but this was misleading because build-scripts/gha_compile_only.sh always overrode it with RELEASE=1.

Describe the solution

  1. Emit full -g debug symbols in every build; this helps when debugging failed tests.
  2. Keep RELEASE=1 in every build to make tests run faster (trade-off: source-level debugging of optimized code does not always work), but make it clearer that we are doing this.

Additional context

Note: this is me coming back to #70701 with a fresh perspective

@github-actions github-actions bot added labels on Apr 24, 2024: Code: Build (issues regarding different builds and build environments), Code: Tooling (tooling that is not part of the main game but is part of the repo), astyled (astyled PR, label is assigned by github actions), json-styled (JSON lint passed, label assigned by github actions)
@hexagonrecursion
Contributor Author

Cataclysm Windows build / Build (pull_request) Failing after 29m - I do not think I touched that part of the build infrastructure in this pull request.

@github-actions github-actions bot added the BasicBuildPassed label (this PR builds correctly, label assigned by github actions) on Apr 24, 2024
@Qrox
Contributor

Qrox commented Apr 25, 2024

I think binaries with full debug symbols often reach several GB in size. Does github place a limit on the total size of uploaded artifacts?

hexagonrecursion added a commit to hexagonrecursion/Cataclysm-DDA that referenced this pull request Apr 25, 2024
This is a benchmark to determine the size of artifacts produced by
continuous integration when enabling full debug info.
See CleverRaven#73240
hexagonrecursion added a commit to hexagonrecursion/Cataclysm-DDA that referenced this pull request Apr 25, 2024
This is a benchmark to determine the size of artifacts produced by
continuous integration when enabling full debug info.
See CleverRaven#73240

Note: for the purposes of this benchmark I have disabled a test that fails too often, to (hopefully) get continuous integration to run every build.
@hexagonrecursion
Contributor Author

hexagonrecursion commented Apr 25, 2024

I think binaries with full debug symbols often reach several GB in size. Does github place a limit on the total size of uploaded artifacts?

The only binary that matrix.yml uploads is cata_test - it is significantly smaller than the game binary (e.g. 564 MB with full debuginfo). I have created a benchmark to calculate artifact sizes for every build in the matrix - see #73254. Let's wait for the results.

@hexagonrecursion
Contributor Author

hexagonrecursion commented Apr 25, 2024

Does github place a limit on the total size of uploaded artifacts?

  • I would assume there has to be a limit - you can't let people upload a petabyte of artifacts
  • The limit does not appear to be documented anywhere

hexagonrecursion added a commit to hexagonrecursion/Cataclysm-DDA that referenced this pull request Apr 25, 2024
This is a benchmark to determine the size of artifacts produced by
continuous integration when enabling full debug info.
See CleverRaven#73240

Note: for the purposes of this benchmark I have:
* Enabled continuous integration for draft pull requests
* Disabled "fail fast" in the build matrix
* Enabled artifact upload even if tests pass
hexagonrecursion added a commit to hexagonrecursion/Cataclysm-DDA that referenced this pull request Apr 25, 2024
This is a benchmark to determine the size of artifacts produced by
continuous integration when enabling full debug info.
See CleverRaven#73240

Note: for the purposes of this benchmark I have:
* Enabled continuous integration for draft pull requests
* Disabled "fail fast" in the build matrix
* Enabled artifact upload even if tests pass
* Skip the tests - just upload the artifact - I only want to know its
  size
* Use matrix title as artifact name to prevent collisions
@hexagonrecursion
Contributor Author

@Qrox

Here is some data on the compressed size of the artifacts:

  • Basic Build and Test (Clang 10, Ubuntu, Curses) 325 MB
  • Clang 12, Ubuntu, Tiles, ASan 418 MB
  • Clang 14, macOS 12, Tiles, Sound, x64 and arm64 Universal Binary 44 MB
  • GCC 11, Ubuntu, Curses, ASan 616 MB
  • GCC 12, Ubuntu cross-compile to MinGW-Win64, Tiles, Sound 708 MB
  • GCC 9, Curses, LTO 564 MB
  1. I assume you meant a single binary can reach several GB in size - that does not appear to be the case here; the worst case observed is 708 MB.
  2. Note: because matrix.yml (unlike my benchmark) uses the same artifact name for every build variant, only one artifact will be uploaded per run - subsequent uploads of an artifact with the same name from the same run will fail.

I still can't find any official documentation on the artifact storage limit, but there does appear to be a limit (or perhaps there was one earlier).

@Qrox
Contributor

Qrox commented Apr 26, 2024

I assume you meant a single binary can reach several GB in size - this does not appear to be the case here - the worst case observed: 708 MB

I remember the unit test binary reaching several GB in size in my local build using MinGW-w64, but I do not have the binary to check at the moment. In any case, 300-700 MB is still large enough that we should probably care about the size limit. The issue actions/upload-artifact#9 you mentioned suggests that there is a way to configure automatic removal of old artifacts, so we can probably try that, but the size increase may still result in fewer artifacts being retained.

@Maleclypse
Member

I assume you meant a single binary can reach several GB in size - this does not appear to be the case here - the worst case observed: 708 MB

I remember the unit test binary reaching several GB in size in my local build using MinGW-w64, but I do not have the binary to check at the moment. In any case, 300-700 MB is still large enough that we should probably care about the size limit. The issue actions/upload-artifact#9 you mentioned suggests that there is a way to configure automatic removal of old artifacts, so we can probably try that, but the size increase may still result in fewer artifacts being retained.

https://backrightup.com/blog/github-storage-limits/#:~:text=3.-,What%20are%20GitHub's%20size%20limits%3F,minimize%20performance%20impact%20on%20GitHub.

This says that individual files larger than 50 MB will receive a warning from github. I get that artifacts are different from hosted code files, and this may be outdated information, but it was the first thing I found when searching.

@hexagonrecursion
Contributor Author

hexagonrecursion commented Apr 26, 2024

@Maleclypse

https://backrightup.com/blog/github-storage-limits/#:~:text=3.-,What%20are%20GitHub's%20size%20limits%3F,minimize%20performance%20impact%20on%20GitHub

GitHub limits the maximum file size (or sizes) you can add to your repository to 50 MB.

File sizes larger than 50 MB will get you a warning from Git.

https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github#file-size-limits

GitHub limits the size of files allowed in repositories. If you attempt to add or update a file that is larger than 50 MiB, you will receive a warning from Git.

This appears to only refer to files that are checked into a git repo - not build artifacts or releases.

@hexagonrecursion
Contributor Author

@Qrox What artifact retention period do you think would be sufficient? Note: the default apparently depends on repository settings, and I have no way to see that.

@Qrox
Contributor

Qrox commented Apr 26, 2024

If we only care about testing the binary while a PR is still open, a period larger than a few days might be sufficient. However, there are ~100 general matrix builds in the past 24 hours according to the github action log, which if all fail and upload one artifact per build at 500 MB, will require 250 GB of space in a 5-day period. If the suggestion in actions/upload-artifact#9 that the repo size limit applies to artifacts is true, 250 GB is probably still too high.
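The space figure above can be sanity-checked with shell arithmetic; all three inputs are the assumptions stated in the comment, not measured values:

```shell
builds_per_day=100     # ~100 general matrix builds per 24 hours
artifact_mb=500        # ~500 MB per failed-build artifact
retention_days=5       # hypothetical 5-day retention period
echo "$(( builds_per_day * artifact_mb * retention_days / 1000 )) GB"
# prints: 250 GB
```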

@hexagonrecursion
Contributor Author

@Qrox

If we only care about testing the binary while a PR is still open, a period larger than a few days might be sufficient. However, there are ~100 general matrix builds in the past 24 hours according to the github action log, which if all fail and upload one artifact per build at 500 MB, will require 250 GB of space in a 5-day period. If the suggestion in actions/upload-artifact#9 that the repo size limit applies to artifacts is true, 250 GB is probably still too high.

Assuming I understand the result of the following API query correctly, we are likely currently storing over 600 GB of cata_test artifacts. Note: this estimate assumes that the average size of the last 30 artifacts named "cata_test" is representative of all 7754 such artifacts.

# Prints 623725812165.0667
curl --silent \
-L \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
https://api.github.com/repos/CleverRaven/Cataclysm-DDA/actions/artifacts?name=cata_test \
| jq '.total_count *  ( [.artifacts[].size_in_bytes] | add / length )'

The default artifact retention period appears to be 90 days:

# Prints 89.92314814814814
curl --silent \
-L \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
https://api.github.com/repos/CleverRaven/Cataclysm-DDA/actions/artifacts?name=cata_test \
| jq '.artifacts[0] | (.expires_at | fromdateiso8601) - (.created_at | fromdateiso8601) | . / 60 / 60 / 24'

@hexagonrecursion
Contributor Author

Here is a calculation using more complete data (about 50% of currently stored cata_test artifacts). ~200 megabytes per artifact times 7754 is about 1550 gigabytes - by the revised estimate we are currently storing roughly 1.5 terabytes of cata_test artifacts.

# Prints 214150401.89666668
curl --silent \
-L \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
'https://api.github.com/repos/CleverRaven/Cataclysm-DDA/actions/artifacts?name=cata_test&per_page=100&page=[1-78:2]' \
| jq -n '[inputs | .artifacts[].size_in_bytes] | add / length '
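Re-running that estimate with the unrounded sample mean (pure arithmetic, no API access needed) lands close to the exact figure obtained later with authentication:

```shell
# mean artifact size in bytes (the 50% sample above) times the total
# artifact count reported by the API
awk 'BEGIN { printf "%.2f TB\n", 214150401.89666668 * 7754 / 1e12 }'
# prints: 1.66 TB
```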

@hexagonrecursion
Contributor Author

Using the REST API with authentication I was able to obtain the exact number of bytes consumed by all our cata_test artifacts (not counting any other artifacts): ~1.65 terabytes.

# Prints 1652631375350
curl --silent \
-L \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-H "Authorization: Bearer <<<REDACTED>>>" \
'https://api.github.com/repos/CleverRaven/Cataclysm-DDA/actions/artifacts?name=cata_test&per_page=100&page=[1-100]' \
| jq -n '[inputs | .artifacts[].size_in_bytes] | add'

@hexagonrecursion
Contributor Author

hexagonrecursion commented Apr 26, 2024

@Qrox I think github will thank us if we cut the usage down to 250 GB by using shorter expiration times. What do you think? Did I misinterpret something?

@Qrox
Contributor

Qrox commented Apr 26, 2024

I think github will thank us if we cut the usage down to 250 GB by using shorter expiration times

The binaries with debug info are ~10 times larger than normal binaries, so we can keep the total size unchanged by reducing the expiration time to around 9 days. That does not leave much room for further reduction, though.

Do we currently upload the binaries compressed? I recall that the debug info is uncompressed by default, or it is in my local build at least, so we can probably make the binaries smaller using compression.

@hexagonrecursion
Contributor Author

@Qrox The artifacts are already stored as compressed zip archives.

Here is an example

  • The UI says 44 MB
  • The API says 46145486 bytes (~46 MB, ~44 MiB)
  • The zip archive is 46145486 bytes
  • The executable inside is 168285848 bytes - an effective compression ratio of ~3.6
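The ratio quoted above can be reproduced directly from the two byte counts:

```shell
awk 'BEGIN { printf "%.2f\n", 168285848 / 46145486 }'
# prints: 3.65
```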

@hexagonrecursion
Contributor Author

I have reduced the artifact retention period from 90 days to 9, as suggested.
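For reference, retention is controlled by the documented `retention-days` input of actions/upload-artifact. A hypothetical fragment (the step name, path, and action version are assumptions, not copied from matrix.yml) would look like:

```yaml
- name: upload cata_test artifact
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: cata_test
    path: tests/cata_test
    retention-days: 9
```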

@Maleclypse Maleclypse merged commit f401475 into CleverRaven:master Apr 28, 2024
14 of 20 checks passed
@hexagonrecursion hexagonrecursion deleted the always_debug_info branch April 28, 2024 03:52