Skip to content

Commit

Permalink
build: reduce linux release binary size by 87%
Browse files Browse the repository at this point in the history
Our Linux release binary was hilariously large, weighing in at nearly
800MB (!). Nearly all of the bloat was from DWARF debug info:

    $ bloaty materialized -n 10
        FILE SIZE        VM SIZE
     --------------  --------------
      24.5%   194Mi   0.0%       0    .debug_info
      24.1%   191Mi   0.0%       0    .debug_loc
      13.8%   109Mi   0.0%       0    .debug_pubtypes
      10.1%  79.9Mi   0.0%       0    .debug_pubnames
       8.8%  70.0Mi   0.0%       0    .debug_str
       8.3%  66.3Mi   0.0%       0    .debug_ranges
       4.4%  35.3Mi   0.0%       0    .debug_line
       3.1%  24.8Mi  66.3%  24.8Mi    .text
       1.8%  14.4Mi  25.1%  9.39Mi    [41 Others]
       0.6%  4.79Mi   0.0%       0    .strtab
       0.4%  3.22Mi   8.6%  3.22Mi    .eh_frame
     100.0%   793Mi 100.0%  37.4Mi    TOTAL

This patch gets a handle on this by attacking the problem
from several angles:

  1. We instruct the linker to compress debug info sections. Most of the
     debug info is redundant and compresses exceptionally well. Part of
     the reason we didn't notice the issue is because our Docker images
     and gzipped tarballs were relatively small (~150MB).

  2. We strip out the unnecessary `.debug_pubnames` and `.debug_pubtypes`
     sections from the binary. This works around a known Rust bug
     (rust-lang/rust#46034).

  3. We ask Rust to generate less debug info for release builds,
     limiting it to line info. This is enough information to symbolicate
     a backtrace, but not enough information to run an interactive
     debugger. This is usually the right tradeoff for a release build.

    $ bloaty materialized -n 10
        FILE SIZE       VM SIZE
     --------------   --------------
      33.8%  31.9Mi     0.0%       0  .debug_info
      26.5%  25.0Mi    70.5%  25.0Mi  .text
       8.0%  7.54Mi     0.0%       0  .debug_str
       6.7%  6.36Mi     0.0%       0  .debug_line
       5.7%  5.36Mi     9.4%  3.33Mi  [38 Others]
       5.0%  4.71Mi     0.0%       0  .strtab
       3.8%  3.55Mi     0.0%       0  .debug_ranges
       3.3%  3.11Mi     8.8%  3.11Mi  .eh_frame
       3.0%  2.87Mi     0.0%       0  .symtab
       2.2%  2.12Mi     6.0%  2.12Mi  .rodata
       2.0%  1.92Mi     5.4%  1.92Mi  .gcc_except_table
     100.0%  94.4Mi   100.0%  35.5Mi  TOTAL

One issue remains unsolved, which is that Rust/LLVM cannot currently
garbage collect DWARF that refers to unused symbols/types. The actual
symbols get cut from the binary, but their debug info remains. Follow
rust-lang/rust#56068 and LLVM D74169 [0] if curious. I tested with the
aforementioned lld patch and the resulting binary is even small, at
71MB, so there's another 25MB of savings to be had there. (That patch on
its own, without the other changes, cuts the ~800MB binary to a ~300MB
binary, so it's an impressive piece of work. Unfortunately it also
increases link time by 15-25x.)

[0]: https://reviews.llvm.org/D74169
  • Loading branch information
benesch committed Apr 18, 2020
1 parent 5af08de commit 7eff853
Show file tree
Hide file tree
Showing 5 changed files with 46 additions and 9 deletions.
8 changes: 8 additions & 0 deletions .cargo/config
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
[target."x86_64-unknown-linux-gnu"]
# Compressing debug information can yield hundreds of megabytes of savings.
# The Rust toolchain does not currently perform dead code elimination on
# debug info.
#
# See: https://github.com/rust-lang/rust/issues/56068
# See: https://reviews.llvm.org/D74169#1990180
rustflags = ["-C", "link-arg=-Wl,--compress-debug-sections=zlib-gabi"]
1 change: 0 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@

target
miri-target
/.cargo
.mtrlz.log
**/*.rs.bk
.netlify
Expand Down
8 changes: 7 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,13 @@ members = [
]

[profile.release]
debug = true
# Emit only the line info tables, not full debug info, in release builds, to
# substantially reduce the size of the debug info. Line info tables are enough
# to symbolicate a backtrace, but not enough to use a debugger interactively.
# This seems to be the right tradeoff for release builds: it's unlikely we're
# going to get interactive access to a debugger in production installations, but
# we still want useful crash reports.
debug = 1

[patch.crates-io]
# Waiting on a release with this commit:
Expand Down
1 change: 1 addition & 0 deletions bin/lint
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@ copyright_files=$(grep -vE \
-e '(^|/)\.gitmodules$' \
-e '(^|/)go\.sum$' \
-e '(^|/)Cargo\.toml$' \
-e '^\.cargo/config$' \
-e '^Cargo\.lock$' \
-e '^deny\.toml$' \
-e '^netlify\.toml$' \
Expand Down
37 changes: 30 additions & 7 deletions misc/python/mzbuild.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,11 +98,15 @@ def xcargo_target_dir(root: Path) -> Path:
return root / "target" / "x86_64-unknown-linux-gnu"


def xstrip(root: Path) -> str:
def xbinutil(tool: str) -> str:
if sys.platform == "linux":
return "strip"
return tool
else:
return "x86_64-unknown-linux-gnu-strip"
return f"x86_64-unknown-linux-gnu-{tool}"


xobjcopy = xbinutil("objcopy")
xstrip = xbinutil("strip")


def docker_images() -> Set[str]:
Expand Down Expand Up @@ -157,13 +161,32 @@ def run(self, root: Path, path: Path) -> None:
# down CI, since we're packaging these binaries up into Docker
# images and shipping them around. A bit unfortunate, since it'd be
# nice to have useful backtraces if the binary crashes.
runv([xstrip(root), path / self.bin])
runv([xstrip, path / self.bin])
else:
# Even if we've been asked not to strip the binary, remove the
# `.debug_pubnames` and `.debug_pubtypes` sections. These are just
# indexes that speed up launching a debugger against the binary,
# and we're happy to have slower debugger start up in exchange for
# smaller binaries. Plus the sections have been obsoleted by a
# `.debug_names` section in DWARF 5, and so debugger support for
# `.debug_pubnames`/`.debug_pubtypes` is minimal anyway.
# See: https://github.com/rust-lang/rust/issues/46034
runv(
[
xobjcopy,
"-R",
".debug_pubnames",
"-R",
".debug_pubtypes",
path / self.bin,
]
)

def depends(self, root: Path, path: Path) -> List[bytes]:
# TODO(benesch): this should be much smarter about computing the Rust
# files that actually contribute to this binary target.
return super().depends(root, path) + git_ls_files(
root, "src/**", "Cargo.toml", "Cargo.lock"
root, "src/**", "Cargo.toml", "Cargo.lock", ".cargo"
)


Expand Down Expand Up @@ -216,7 +239,7 @@ def run(self, root: Path, path: Path) -> None:
with open(path / "tests" / "manifest", "w") as manifest:
for (executable, slug, crate_path) in tests:
shutil.copy(executable, path / "tests" / slug)
runv([xstrip(root), path / "tests" / slug])
runv([xstrip, path / "tests" / slug])
manifest.write(f"{slug} {crate_path}\n")
shutil.move(str(path / "testdrive"), path / "tests")
shutil.copy(
Expand All @@ -229,7 +252,7 @@ def depends(self, root: Path, path: Path) -> List[bytes]:
# TODO(benesch): this should be much smarter about computing the Rust
# files that actually contribute to this binary target.
return super().depends(root, path) + git_ls_files(
root, "src/**", "Cargo.toml", "Cargo.lock"
root, "src/**", "Cargo.toml", "Cargo.lock", ".cargo"
)


Expand Down

0 comments on commit 7eff853

Please sign in to comment.