[SR-16121] Increasingly excessive memory requirements for linking on Linux #58380

Closed
gwynne opened this issue Apr 16, 2022 · 16 comments · Fixed by #58800 or #64312
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler itself legacy driver Area → compiler: the integrated C++ legacy driver. Succeeded by the swift-driver project Linux Platform: Linux performance swift-autolink-extract Area → compiler → legacy driver: the 'swift-autolink-extract' mode

Comments

@gwynne
Contributor

gwynne commented Apr 16, 2022

Environment

$ sw_vers
ProductName:    macOS
ProductVersion: 12.3.1
BuildVersion:   21E258
$ xcodebuild -version
Xcode 13.3.1
Build version 13E500a
$ docker --version
Docker version 20.10.14, build a224086

Description

With each subsequent release of Swift, more and more available RAM is required to link Swift executables on Linux. The addition of the recommended --static-swift-stdlib build flag on Linux (not to mention the accepted evolution proposal to make this the default, SE-0342) exacerbates the issue to the point where 4GB is not enough RAM to successfully link an unmodified Vapor template app (the following snippet assumes the vapor Homebrew formula is installed):

$ docker system info
<snip>
 Kernel Version: 5.10.104-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 3.826GiB
<snip>
$ vapor new testmem
Cloning template...
name: testmem
Would you like to use Fluent? (--fluent/--no-fluent)
y/n> y
fluent: Yes
db: Postgres (Recommended)
Would you like to use Leaf? (--leaf/--no-leaf)
y/n> n
leaf: No
<snip>
$ cd testmem/
testmem$ docker build .
[+] Building 123.6s (17/23)
<snip>
 => ERROR [build  7/12] RUN swift build -c release --static-swift-stdlib                                              89.3s
------
 > [build  7/12] RUN swift build -c release --static-swift-stdlib:
#17 0.878 Building for production...
#17 0.970 remark: Incremental compilation has been disabled: it is not compatible with whole module optimizationremark: Incremental compilation has been disabled: it is not compatible with whole module optimizationremark: Incremental compilation has been disabled: it is not compatible with whole module optimizationremark: Incremental compilation has been disabled: it is not compatible with whole module optimization[1/867] Compiling _NIODataStructures Heap.swift
#17 1.561 remark: Incremental compilation has been disabled: it is not compatible with whole module optimization[3/869] Compiling COperatingSystem Exports.swift
<snip>
#17 58.78 [931/932] Compiling AsyncHTTPClient HTTPClient+execute.swift
#17 58.80 remark: Incremental compilation has been disabled: it is not compatible with whole module optimization[933/934] Compiling Vapor Application.swift
#17 74.23 remark: Incremental compilation has been disabled: it is not compatible with whole module optimization[935/936] Compiling Fluent FluentProvider+Concurrency.swift
#17 75.22 remark: Incremental compilation has been disabled: it is not compatible with whole module optimization[937/938] Compiling App TodoController.swift
#17 76.49 remark: Incremental compilation has been disabled: it is not compatible with whole module optimization[939/940] Compiling Run main.swift
#17 89.04 error: link command failed with exit code 254 (use -v to see invocation)
#17 89.04 clang-13: error: unable to execute command: Killed
#17 89.04 clang-13: error: linker command failed due to signal (use -v to see invocation)
#17 89.11 [940/941] Linking Run
------
executor failed running [/bin/sh -c swift build -c release --static-swift-stdlib]: exit code: 1
$

Note: Issue content cleaned up from original Jira import.

Detail from JIRA

Data               Value
Previous ID        SR-16121
Radar              None
Original Reporter  @gwynne
Type               Bug
Votes              0
Component/s        Compiler, Source Tooling
Labels             Bug, Linux
Assignee           None
Priority           Medium

md5: 05c2ceaa59cb789e6c5b9eabe39cb06a

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@gwynne
Contributor Author

gwynne commented Apr 27, 2022

Here is a build log from a user who was attempting to build on DigitalOcean’s App Platform: build_log_out_of_memory.txt

@keith
Member

keith commented Apr 27, 2022

Does using lld help this? (I don't think that's the default)

@fredriss
Contributor

Can you check what the size of the linker inputs is? If this is a regression, can you reproduce a working environment and compare the sizes?

@fredriss
Contributor

fredriss commented May 5, 2022

I managed to reproduce the memory usage. In my case, the link step used 11 GB of memory.

Finding the size of the inputs is not trivial given how hidden the build itself is. With some effort I extracted the link line. The linker is taking only one argument which is a response file. In the invocation that was created in my case, the response file contains over 18000 arguments, most of them object files and libraries.

The link line has this structure:

ld.gold [...] -o /output/file [...] -Lxxx -Lyyy [...] --start-group  <all object files and libraries to link>  --end-group

It turns out that <all object files and libraries to link> comes to around 18,000 items. Even though the template app is not small, it doesn't link 18,000 files and libraries: the libraries are repeated thousands of times. For example, -lswiftCore is passed 1058 times.

After manually deduplicating the response file, it comes down to fewer than 2,000 arguments. With this new response file, the link is much faster and uses 500 MB of memory.

It seems like SwiftPM should not add duplicate libraries inside of the --start-group --end-group. Because of the link group, this shouldn't change the semantics (or at the very least, not cause any unresolved symbol errors).
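
The manual deduplication described above can be sketched in a few lines of Python. This is only an illustration, not what SwiftPM or the driver does: it assumes the response file has already been split into one argument per list element, and `dedupe_group_libs` is a hypothetical helper name. It drops repeated -l flags between --start-group and --end-group and keeps the first occurrence of each.

```python
def dedupe_group_libs(args):
    """Drop repeated -l<name> flags that appear inside a linker group.

    Arguments outside --start-group/--end-group, and anything that is
    not a -l flag, are passed through unchanged and in order.
    """
    result = []
    seen = set()
    in_group = False
    for arg in args:
        if arg == "--start-group":
            in_group = True
            seen.clear()
        elif arg == "--end-group":
            in_group = False
        elif in_group and arg.startswith("-l"):
            if arg in seen:
                continue  # already requested inside this group
            seen.add(arg)
        result.append(arg)
    return result


# Made-up argument list, far shorter than the real response file.
args = ["-o", "Run", "--start-group", "main.o", "-lswiftCore", "-lm",
        "a.o", "-lswiftCore", "-lm", "--end-group"]
print(dedupe_group_libs(args))
```

Flags outside a group are deliberately left alone, since there the position of a repeated library can matter.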

@tomerd
Contributor

tomerd commented May 6, 2022

interesting find, thanks a lot @fredriss

@abertelrud @neonichu thoughts?

@tomerd
Contributor

tomerd commented May 6, 2022

this may actually be happening in Driver and not SwiftPM; I think the relevant code is in Driver's GenericUnixToolchain:

      // If we are linking statically, we need to add all
      // dependencies to a library search group to resolve
      // potential circular dependencies
      if staticStdlib || staticExecutable {
        commandLine.appendFlag(.Xlinker)
        commandLine.appendFlag("--start-group")
      }

      let inputFiles: [Job.ArgTemplate] = inputs.compactMap { input in
        // Autolink inputs are handled specially
        if input.type == .autolink {
          return .responseFilePath(input.file)
        } else if input.type == .object {
          return .path(input.file)
        } else if lto != nil && input.type == .llvmBitcode {
          return .path(input.file)
        } else {
          return nil
        }
      }
      commandLine.append(contentsOf: inputFiles)

      if staticStdlib || staticExecutable {
        commandLine.appendFlag(.Xlinker)
        commandLine.appendFlag("--end-group")
      }

@artemcm?

@artemcm
Contributor

artemcm commented May 9, 2022

In the code block above, inputs comes from build planning and consists largely of the build's own object files. The one exception seems to be the outputs of .autolink jobs, which produce a response file.

I can see a scenario where we expand this response file (to generate a different, top-level linker response file?) and its contents get us all of the duplicated -l flags. I'll need to grab a Linux box and take a look.

@artemcm
Contributor

artemcm commented May 10, 2022

Reproduced this with a Swift 5.6.1 toolchain on a test Vapor project on Ubuntu 20.04.
I can see that the linker invocation gets passed as input:

/home/ac/Vapor/toolbox/testmem/.build/x86_64-unknown-linux-gnu/release/App.build/Controllers/Run.autolink

Which contains, for example, 1057 occurrences of -lswiftCore:

grep -o 'swiftCore' /home/ac/Vapor/toolbox/testmem/.build/x86_64-unknown-linux-gnu/release/App.build/Controllers/Run.autolink | wc -l
1057
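
The same kind of count can be taken for every flag at once, which makes it easier to see which libraries dominate the file. A minimal sketch, assuming the .autolink file is a whitespace-separated list of flags; the sample text below is made up rather than taken from the real file, and `flag_counts` is a hypothetical helper. Unlike grep -o, it counts exact arguments rather than substring matches.

```python
from collections import Counter


def flag_counts(response_text):
    """Count how often each whitespace-separated argument occurs."""
    return Counter(response_text.split())


# Made-up sample; a real file would be read with open(path).read().
sample = "-lswiftCore -lFoundation -lswiftCore -lm -lswiftCore"
counts = flag_counts(sample)
for flag, n in counts.most_common():
    print(f"{n:6d}  {flag}")
```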

According to the log, this file is produced by a swift-autolink-extract invocation slightly earlier:

/home/ac/SwiftToolchain/swift-5.6.1-RELEASE-ubuntu20.04/usr/bin/swift-autolink-extract /home/ac/Vapor/toolbox/testmem/.build/x86_64-unknown-linux-gnu/release/Run.build/main.swift.o -o /home/ac/Vapor/toolbox/testmem/.build/x86_64-unknown-linux-gnu/release/Run.build/Run.autolink

But the interesting part is that when I manually run this exact swift-autolink-extract command, I get a result with just 17 entries (which I think is the correct number we expect without any duplication). So we need to figure out why this same command produces a result that duplicates output 1057 times when run as part of the build.

@artemcm
Contributor

artemcm commented May 10, 2022

Ah, no, I misread the log. The autolink-extract invocation above, which produces 17 results, writes release/Run.build/Run.autolink, and that is not the file that gets used. The one with so many entries is release/App.build/Controllers/Run.autolink. This latter (problematic) response file is produced by a different extract invocation in the build:

/home/ac/SwiftToolchain/swift-5.6.1-RELEASE-ubuntu20.04/usr/bin/swift-autolink-extract @/tmp/TemporaryDirectory.4s9CLu/arguments-3872688424222014945.resp

The inputs to this invocation are all object files involved in the build, of which there are 1927. This is the invocation that consistently produces an absurdly large response file. My guess is that swift-autolink-extract just isn't deduplicating its output across all of its inputs, so with this many inputs we end up in this scenario.

The solution here is to teach swift-autolink-extract to deduplicate its output across inputs.
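
The idea can be modelled in a few lines. This is a toy model rather than the actual swift-autolink-extract code: each input object file is represented by the list of flags it asks for, the flags are made up, and `merge_autolink_flags` is a hypothetical name. Emitting each flag only the first time it is seen keeps the output proportional to the number of distinct libraries instead of the number of inputs.

```python
def merge_autolink_flags(per_object_flags):
    """Merge flag lists from many inputs, keeping first occurrences only."""
    seen = set()
    merged = []
    for flags in per_object_flags:
        for flag in flags:
            if flag not in seen:
                seen.add(flag)
                merged.append(flag)
    return merged


# 1927 inputs that all request the same three libraries (made-up flags).
inputs = [["-lswiftCore", "-lFoundation", "-lm"] for _ in range(1927)]
naive = [flag for flags in inputs for flag in flags]
print(len(naive), len(merge_autolink_flags(inputs)))  # 5781 3
```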

@gwynne
Contributor Author

gwynne commented May 10, 2022

Would also be nice if we could teach the linker to deduplicate its inputs a bit.

@fredriss
Contributor

(We don't control the Linux linker, but even if we did:) Technically the linker cannot deduplicate. The order of inputs is significant and putting a library twice in non-consecutive spots has a certain meaning, even inside of a group.
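
A toy model may help show why position matters. The sketch below is a heavy simplification and not how ld.gold is implemented: it treats each archive as a single unit, scans the inputs once from left to right with no group, and pulls in an archive only if it defines a symbol that is still undefined at that point. All file and symbol names are made up.

```python
def unresolved_after_link(order, table):
    """Return the symbols still undefined after one left-to-right pass."""
    defined, undefined = set(), set()
    for name in order:
        kind, defines, needs = table[name]
        # Objects are always linked; an archive is pulled in only if it
        # defines something that is undefined at this point in the scan.
        if kind == "archive" and not (defines & undefined):
            continue
        defined |= defines
        undefined = (undefined | needs) - defined
    return undefined


table = {
    "main.o":   ("object",  {"main"}, {"foo"}),
    "libfoo.a": ("archive", {"foo"},  {"bar"}),
    "libbar.a": ("archive", {"bar"},  set()),
}

print(unresolved_after_link(["main.o", "libfoo.a", "libbar.a"], table))
print(unresolved_after_link(["libbar.a", "main.o", "libfoo.a"], table))
print(unresolved_after_link(["libbar.a", "main.o", "libfoo.a", "libbar.a"], table))
```

In this model the second order leaves `bar` unresolved because libbar.a was visited before anything needed it, and repeating libbar.a at the end fixes that. A linker that silently dropped the second mention would change the result.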

@gwynne
Contributor Author

gwynne commented May 11, 2022

I know, but even given the absurdity of the input, the linker really ought to be able to do better than going pathological in both space and time 😕

artemcm added 11 commits to artemcm/swift that referenced this issue between May 12 and May 25, 2022
Otherwise we can duplicate linker flags across input binaries, which can result in very large linker invocation commands.

Resolves swiftlang#58380
@artemcm
Contributor

artemcm commented Jun 3, 2022

In my experiments #59115 reduces memory consumption when linking a sample Vapor app by ~2GB. This still leaves memory consumption quite high, so this issue is worth keeping open to track further improvements.

@gwynne
Contributor Author

gwynne commented Mar 11, 2023

An update:

I did some builds using both the 5.8 and main nightlies, using a project generated from Vapor's default template (run vapor new --no-commit --no-git --no-leaf --no-fluent LinkTest to generate the same project) and adding the -Xlinker --stats flag to the build to make ld.gold report statistics. Running in Docker (latest Docker Desktop for Mac, on an M1 running latest Ventura, Docker configured with 8 CPUs & 24GB RAM), and using a bind mount. Results for 5.8 and main were so similar that I've included only the former here.

~/LinkTest$ docker run --pull -ti --privileged -e 'TERM=xterm-256color' \
    --mount type=bind,source=$(pwd),target=/src -w/src \
    swiftlang/swift:nightly-5.8-jammy
 ################################################################
 #								#
 # Swift Nightly Docker Image					#
 # Tag: swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-23-a			#
 #								#
 ################################################################
root@168d96cf79e1:/src# swift --version
Swift version 5.8-dev (LLVM 44d4f9d4b49845f, Swift b9562e1a860ec0b)
Target: aarch64-unknown-linux-gnu

debug (swift build -c debug -Xlinker --stats):

/usr/bin/ld.gold: total space allocated by malloc: 167821312 bytes (~160.1 MiB)
/usr/bin/ld.gold: total bytes mapped for read: 6932794562 (~6.5 GiB)
/usr/bin/ld.gold: maximum bytes mapped for read at one time: 6932794562 (~6.5 GiB)
/usr/bin/ld.gold: archive libraries: 2219
/usr/bin/ld.gold: total archive members: 904
/usr/bin/ld.gold: loaded archive members: 2

release (swift build -c release --static-swift-stdlib -Xlinker --stats):

/usr/bin/ld.gold: total space allocated by malloc: 2020044800 bytes (~1.88 GiB)
/usr/bin/ld.gold: total bytes mapped for read: 28915308646 (~26.93 GiB) 
/usr/bin/ld.gold: maximum bytes mapped for read at one time: 28915308646 (~26.93 GiB)
/usr/bin/ld.gold: archive libraries: 17092
/usr/bin/ld.gold: total archive members: 713050
/usr/bin/ld.gold: loaded archive members: 711

I took the response file from the release build and manually deduplicated the -l flags for the following libraries:

  • -lswiftGlibc (appears 738x)
  • -lm, -lpthread, -lutil, -ldl (appear 739x each)
  • -lswiftDispatch, -ldispatch, -lDispatchStubs, -lBlocksRuntime (appear 723x each)
  • -lFoundation, -lCoreFoundation, -licui18nswift, -licuucswift, -licudataswift, -luuid (appear 449x each)

... for a total of 9,265 duplicates removed. I then re-ran the link command with the new response file. The resulting binary executable was identical in byte size, and examination of both versions with llvm-objdump showed no meaningful differences; both also functioned identically and correctly when run normally. As for the effect on linker performance...

/usr/bin/ld.gold: total space allocated by malloc: 202473472 bytes (~193.1 MiB)
/usr/bin/ld.gold: total bytes mapped for read: 375074262 (~357.7 MiB)
/usr/bin/ld.gold: maximum bytes mapped for read at one time: 46575504 (~44.4 MiB)
/usr/bin/ld.gold: archive libraries: 42
/usr/bin/ld.gold: total archive members: 2676
/usr/bin/ld.gold: loaded archive members: 711

So that's an order of magnitude increase in space efficiency (not to mention a 5.5x improvement in time efficiency), a nearly-equivalent decrease in IOPS, and no apparent drawbacks. It seems a foregone conclusion (at least to me) that it's worth seeing if making swift-autolink-extract a little smarter works out. AIUI, it should be a matter of just adding the appropriate names to this map in autolink_extract_main.cpp.
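
For anyone who wants to compare two runs mechanically, here is a small sketch that parses the --stats lines and prints the before/after ratios. The byte counts are copied from the release and deduplicated outputs above; `parse_stats` is a hypothetical helper, and the regular expression assumes the `name: value` layout shown in those outputs.

```python
import re

RELEASE_STATS = """\
/usr/bin/ld.gold: total space allocated by malloc: 2020044800 bytes
/usr/bin/ld.gold: total bytes mapped for read: 28915308646
/usr/bin/ld.gold: maximum bytes mapped for read at one time: 28915308646
/usr/bin/ld.gold: archive libraries: 17092
"""

DEDUPED_STATS = """\
/usr/bin/ld.gold: total space allocated by malloc: 202473472 bytes
/usr/bin/ld.gold: total bytes mapped for read: 375074262
/usr/bin/ld.gold: maximum bytes mapped for read at one time: 46575504
/usr/bin/ld.gold: archive libraries: 42
"""


def parse_stats(text):
    """Map each statistic name to its integer value."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r".*ld\.gold: (.+): (\d+)", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


before = parse_stats(RELEASE_STATS)
after = parse_stats(DEDUPED_STATS)
ratios = {name: before[name] / after[name] for name in before}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x reduction")
```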

(I realize it may not be possible/safe to generically deduplicate platform libraries like libpthread, libm, etc., but just deduping libswiftGlibc, libBlocksRuntime, and the three libdispatch libraries would take care of over 1/3 of the repeats. Tack on the [Core]Foundation and ICU libs and that's almost 3/4. Even if the effect doesn't scale in linear proportion, anything helps. In the much larger project that inspired this ticket in the first place, the linker's RAM usage hits almost 20GiB and takes over 2 minutes to run, and what happens to the compiler's performance for that project in release builds is another ticket altogether...)
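
The duplicate counts quoted above can be checked with a few lines of arithmetic. The occurrence counts come from the list earlier in this comment; the grouping of flags into "Swift runtime" and "Foundation and ICU" sets is my own reading of the paragraph above and is an assumption, as is the helper name `share`.

```python
# (occurrences, flags) exactly as listed in the comment above.
groups = [
    (738, ["-lswiftGlibc"]),
    (739, ["-lm", "-lpthread", "-lutil", "-ldl"]),
    (723, ["-lswiftDispatch", "-ldispatch", "-lDispatchStubs",
           "-lBlocksRuntime"]),
    (449, ["-lFoundation", "-lCoreFoundation", "-licui18nswift",
           "-licuucswift", "-licudataswift", "-luuid"]),
]

# Every occurrence after the first one is a duplicate.
duplicates = {flag: count - 1 for count, flags in groups for flag in flags}
total = sum(duplicates.values())
print(total)  # 9265


def share(flags):
    """Fraction of all removed duplicates that these flags account for."""
    return sum(duplicates[flag] for flag in flags) / total


# Assumed groupings, not spelled out flag by flag in the original text.
swift_runtime = ["-lswiftGlibc", "-lBlocksRuntime", "-lswiftDispatch",
                 "-ldispatch", "-lDispatchStubs"]
foundation_icu = ["-lFoundation", "-lCoreFoundation", "-licui18nswift",
                  "-licuucswift", "-licudataswift"]
print(f"{share(swift_runtime):.1%}")
print(f"{share(swift_runtime + foundation_icu):.1%}")
```

With this grouping the first share is a little over a third. The second depends on which libraries are counted (for example, whether -luuid is included) and comes out somewhat below three quarters.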

@tomerd
Contributor

tomerd commented Mar 12, 2023

cc @compnerd @artemcm who looked at this last time

@tomerd
Contributor

tomerd commented Mar 12, 2023

also cc @MaxDesiatov @etcwilde @al45tair

@AnthonyLatsis AnthonyLatsis added legacy driver Area → compiler: the integrated C++ legacy driver. Succeeded by the swift-driver project swift-autolink-extract Area → compiler → legacy driver: the 'swift-autolink-extract' mode performance labels Mar 13, 2023
MaxDesiatov added a commit that referenced this issue Jun 1, 2023
[5.8] CMake: fix missing `SWIFT_CONCURRENCY_GLOBAL_EXECUTOR`

Explanation: Resolves issues with static linking on Linux
Risk: Medium, affects Linux builds and top-level CMake declarations.
Original PRs: #65795 and #64312 for `main`, #65824 and #64633 for `release/5.9`
Reviewed by: @al45tair @drexin @etcwilde 
Resolves: some of the issues reported in #65097, also resolves #58380
Tests: Added in swiftlang/swift-integration-tests#118

`SWIFT_CONCURRENCY_GLOBAL_EXECUTOR` is defined in `stdlib/cmake/modules/StdlibOptions.cmake`, which is not included during the first pass of evaluation of the root `CMakeLists.txt`. It is available on subsequent evaluations after the value is stored in CMake cache. This led to subtle bugs, where `usr/lib/swift_static/linux/static-stdlib-args.lnk` didn't contain certain flags on clean toolchain builds, but did contain them in incremental builds.

Not having these autolinking flags in toolchain builds leads to errors when statically linking executables on Linux.

Additionally, since our trivial lit tests previously didn't link Dispatch statically, they didn't expose a bug where `%import-static-libdispatch` substitution had a missing value. To fix that I had to update `lit.cfg` and clean up some of the related path computations to infer a correct substitution value.