reduce the final binary size #4436
This change increases compile times and removes optimizations provided by loop vectorization, so there are some drawbacks. Disabling incremental compiling is especially of note.
@zleyyij Hi, thanks for your comment. Compared to GNU coreutils, which is 10 MiB, I guess it is a huge disadvantage for uutils to install 20 MiB for the "close-to-original-featureset" binary.
Your disagreement here is interesting, because both perspectives make sense. I think we can cater to both use cases by providing multiple profiles. Maybe a I find the most interesting changes regarding binary size are more in the direction of removing dependencies and those kinds of things.
This is not really an issue on release builds is it? Anyway, a separate profile would solve this too. Let's get into the specific changes though.
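For illustration, a separate size-focused profile could look roughly like the sketch below. The profile name `release-small` and the specific values are assumptions chosen for the example, not settings taken from this PR; `inherits`, `opt-level`, `lto`, `codegen-units` and `incremental` are standard Cargo profile keys.

```toml
# Hypothetical size-oriented profile; name and values are illustrative only.
[profile.release-small]
inherits = "release"   # start from the stock release profile
opt-level = "z"        # optimize for size ("s" is the less aggressive variant)
lto = true             # whole-program ("fat") LTO across all crates
codegen-units = 1      # fewer units: slower builds, more optimization opportunities
incremental = false    # already the default for release; shown for clarity
```

Such a profile would be selected with `cargo build --profile release-small`, leaving the default `release` profile untouched.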
There's no For reference, I was also looking at this: https://github.com/johnthagen/min-sized-rust
There's also some previous discussion in this issue: #747
Well, this means 80% of people will not notice it, because by default they will compile the "release" target.
I'm building coreutils with "-Oz" and it works as expected on production systems. On embedded systems, those couple of KiB/MiB of saved space mean a lot. Anyway, what kind of outcome should that comparison of different opt-levels give? I can assure you that "-O3" is not going to bring any significant speed improvement or a big difference in size.
LLVM/clang offers ThinLTO. Thin gives the best outcome for embedded systems compared to full or fat. Fat in particular makes no sense here, as it keeps both intermediate code and normally compiled code, unless someone likes bloatware.
Maybe, but this depends on documentation too. And we could communicate this with distros that are targeted at embedded systems so they can enable it by default.
I understand, but that's a specific use case. I don't think the default configuration of uutils should be for embedded.
I'd like to know what tradeoff we're making exactly. The problem with setting multiple parameters at the same time (like you do in this PR) is that it's hard to guess the impact of each individual setting.
Could you clarify this part?
Btw, I'm trying to run some different combinations of these settings myself at the moment. I'll report back in a bit.
Embedded is one case. Imagine the user experience: with uutils you get a binary twice as big and a less complete feature set compared to GNU coreutils. This project advertises itself as a replacement for coreutils, so let's give fewer excuses for GNU coreutils to be used.
Well, I guess this is explained here
I can't find what you're talking about. What I understand from that page is that there's a "fat" object with all the info necessary for compiling that is kept around. But after compilation that can be thrown away, right? So I don't see how that's bad for embedded.
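For reference, Cargo's `lto` profile key accepts the values below. This is a sketch of the options being compared, not the setting used in this PR; only one line should be active at a time.

```toml
[profile.release]
# lto = false    # default: "thin local" LTO within each crate's codegen units only
# lto = "thin"   # ThinLTO across crates: faster to link than fat
lto = "fat"      # same as `lto = true`: whole-program LTO across all crates
# lto = "off"    # no LTO at all
```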
GNU testsuite comparison:
So, I've run my experiment. I ran the compilation for each combination of the following parameters:
It took a while, because that's 54 combinations 😄 Unsurprisingly, the best combination is:
That's just 5.6MB! So that's exciting. For fun, here's the worst combination:
But, more interestingly, I've analyzed the average change in binary size due to each of these parameters compared to the "baseline", which is the option that generates the largest binary (e.g.
Some conclusions from this experiment:
I think
Which gives us the following size:
And then we make a custom profile Do you agree?
I like that
Hi,
That was the obvious result, given what the FAT mode for LTO does.
Sounds like a glitch, or Rust is producing some unexpected build outputs compared to LLVM/clang :)
Strange, as above: "-Oz" is LLVM/clang specific and does more aggressive size optimizations than the widely known "-Os". There could be two things here: either Rust produces some unexpected build outputs, or the "-Oz" optimization level is somehow broken in LLVM/clang. I remember that at OpenMandriva Lx we set "-Oz" as the default optimization level after we switched to LLVM/clang as the default compiler in 2015 :)
Looking at the CI, it moves from:
=>
That's because the PR matches roughly these settings:
@tpgxyz Was this difference intentional? And if so, what is your reasoning for using these settings?
I think
Yeah it is strange. Luckily the differences between One thing that we should maybe think about before we commit to this is bug reports. If we strip the binary to the absolute minimum by default (no debuginfo, no panic unwind), then the bug reports we receive will probably not contain a lot of useful information. We would have to ask people to install a debug version and run that to get a proper backtrace, which is a bit cumbersome.
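To make the trade-off concrete, these are the Cargo profile keys that control how much diagnostic information survives in the binary. The values shown are one possible middle ground, assumed for illustration and not proposed anywhere in this thread.

```toml
[profile.release]
strip = "debuginfo"   # drop debug sections but keep the symbol table; "symbols" (or true) strips both
panic = "unwind"      # default; "abort" is smaller but panics no longer unwind the stack
```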
Welcome to the world of Linux distributions, then. I'd say around 80% of distributions ship software stripped of debuginfo in a form of
When the uutils version of coreutils is widely adopted by distributions, you should expect that most issue reports will not contain debuginfo details, as end users do not install debuginfo packages, and even if they do, running gdb is a hurdle. So it is up to you to be ready before wide adoption, as I assume the goal is to replace GNU coreutils in the near future :)
That's true, but that's not an entirely convincing argument to strip by default. Distributions can already do whatever they prefer and I would actually encourage them to strip the binary. But people installing the uutils/coreutils via Also, it's worth noting that this project is very much a work in progress, so we'd expect more bug reports than say after a 1.0 release. So, it could be an option to wait with stripping the debuginfo until we're more stable. Note that I'm not really disagreeing with you, I just wanted us to consider this before committing to stripping the binary. And I want the opinions of other maintainers as well (@sylvestre what do you think?). Other projects don't do a lot of these size optimizations themselves either.
And there are some issues on their repo with very similar discussion to this one:
Also, coming back to what you said before:
To improve this case, at least for distributions, we could include a page on packaging uutils/coreutils in the online docs (e.g. how to strip the binary if we don't do that by default, what features to include, what the package should be called, etc.).
I think we should match what other binaries are doing
Yes, I think I'll open another PR with a default like ripgrep and a separate profile for small releases, and include some documentation for package maintainers.
Do we still want to do something here?
Looks like no, as a different approach was chosen to achieve the same goal.
Hi, looks like the release binary can be a little bit smaller.
Compared to GNU coreutils compiled with corresponding CFLAGS/LDFLAGS and the same LLVM toolchain 15.0.7