Do not use SIMD instructions on i686 #31110

ranma42 · 2016-01-22T13:58:54Z

SIMD instructions are not available on all i686 processors and
cause programs to terminate with an illegal instruction error on
older processors.

Clang defaults to compiling for pentium4 when targeting a generic 32-bit x86 architecture. This was used as a precedent for rustc, but I believe some details were missed when doing so:

when compiled for the plain i686 target, Clang does not use any SIMD operation (it uses x87 FPU operations)
when compiling for the plain i686 target, Clang does not emit any SIMD operation

I think we should ensure that it is actually possible to target plain i686 with rust.
I expect this to come up with distros which target i686.

This should fix #14441 (so far I only verified with objdump that SIMD instructions are not used).
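
For anyone wanting to repeat the objdump check, here is a rough sketch of how it could be automated. It assumes AT&T-syntax disassembly text (the `objdump -d` default on x86) and only looks for SSE/AVX register operands, so it is a heuristic rather than a proof; the function name `simd_lines` and the sample lines are made up for illustration.

```rust
/// Returns the disassembly lines that reference an SSE/AVX register.
/// Heuristic: looks for AT&T-style register names only.
fn simd_lines(disassembly: &str) -> Vec<&str> {
    const SIMD_REGS: [&str; 3] = ["%xmm", "%ymm", "%zmm"];
    disassembly
        .lines()
        .filter(|line| SIMD_REGS.iter().any(|reg| line.contains(reg)))
        .collect()
}

fn main() {
    // Hand-written sample in the style of `objdump -d` output (not real data).
    let sample = [
        " 8048400:\tdd 45 08    \tfldl   0x8(%ebp)",
        " 8048403:\tdc 45 10    \tfaddl  0x10(%ebp)",
        " 8048410:\tf2 0f 58 c1 \taddsd  %xmm1,%xmm0",
    ]
    .join("\n");

    let hits = simd_lines(&sample);
    println!("{} line(s) reference SIMD registers", hits.len());
    for line in hits {
        println!("{}", line);
    }
}
```

In practice the input would be the piped output of objdump for the binary or rlib under test; an empty result only means no such register operands were found.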

rust-highfive · 2016-01-22T13:59:07Z

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

MagaTailor · 2016-01-22T15:08:24Z

Clearly you weren't willing to understand the issue and went ahead with this blunt PR.

Even though it was I who kept bugging the team about ISA compatibility, I never dreamed of getting everyone off an SSE2 optimized rlib set so I'm definitely going to play the devil's advocate here!
If the team are interested in a nod towards older x86 architectures I have the right solution ready, feel free to ask!

I'll address the purely technical part then:

Next time llvm changes the i686 target definition to something else, it's going to be back to square one. pentiumpro would have been the explicitly correct choice.
The PR doesn't address the correct CFLAGS/CXXFLAGS during jemalloc and llvm builds which will, in certain circumstances, produce "broken" builds (i.e. still containing SSE2 instructions)

hanna-kruppe · 2016-01-22T16:17:45Z

While "i686" is the wrong name for the target currently so-called (32 bit but with reasonably modern instruction set extensions), that target is important and should be the default. SSE2 is very widely available and has several benefits (smaller and more efficient copying of mid-sized structs, more predictable and generally more accurate float arithmetic, the ability to benefit from autovectorization, and possibly more that I'm forgetting). Certainly a pre-P4 target should exist, but I don't think it should be the default. Perhaps the current "i686" target should be renamed, but this is tricky and will require a long grace window before flipping the switch on the old ("i686") name.

MagaTailor · 2016-01-22T16:33:10Z

@brson
The difference in performance between generic i686 and i586 rlibs would be negligible, therefore a new i586-unknown-linux-gnu target would probably be the preferred solution here. (not what I had in mind though)

In conjunction with adjustable -C target-cpu= it would bring the performance of compiled code in line with what's being proposed here but at the same time could officially settle Debian/Slackware/whatever compatibility & suitability.

ranma42 · 2016-01-22T16:37:08Z

@petevine I would be very surprised if LLVM changed the definition of i686. Let me state it again: LLVM/Clang does not assume that i686 is pentium4 or better, this is an erroneous assumption that was made on the Rust side.

What is the problem with a non-SSE2 compiler/library?
Did anybody observe unacceptable regressions?
I am afraid the change to pentium4 was done without evaluating the performance impact.
If there is a major performance difference, I would expect rustc to also ship with an SSE2-optimized LLVM backend.
(Otherwise, what is the reasoning behind a SSE2 rustc frontend and an x87 LLVM backend?)

@rkruppe In #14441 I also suggested an alternative, that is splitting the 32-bit x86 targets into i686 (for plain i686 aka ppro+) and i786 (aka pentium4+). Do you think that would be a more convenient choice?
Also, why do you expect the need for a long grace window? The change I am proposing is backwards compatible (at least in terms of ABI).

MagaTailor · 2016-01-22T16:50:48Z

Hell, what fun arguing the opposite! There were no erroneous assumptions and no clang factor so you've created a strawman.

Theirs was a conscious decision on the rust team's part, one that I personally didn't like for the sole reason of not being able to build from source on older machines (stage0 snapshots are P4 too).

And the fact neither the downloads page, nor the rust build system warn you about it!
(that's the minimum other projects do in polite society)

ranma42 · 2016-01-22T16:53:41Z

@petevine I was under the assumption that rustc was trying to imitate the behaviour of Clang based on the commit message in 296c74d

hanna-kruppe · 2016-01-22T16:56:35Z

@petevine I find it much more exhausting to read your rather heated replies than arguing this topic, and I don't like the topic very much. I have no moderation duties or powers, I'm just asking as another lowly contributor: Please be more charitable.

@ranma42 As you point out, there is no serious evaluation of the performance (that I'm aware of). But I do see good reasons to assume there will be measurable performance impact on several kinds of code for reasons outlined above. The reason I am suggesting a long-ish grace window (by that I mean a release cycle or two) is twofold:

1. I wouldn't want to suddenly thrust these performance regressions on people, but rather give them time to switch, and
2. More importantly, changes to tooling and filenames (e.g. release artifact names, build targets, build directories, ...) can have significant fallout. See "Change name when outputting staticlibs on Windows" (#29520) as a vaguely related example (though related to the names of rustc outputs).

nikomatsakis · 2016-01-22T17:04:51Z

I don't feel qualified to review or not review this PR. @alexcrichton or @brson seem like more logical choices. Would any of you like to volunteer?

MagaTailor · 2016-01-22T17:06:24Z

@rkruppe My replies in this thread are not heated at all - what are you talking about? Anyway, I have nothing to add as I'd already considered the matter settled a long time ago.

@ranma42 You got your dates wrong, the issue was there before that commit.

ranma42 · 2016-01-22T17:08:36Z

@rkruppe The worry about possible performance regression is justified (most vectorisation opportunities would be lost). In general purpose code (like rustc) I would expect minor changes, but vectorisation-intensive libraries (something like BLAS) would probably show some changes. I will try to do some benchmarking on the rust build itself (compiler + libraries) and on the shootout (and I would be willing to try other benchmarks, if anybody can suggest some that would be particularly significant).

I would expect that this change should not have a visible fallout except on tools relying on the specific opcodes emitted by rustc... but it's better safe than sorry, so this is definitely a change whose impact I would like to discuss, evaluate and test extensively before it is applied.

brson · 2016-01-22T18:59:21Z

Thanks for the PR @ranma42. This is a tricky question that keeps coming up. There are a few factors at play that I'm aware of.

The default i686 architecture uses instructions that weren't always available on i686
Distros build for all i686 so our default doesn't work for them
The performance impact of using i686 vs pentium4 is not clear
We don't have a convenient way to set cpu settings from cargo (though there's an upcoming patch to teach Cargo RUSTFLAGS that could help).
We've discussed adding an i586-unknown-linux-gnu target but decided against based on this reasoning
Changing the behavior for this triple has unknown downstream impact

One thing I'm not clear on: Distros use compatible i686 code for the binaries they distribute but does gcc emit better code when run by users?

@anguslee @sylvestre how is Debian now getting around this problem of rustc i686-unknown-linux-gnu using the wrong defaults for your system?

Regardless of whether we change this triple it seems the need to tweak cpu settings comes up often enough that there should be a more convenient way to do it from Cargo (cc @alexcrichton).

cc @dotdash since you touched this last.

alexcrichton · 2016-01-22T20:41:49Z

@brson yeah I definitely agree that this sort of minor configuration tweak comes up quite often. I think that your RUSTFLAGS patch is probably the best place to start for something like this and we can take it from there if it turns out to not be sufficient.

I've never really known what i686 means, but if it means an older CPU than what we're currently choosing to lots of people then it seems reasonable to move it back. It'd be great if we could point to a canonical document everyone agrees on saying "this is exactly what i686 is" as we could double-check future decisions against that.

ranma42 · 2016-01-22T21:53:46Z

Regarding i586 (vs i686 and i786), an additional data point is the meaning of those triples for other compilers and toolchains.
http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=blob;f=build-aux/config.sub#l967 seems to indicate that at least for autotools i586 stands for pentium, i686 for pentium{pro,2,3}, i786 for pentium4. Additional investigation might be needed, but I would definitely try to align the meaning to the established one.
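
To make the convention concrete, here is a minimal sketch of the mapping as read from config.sub above. The function name `baseline_cpu` is hypothetical, and picking pentiumpro as the representative of pentium{pro,2,3} is my assumption (it is the oldest of the three).

```rust
/// Baseline CPU implied by the architecture field of a triple,
/// following the autotools reading described above.
fn baseline_cpu(triple: &str) -> Option<&'static str> {
    // The architecture is the part before the first '-'.
    let arch = triple.split('-').next()?;
    match arch {
        "i586" => Some("pentium"),
        "i686" => Some("pentiumpro"), // assumption: oldest of pentium{pro,2,3}
        "i786" => Some("pentium4"),
        _ => None, // anything else is outside this sketch
    }
}

fn main() {
    for triple in ["i586-unknown-linux-gnu", "i686-unknown-linux-gnu", "i786-unknown-linux-gnu"] {
        println!("{} -> {:?}", triple, baseline_cpu(triple));
    }
}
```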

ranma42 · 2016-01-23T01:48:23Z

I put some of the outputs of GCC and Clang (versions 4.8.4 and 3.4, from an Ubuntu LTS 14.04.3 VM) for different options in this gist (nb: the versions I tested are quite old, it might make sense to check the latest gcc and Clang compilers, but I had this VM ready for testing)

Clang defaults to generating code for more modern processors, but with -march=i686 it does not generate SIMD instructions.

GCC seems more conservative about the instruction set used by default (no SIMD) and even with -march=pentium4 it will keep using the FPU instructions unless explicitly instructed to do otherwise with -mfpmath=sse. The documentation https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html states that using the last option should result in faster code, but "may break some existing code that expects temporaries to be 80 bits", which is probably the reason for this behaviour.

@brson "Does gcc emit better code when run by users?"
Apparently gcc does not emit better code unless users make an explicit effort to set the target. This might be as simple as adding -march=native, but it does not do it by default.

dotdash · 2016-01-24T18:28:12Z

AFAICT, rustc behaves just like clang. Using -march=i686 with clang is like using -Ctarget-cpu=i686 with rustc. The major difference you're observing is probably that your base system is compiled to be i686-compatible, while rust's distribution currently comes with libraries that require at least a P4-class CPU. Bootstrapping with gcc and RUSTC_FLAGS=-Ctarget-cpu=i686 should result in a compiler and libraries suitable for i686 CPUs, as long as you build your crates with -Ctarget-cpu=i686 as well.

The observation in #14441 that the LLVM binaries don't use SSE instructions likely stems from the fact that LLVM was compiled with gcc, which does default to i686 compatibility. Building LLVM with clang results in code that does use SSE instructions.

Also, on my Debian system, and and libc++ from the i386 pool contains SSE instructions, so a rustc built against that wouldn't work on a non-SSE machine either.

Given all that, I think a step in the right direction here would be to have a less obscure way than using RUSTC_FLAGS to bootstrap a truly i686-compatible rustc, which also takes care of passing the right arguments to the other components that are being built, like LLVM, jemalloc and whatever else there is.

Whether or not we want to provide a "true" i686 rustc built, either as the default or in addition to what we have now, I don't know. If someone could do some benchmarks, that would be nice. If nobody volunteers, maybe I can do it sometime next week.

I think it would be mostly interesting to have rustc bootstrapped for i686, but the benchmark built with a P4 target CPU, to actually see how much performance is lost by having the distribution target older CPUs as a baseline.

@petevine you said that you have a solution ready, could you elaborate on that? I probably missed a number of details here.

MagaTailor · 2016-01-24T19:18:13Z

@dotdash I wasn't going to bother but yours is the first fully competent post in this thread so I must oblige!

The solution, considering there are probably fewer than a dozen people still using non-SSE2 machines and Rust (myself included), should be a source-only one:

configure should accept --disable-SSE2 and -SSE options (it's a false dichotomy that's being put forth here) - leading to P3 and PPro rust codegen, respectively
correct configuration of the C/C++ parts through -march flags
downloading of a separate official stage0 snapshot or making the snapshot itself generic

Apart from that, even if the status quo remains, the downloads page should make it clear SSE2 is required and in case someone starts a naive source build, the snapshot should get tested immediately and not after 5 hours of building LLVM.
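
As an illustration of the "test it immediately" idea, a fail-fast check could look roughly like the sketch below. This is not what the Rust build system does; the names `cpu_has_sse2` and `startup_check` are hypothetical, and the check itself would have to be compiled for a non-SSE2 baseline, otherwise it could hit an illegal instruction before reporting anything.

```rust
/// Runtime detection on x86/x86_64 using the standard library macro.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_has_sse2() -> bool {
    std::arch::is_x86_feature_detected!("sse2")
}

/// On other architectures SSE2 does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_has_sse2() -> bool {
    false
}

/// Turns the detection result into an early, readable error.
fn startup_check(has_sse2: bool) -> Result<(), String> {
    if has_sse2 {
        Ok(())
    } else {
        Err("this build requires an SSE2-capable CPU".to_string())
    }
}

fn main() {
    match startup_check(cpu_has_sse2()) {
        Ok(()) => println!("SSE2 available, continuing"),
        Err(msg) => eprintln!("error: {}", msg),
    }
}
```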

I tried Golang not long ago and that's how you bootstrap an i386 (387 option) compiler for yourself (the official distro defaults to SSE2).

steveklabnik · 2016-01-24T19:38:21Z

wasn't going to bother but yours is the first fully competent post in this thread

When @pnkfelix was calling you out for being rude, this is what he was talking about.

I know that you are frustrated. Please stop taking it out on others. It's not appropriate here.

MagaTailor · 2016-01-24T21:55:23Z

You're probably right - the jab wasn't necessary. Apologies everyone!

Once again, great post @dotdash! and @nikomatsakis, that's some wisdom straight from Pirkei Avot!

ranma42 · 2016-01-24T23:52:51Z

@dotdash I confirm that rebuilding Rust (and its LLVM repo) on the same machine with Clang results in LLVM binaries which use SSE. We might want to ensure that bootstrapping from gcc and Clang results in equivalent binaries (from the point of view of target instruction set). As you mentioned, it would be very convenient if there was a way to pass the appropriate flags to all of the components. It would make it easier to support older targets, but it would also be useful if somebody wanted to sacrifice compatibility in order to make use of the newest operations available on their machine (basically by building everything with -march=native).

I installed Debian jessie i386: the libc++1 package includes SIMD instructions, but other binaries, including libstdc++ and Clang, seem to use x87 instructions and no SIMD operation at all. I wonder if this is intentional or just a consequence of the fact that the libc++1 package is built with clang.

brson · 2016-01-27T02:16:11Z

Given all that, I think a step in the right direction here would be to have a less obscure way than using RUSTC_FLAGS to bootstrap a truly i686-compatible rustc, which also takes care of passing the right arguments to the other components that are being built, like LLVM, jemalloc and whatever else there is.

Isn't this not just a problem of bootstrapping though? The compiled rustc is going to proceed to output unusable binaries when used afterward.

brson · 2016-01-27T02:35:33Z

configure should accept --disable-SSE2 and -SSE options (it's a false dichotomy that's being put forth here) - leading to P3 and PPro rust codegen, respectively

Presumably this would actually permanently override the target spec for i686 targets, so that any time you compiled with one, you would get no SSE2 instructions.

Using -march=i686 with clang is like using -Ctarget-cpu=i686 with rustc

@dotdash I think there is a difference though, in that rustc understands target-triples (I'm guessing gcc/clang don't accept them directly but not sure), and when you pass an 'i686' triple to rustc you might reasonably expect it to behave like -C target-cpu=i686.

MagaTailor · 2016-01-27T08:05:32Z

@brson Indeed, the idea is to leave SSE (1) on the table for the most usable cpus, namely P3's and Athlons, without having to resort to -C (after a corresponding bootstrap).

BTW, the fastest possible 32-bit code (even on a P4) would probably be achieved this way in LLVM:
-C target-cpu=i486 -C target-feature=+sse2

Pentium 4 was a very specific, one-off architecture and it's completely improbable any tool outside of Intel can actually do a good job optimizing for it. So, if you're interested in getting a tiny bit of speedup out of the default 32-bit Rust distribution, that's one idea due to smaller code size.

dotdash · 2016-01-27T10:14:21Z

Given all that, I think a step in the right direction here would be to have a less obscure way than using RUSTC_FLAGS to bootstrap a truly i686-compatible rustc, which also takes care of passing the right arguments to the other components that are being built, like LLVM, jemalloc and whatever else there is.

Isn't this not just a problem of bootstrapping though? The compiled rustc is going to proceed to output unusable binaries when used afterward.

The compiled rustc will produce usable binaries when used with an appropriate -Ctarget-cpu option. Unlike the current state of things where that doesn't even help because the bundled libs aren't usable.

Using -march=i686 with clang is like using -Ctarget-cpu=i686 with rustc

@dotdash I think there is a difference though, in that rustc understands target-triples (I'm guessing gcc/clang don't accept them directly but not sure), and when you pass an 'i686' triple to rustc you might reasonably expect it to behave like -C target-cpu=i686.

Using clang --target i686-unknown-linux-gnu makes it emit 32-bit code targeted at P4 class CPUs, adding -march=i686 changes that to use a plain i686 as the baseline. That's the same result you get with rustc --target i686-unknown-linux-gnu and adding -Ctarget-cpu=i686.
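
One way to see which baseline a given rustc invocation assumed is to ask the compiler itself. The sketch below uses the standard `cfg!(target_feature = "sse2")` check; the function name `built_with_sse2` is made up. I would expect it to report true for the i686-unknown-linux-gnu defaults described here and false when -Ctarget-cpu=i686 is added, but I have not verified that on every toolchain version.

```rust
/// True when the compiler was allowed to assume SSE2 for this build.
/// This reflects compile-time settings, not the CPU the program runs on.
fn built_with_sse2() -> bool {
    cfg!(target_feature = "sse2")
}

fn main() {
    println!("arch: {}", std::env::consts::ARCH);
    println!("SSE2 assumed at compile time: {}", built_with_sse2());
}
```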

I'm not saying that this is necessarily the right way to do things, but given how old those CPUs are, defaulting to a more "recent" set of features and requiring developers targeting old hardware to explicitly specify their target seems like a reasonable choice to me.

brson · 2016-02-10T21:28:51Z

I suggest we do this:

1. Leave the definition of i686-* alone, targeting pentium4+ processors, on the precedent of clang.
2. Add i586-* targets for those that need the target specs. This means they won't have to add their own custom specs or patch the source to change the i686 targets.
3. Don't make any changes to the binaries we publish (yet), nor publish new i586 bins. Those that need this functionality can build their own i586 compiler on a pentium4+ machine.

alexcrichton · 2016-02-11T00:52:22Z

Ah yes the tools team discussed this during triage and came to the conclusions @brson mentioned. @ranma42 would you be ok making this change? It'd probably involve adding a mk/cfg/*.mk file, adding src/librustc_back/target/i586_unknown_linux_gnu.rs, and... I think that may be it?

ranma42 · 2016-02-11T12:30:00Z

@alexcrichton Why just src/librustc_back/target/i586_unknown_linux_gnu.rs? Wouldn't src/librustc_back/target/i586_linux_android.rs, src/librustc_back/target/i586_pc_windows_gnu.rs, and so on be needed too?

@brson I am afraid that such naming might be misleading. If I understood it correctly, you are suggesting that the i586-* targets in rustc will behave just like the corresponding i686-* targets except for the target CPU (-march setting in clang) being i686. This disagrees both with the GNU/autotools convention and with the one proposed in Clang. Even though there seems to be limited agreement on what is the minimum CPU associated with each triple, making an i586-* triple target the i686 cpu does not seem intuitive and AFAICT has no precedent. Specifically, Clang treats the i586-unknown-linux-gnu triple just like i686-unknown-linux-gnu, i.e. it defaults to generating code for pentium4.

I think the Clang defaults are not particularly good choices and I would rather use the more conservative (and compatible) defaults of the GNU compilers/toolchain. For the record, even Clang follows the gcc conventions in some cases, such as Android, while rustc targets pentium4 for this triple.

In addition to compatibility concerns, keeping rustc in sync with Clang requires more effort than just using the most generic CPU for the architecture (see the current Clang logic for just x86 here).
The behaviour of rustc and Clang is already diverging, as Clang has a different default target CPU for x86 triples depending on the OS (even older than i686 on *BSD and Haiku!), while it looks like rustc always defaults to pentium4 except on Darwin.

If GNU conventions (use most generic CPU for target arch) are considered impractical, I would try to go for the ones proposed by Clang. If changing the existing targets is unfeasible, neither GNU nor Clang triples can be used and we are bound to define a new (and incompatible, rustc-specific) set of triples. This strikes me as a very inconvenient choice.

In any case, I would at least try to ensure that a way to build a non-SSE version (or whatever is needed by distros) is well-known and tested. This is desirable anyway if distros start packaging rust for older versions of the x86 arch.

Additionally, I would love it if there were some more information about the behaviour of rustc (there is some, but it is spread through commit messages, internal/users forums and github issues/PR).
When I tried to find out why pentium4 was used in the first place, the best answer I could find was "Clang does, so rustc should". This is consistent with using yonah as the target CPU for i686-apple-darwin, but it seems contradicted in other cases where pentium4 was chosen for consistency with i686-unknown-linux-gnu even though Clang does otherwise.
Would you be ok with a PR that adds comments to rustc target definitions for the rationale of these choices and compares the defaults with gcc/Clang? I can try to write it, but I will likely need some help to collect the information.

MagaTailor · 2016-02-11T14:48:22Z

@alexcrichton @brson
Pretty obvious but mod.rs will need to be updated too and consequently a new stage0 snapshot published.

Once there, to have the ecosystem ready, a few additions like this one alexcrichton/curl-rust@a1e76ec will be necessary.

From my experience with the new armv7 target, here's a shortlist of affected crates:
curl-rust, git2-rs, ssh2-rs, glutin

brson · 2016-02-11T19:06:48Z

If I understood it correctly, you are suggesting that the i586-* targets in rustc will behave just like the corresponding i686-* targets except for the target CPU (-march setting in clang) being i686. This disagrees both with the GNU/autotools convention and with the one proposed in Clang. Even though there seems to be limited agreement on what is the minimum CPU associated with each triple, making an i586-* triple target the i686 cpu does not seem intuitive and AFAICT has no precedent.

@ranma42 OK, good points. Is there another solution you like other than removing sse from the i686 targets that allows people to create compilers to target true i686es without patching the source?

Pretty obvious but mod.rs will need to be updated too and consequently a new stage0 snapshot published.

@petevine How does mod.rs need to be updated, and which mod.rs?

Right now @alexcrichton and I do not want to change the code generation of snapshots. Mostly because we're moving away from snapshots for bootstrapping and toward official releases. We're hoping that those that need these compilers, like distros, will be ok with building them themselves from a machine that can run sse.

MagaTailor · 2016-02-11T19:29:35Z

@brson
I was merely talking about adding the new i586 target to the mod.rs in src/librustc_back/target and releasing a new stage0 snapshot which would include it.

There's definitely going to be no problem/slowdown using the i586 compiler (and rlibs) in tandem with -C target-cpu= on i686 so this solution is completely satisfactory.

alexcrichton · 2016-02-12T00:33:40Z

@ranma42 I was thinking that for now we can probably just add i586-unknown-linux-gnu instead of any of the others as it's the only one being asked for. The other triples could likely be added on demand as well if necessary.

From what you're saying, though, the difference between what we're proposing as rustc's interpretation of i586-unknown-linux-gnu and clang's current one is unfortunate. Is there any triple which enables code generation for 32-bit x86 and disables SSE instructions by default with clang? If so, we could presumably just use that; I think that i586 is only being mentioned here as it was mentioned elsewhere.

Also, with regards to mirroring clang and where all this came from, you're definitely more than welcome to add some documentation! I suspect the workflow for the initial integration of this change looked like:

Realization that Clang defaults to pentium4 on i686-unknown-linux-gnu instead of i686 like rustc did
Another realization that Clang has a different default for OSX
Blanket apply all defaults to all known targets at the time.
As future targets were added, Clang was not consulted and all information was copy/pasted from existing targets.

Which I think may help explain why our defaults may differ from Clang in a few places (but they probably shouldn't). Does that make sense?

alexcrichton · 2016-02-12T00:34:43Z

@petevine yes, to officially support a new triple like this we would need to produce both nightly and snapshot compilers, but we currently don't produce nightlies beyond tier 1 platforms (which this wouldn't be initially), so we would probably support community-built snapshots/nightlies in the near future for any new target added.

MagaTailor · 2016-02-12T00:50:47Z

@alexcrichton
If you're not going to update the snapshot (no changes beyond incorporating the new target list), the 3rd point by @brson is not going to be possible w/o a json target spec. (or a fun hack I'd used originally)

ranma42 · 2016-02-12T10:47:51Z

@ranma42 OK, good points. Is there another solution you like other than removing sse from the i686 targets that allows people to create compilers to target true i686es without patching the source?

Sure :) I like @dotdash suggestion of having a unified way (configure argument?) to pass the appropriate flags when building each component. Actually, it looks like a good idea independently from this issue.

From what you're saying, though, the difference between what we're proposing as rustc's interpretation of i586-unknown-linux-gnu and clang's current one is unfortunate. Is there any triple which enables code generation for 32-bit x86 and disables SSE instructions by default with clang? If so, we could presumably just use that; I think that i586 is only being mentioned here as it was mentioned elsewhere.

Among those supported by rust, i686-apple-darwin and i686-unknown-freebsd:

Triple	Clang target CPU	rustc target CPU
i386-apple-ios	yonah	generic (i686?)
i686-apple-darwin	i686	yonah
i686-linux-android	i686+ssse3	pentium4
i686-pc-windows-gnu	pentium4	pentium4
i686-pc-windows-msvc	pentium4	pentium4
i686-unknown-dragonfly	pentium4	pentium4
i686-unknown-freebsd	i486	pentium4
i686-unknown-linux-gnu	pentium4	pentium4

The data for clang has been collected running clang -target $i -### -S empty.c with the various target triples. The data for rustc has been extracted from the target definitions. As already mentioned, the default CPU for i686-linux-android was downgraded in Clang from core2 to i686, so depending on the version you have on your system, this might be an expected mismatch.
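
To list the discrepancies mechanically, the table can be encoded and filtered as in the sketch below. The data is copied from the table above (with "generic (i686?)" shortened to "generic"); the names `DEFAULTS` and `mismatched_triples` are only for illustration, and a plain string comparison is an assumption about what counts as a mismatch.

```rust
/// (triple, Clang default CPU, rustc default CPU), copied from the table above.
const DEFAULTS: [(&str, &str, &str); 8] = [
    ("i386-apple-ios", "yonah", "generic"),
    ("i686-apple-darwin", "i686", "yonah"),
    ("i686-linux-android", "i686+ssse3", "pentium4"),
    ("i686-pc-windows-gnu", "pentium4", "pentium4"),
    ("i686-pc-windows-msvc", "pentium4", "pentium4"),
    ("i686-unknown-dragonfly", "pentium4", "pentium4"),
    ("i686-unknown-freebsd", "i486", "pentium4"),
    ("i686-unknown-linux-gnu", "pentium4", "pentium4"),
];

/// Triples whose rustc default CPU string differs from the Clang one.
fn mismatched_triples() -> Vec<&'static str> {
    DEFAULTS
        .iter()
        .filter(|(_, clang, rustc)| clang != rustc)
        .map(|(triple, _, _)| *triple)
        .collect()
}

fn main() {
    for triple in mismatched_triples() {
        println!("{}", triple);
    }
}
```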

Also, with regards to mirroring clang and where all this came from, you're definitely more than welcome to add some documentation!

I will start by adding comments in the code which provide the same information I collected in the table above. Providing this information to users through something like "-###" would be awesome, but it certainly involves much more significant changes.

Which I think may help explain why our defaults may differ from Clang in a few places (but they probably shouldn't). Does that make sense?

Yes, that looks like what happened.
Should I close this PR and make a new one to replicate Clang defaults?
This would affect i386-apple-ios, i686-apple-darwin, i686-linux-android, and i686-unknown-freebsd, while i686-unknown-linux-gnu would be left untouched (even though @alexcrichton mentioned that it is the only one people are currently complaining about).
What about non-x86 architectures?

alexcrichton · 2016-02-12T19:00:43Z

I'd be fine merging a PR to align all our targets with whatever the Clang defaults are (e.g. fix the discrepancies you've found here). Unfortunately that still doesn't quite solve the initial problem in this PR (there's no target for no-sse instructions), but I guess if we soup up the build system somehow we could fix that.

ranma42 · 2016-02-13T17:21:25Z

Closing as the actionable items are now handled in other PRs:

"Add a new i586 Linux target" (#31629) adds the i586 target
"Document and update i686 triples" (#31632) synchronises the existing i686 targets with Clang

jayaddison · 2023-03-27T13:23:30Z

Hi @dotdash - in reply to this:

AFAICT, rustc behaves just like clang. Using -march=i686 with clang is like using -Ctarget-cpu=i686 with rustc. The major difference you're observing is probably that your base system is compiled to be i686-compatible, while rust's distribution currently comes with libraries that require at least a P4-class CPU. Bootstrapping with gcc and RUSTC_FLAGS=-Ctarget-cpu=i686 should result in a compiler and libraries suitable for i686 CPUs, as long as you build your crates with -Ctarget-cpu=i686 as well.

I've attempted exporting RUSTC_FLAGS=-Ctarget-cpu=i686 during a rebuild of Debian's rustc package for i686 (not cross-compiling; from within an i386 environment), but it doesn't appear to have had the intended effect.

Note: that was using LLVM during the build (you mention gcc, although I would hope that the stage1 rustc can be built with either gcc or llvm?).

Could you double-check whether RUSTC_FLAGS is the way to achieve this? (I'm looking into it here - I've been grepping the source for RUSTC_FLAGS and haven't found many references to it, outside of a few tests)

dotdash · 2023-03-27T14:08:06Z

@jayaddison You probably also need to set CFLAGS/CXXFLAGS so that clang also targets the right architecture. Sorry, it's been a while since I worked on this and at the moment, I don't have the time to look into this in detail.

jayaddison · 2023-03-27T15:35:03Z

@dotdash That's OK, and I'll check the compiler flags - thank you for the quick response.

jayaddison · 2023-03-31T10:55:32Z

Hi @dotdash - in reply to this:

AFAICT, rustc behaves just like clang. Using -march=i686 with clang is like using -Ctarget-cpu=i686 with rustc. The major difference you're observing is probably that your base system is compiled to be i686-compatible, while rust's distribution currently comes with libraries that require at least a P4-class CPU. Bootstrapping with gcc and RUSTC_FLAGS=-Ctarget-cpu=i686 should result in a compiler and libraries suitable for i686 CPUs, as long as you build your crates with -Ctarget-cpu=i686 as well.

I've attempted exporting RUSTC_FLAGS=-Ctarget-cpu=i686 during a rebuild of Debian's rustc package for i686 (not cross-compiling; from within an i386 environment), but it doesn't appear to have had the intended effect.

Note: that was using LLVM during the build (you mention gcc, although I would hope that the stage1 rustc can be built with either gcc or llvm?).

Could you double-check whether RUSTC_FLAGS is the way to achieve this? (I'm looking into it here - I've been grepping the source for RUSTC_FLAGS and haven't found many references to it, outside of a few tests)

Possibly adding noise, but for the record: I've been able to produce the build results I was looking for by reducing the i686-unknown-linux-gnu CPU target spec from pentium4 to i686.

(Debian itself has previously patched that down to pentiumpro)

Do not use SIMD instructions on i686

0a0a063

SIMD instructions are not available on all of i686 processors and cause programs to terminate on illegal instruction on older processors.

rust-highfive assigned nikomatsakis Jan 22, 2016

ranma42 mentioned this pull request Jan 22, 2016

rustc crashing over illegal instruction #14441

Closed

brson assigned brson and unassigned nikomatsakis Jan 22, 2016

brson mentioned this pull request Jan 22, 2016

default flexible targets path to /etc/rustc/ #31117

Closed

alexcrichton added the T-tools label Jan 27, 2016

MagaTailor mentioned this pull request Feb 9, 2016

x86(64) runtime performance irregularities #31503

Closed

ranma42 mentioned this pull request Feb 13, 2016

Document and update i686 triples #31632

Closed

ranma42 closed this Feb 13, 2016

ranma42 mentioned this pull request Apr 27, 2016

--target should ignore the machine part of the triplet in most cases #33147

Open
