[RFC] Try to ensure we don’t emit duplicate symbols #22811

nagisa · 2015-02-25T19:09:42Z

Ok, this was a hard one, and the description will also be somewhat long. We provide tools to tell what exact symbols to emit for any fn or static, but don’t quite check if that won’t cause any issues later on. Some of the issues include LLVM mangling our names again and our names pointing to wrong locations, us generating dumb foreign call wrappers, linker errors, extern functions resolving to different symbols altogether (extern {fn fail();} fail(); in some cases calling fail1()), etc.

Before the commit we had a function called note_unique_llvm_symbol, so it is clear somebody was aware of the issue at some point, but the function was barely used, mostly in irrelevant locations.

I published this PR mainly as RFC, because I’m not sure about approach taken here, and want to hear people’s involved with trans thoughts in general.

P.S. This is [breaking-change] because it makes dumbly written code (such as issue-15562.rs test) properly invalid.
P.S.S. this in its current form fixes all those issues about #[no_mangle] on trait methods.
P.S.S.S. were this PR received positively, I’d like to clean it up some before merge.

rust-highfive · 2015-02-25T19:09:48Z

r? @pcwalton

(rust_highfive has picked a reviewer for you, use r? to override)

huonw · 2015-02-25T23:45:00Z

I vaguely recall parallel codegen was broken due to duplicate symbols or something, maybe this helps with that? (cc @nick29581)

(Sorry for vagueness.)

nrc · 2015-02-26T01:15:24Z

cc @epdtry - was that the problem with parallel codegen? Does this help?

nagisa · 2015-02-26T07:59:01Z

It probably does not, since the compiler still can pass the checks for the same symbol multiple times and then register all of them simultaneously. It would need locking in trans to fix that, AFAICT.

nagisa · 2015-02-26T08:44:39Z

Hmm, probably checking for a symbol and registering one with LLVM at the same time (with a global lock, probably) and then using the registered ValueRef everywhere in trans would be a better choice overall. That would make parallel codegen correct as well at least wrt symbol clashes.

nagisa · 2015-02-26T22:22:59Z

OK, I think I got what the issue with parallel codegen was and it is, on the retrospective, very obvious. The thing is that we use codegen-units number of LLVM contexts during trans and those LLVM contexts are mostly isolated from each other during the trans and optimisation phases. Objects produced by those contexts are merged during linking which either fails or does nothing sensible in presence of duplicate symbols.

alexcrichton · 2015-02-27T01:59:20Z

I vaguely recall parallel codegen was broken due to duplicate symbols or something

The specific issue in question I believe was #18243, but I don't think that a small reproduction was ever generated. I do think, however, that -C codegen-units=N is broken today on master so this might enable it to work in stage1+. It also depends on what linker you're using I think... (fun times!)

Hmm, probably checking for a symbol and registering one with LLVM at the same time (with a global lock, probably) and then using the registered ValueRef everywhere in trans would be a better choice overall. That would make parallel codegen correct as well at least wrt symbol clashes.

Note that the parallel part of parallel codegen is only the LLVM passes. We continue to translate the entire AST serially, the strategy was just to round-robin translations into one of N LLVM contexts. Each context is then independently serialize.

Along those lines it should be possible to have a shared global context (I think this already exists?) which keeps track of symbol names with unsynchronized access (as all translation is done serially).

Objects produced by those contexts are merged during linking which either fails or does nothing sensible in presence of duplicate symbols.

I believe this is indeed the problem! In theory though we should never be producing duplicate symbols. If I remember correctly it was one of our own compiler-generated symbols, not a user-input symbols (so this may not help too too much).

I published this PR mainly as RFC, because I’m not sure about approach taken here, and want to hear people’s involved with trans thoughts in general.

I think this is a great idea! We've run into so may bugs in the past of duplicate names in codegen and they're always nasty and terrible to both find and fix.

I think the strategy taken here is relatively ok, but I would prefer to see the calls to functions like LLVMAddGlobal or register_fn to all get funneled through one common location. This one common location could then keep track of the symbols previously registered (and may pick up more instances here and there. Basically, if we're going to add this, I think we will want it to be as robust as possible which means pushing it as far down into the LLVM stack as possible.

nagisa · 2015-02-27T06:04:38Z

I would prefer to see the calls to functions like LLVMAddGlobal ... to all get funneled through one common location

Yeah, that's exactly the strategy I am pursuing since yesterday, now.
Will post a new patchset sometime soon.

Along those lines it should be possible to have a shared global context (I think this already exists?) which keeps track of symbol names with unsynchronized access (as all translation is done serially).

We have a shared context for metadata, will investigate whether that
can be used for keeping track of the symbols as well.

On 27 February 2015 at 03:59, Alex Crichton notifications@github.com wrote:

I vaguely recall parallel codegen was broken due to duplicate symbols or
something

The specific issue in question I believe was #18243, but I don't think that
a small reproduction was ever generated. I do think, however, that -C
codegen-units=N is broken today on master so this might enable it to work in
stage1+. It also depends on what linker you're using I think... (fun times!)

Hmm, probably checking for a symbol and registering one with LLVM at the
same time (with a global lock, probably) and then using the registered
ValueRef everywhere in trans would be a better choice overall. That would
make parallel codegen correct as well at least wrt symbol clashes.

Note that the parallel part of parallel codegen is only the LLVM passes. We
continue to translate the entire AST serially, the strategy was just to
round-robin translations into one of N LLVM contexts. Each context is then
independently serialize.

Along those lines it should be possible to have a shared global context (I
think this already exists?) which keeps track of symbol names with
unsynchronized access (as all translation is done serially).

Objects produced by those contexts are merged during linking which either
fails or does nothing sensible in presence of duplicate symbols.

I believe this is indeed the problem! In theory though we should never be
producing duplicate symbols. If I remember correctly it was one of our own
compiler-generated symbols, not a user-input symbols (so this may not help
too too much).

I published this PR mainly as RFC, because I’m not sure about approach taken
here, and want to hear people’s involved with trans thoughts in general.

I think this is a great idea! We've run into so may bugs in the past of
duplicate names in codegen and they're always nasty and terrible to both
find and fix.

I think the strategy taken here is relatively ok, but I would prefer to see
the calls to functions like LLVMAddGlobal or register_fn to all get funneled
through one common location. This one common location could then keep track
of the symbols previously registered (and may pick up more instances here
and there. Basically, if we're going to add this, I think we will want it to
be as robust as possible which means pushing it as far down into the LLVM
stack as possible.

—
Reply to this email directly or view it on GitHub.

Regards,
Simonas Kazlauskas

bors · 2015-03-03T17:01:50Z

☔ The latest upstream changes (presumably #22532) made this pull request unmergeable. Please resolve the merge conflicts.

nagisa · 2015-03-03T17:02:49Z

@bors not like I care 😒

@alexcrichton

We provide tools to tell what exact symbols to emit for any fn or static, but don’t quite check if that won’t cause any issues later on. Some of the issues include LLVM mangling our names again and our names pointing to wrong locations, us generating dumb foreign call wrappers, linker errors, extern functions resolving to different symbols altogether (`extern {fn fail();} fail();` in some cases calling `fail1()`), etc. Before the commit we had a function called `note_unique_llvm_symbol`, so it is clear somebody was aware of the issue at some point, but the function was barely used, mostly in irrelevant locations. Along with working on it I took liberty to start refactoring trans/base into a few smaller modules. The refactoring is incomplete and I hope I will find some motivation to carry on with it. This is possibly a [breaking-change] because it makes dumbly written code properly invalid. This fixes all those issues about incorrect use of #[no_mangle] being not reported/misreported/ICEd by the compiler. NB. This PR does not attempt to tackle the parallel codegen issue that was mentioned in #22811, but I believe it should be very straightforward in a follow up PR by modifying `trans::declare::get_defined_value` to look at all the contexts. cc @alexcrichton @huonw @nrc because you commented on the original RFC issue. EDIT: wow, this became much bigger than I initially intended.

nagisa added 4 commits February 25, 2015 01:37

Validate export_name attribute syntax

6044f87

Properly track functions added to LLVM

d25c136

Error when symbols clash in some scenarios

33c03c5

Add tests for duplicate symbol errors

47d0768

rust-highfive assigned pcwalton Feb 25, 2015

nagisa mentioned this pull request Mar 3, 2015

The war of symbol and symbol (the PR dedicated to duplicate symbols breaking rustc in unexpected ways) #23011

Merged

nagisa closed this Mar 3, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Try to ensure we don’t emit duplicate symbols #22811

[RFC] Try to ensure we don’t emit duplicate symbols #22811

nagisa commented Feb 25, 2015

rust-highfive commented Feb 25, 2015

huonw commented Feb 25, 2015

nrc commented Feb 26, 2015

nagisa commented Feb 26, 2015

nagisa commented Feb 26, 2015

nagisa commented Feb 26, 2015

alexcrichton commented Feb 27, 2015

nagisa commented Feb 27, 2015

bors commented Mar 3, 2015

nagisa commented Mar 3, 2015

[RFC] Try to ensure we don’t emit duplicate symbols #22811

[RFC] Try to ensure we don’t emit duplicate symbols #22811

Conversation

nagisa commented Feb 25, 2015

rust-highfive commented Feb 25, 2015

huonw commented Feb 25, 2015

nrc commented Feb 26, 2015

nagisa commented Feb 26, 2015

nagisa commented Feb 26, 2015

nagisa commented Feb 26, 2015

alexcrichton commented Feb 27, 2015

nagisa commented Feb 27, 2015

bors commented Mar 3, 2015

nagisa commented Mar 3, 2015