Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rlibs are not bit-identical when building the same crate in different directories with -Zremap-path-prefix #48019

Closed
luser opened this issue Feb 5, 2018 · 7 comments · Fixed by #48641
Labels
A-metadata Area: Crate metadata C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@luser
Copy link
Contributor

luser commented Feb 5, 2018

I was trying to determine if sccache could cache crates built as dependencies of different root crates. When I looked into it I found that the rlibs were not bit-identical despite being built from the same source. Most of the differences were in the debug info because the current directory winds up encoded there. I tried using -Zremap-path-prefix-from=... -Zremap-path-prefix-to=... which removed most of the differences but not all. A small difference in the metadata remains, which from looking at the diff of the serialized metadata @eddyb said was the SVH. @michaelwoerister: do you know why this would be?

I built this test crate twice in different directories using the test-build.sh script. The diff of the itoa rlib from each build looks like this (I generated that using diffoscope.)

Presumably this would block #34902 and also #38322.

@TimNN TimNN added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-metadata Area: Crate metadata labels Feb 6, 2018
@michaelwoerister
Copy link
Member

Lots of stuff goes into the SVH. It might well be that something directory related accidentally goes into it. The SVH is computed here:

pub(super) fn finalize_and_compute_crate_hash(self,
crate_disambiguator: CrateDisambiguator,
cstore: &CrateStore,
commandline_args_hash: u64)
-> (Vec<MapEntry<'hir>>, Svh) {
let mut node_hashes: Vec<_> = self
.hir_body_nodes
.iter()
.map(|&(def_path_hash, dep_node_index)| {
(def_path_hash, self.dep_graph.fingerprint_of(dep_node_index))
})
.collect();
node_hashes.sort_unstable_by(|&(ref d1, _), &(ref d2, _)| d1.cmp(d2));
let mut upstream_crates: Vec<_> = cstore.crates_untracked().iter().map(|&cnum| {
let name = cstore.crate_name_untracked(cnum).as_str();
let disambiguator = cstore.crate_disambiguator_untracked(cnum)
.to_fingerprint();
let hash = cstore.crate_hash_untracked(cnum);
(name, disambiguator, hash)
}).collect();
upstream_crates.sort_unstable_by(|&(name1, dis1, _), &(name2, dis2, _)| {
(name1, dis1).cmp(&(name2, dis2))
});
let (_, crate_dep_node_index) = self
.dep_graph
.with_task(DepNode::new_no_params(DepKind::Krate),
&self.hcx,
((node_hashes, upstream_crates),
(commandline_args_hash,
crate_disambiguator.to_fingerprint())),
identity_fn);
let svh = Svh::new(self.dep_graph
.fingerprint_of(crate_dep_node_index)
.to_smaller_hash());
(self.map, svh)
}

@infinity0
Copy link
Contributor

infinity0 commented Feb 12, 2018

cc #34902

This is similar to the GCC build-path issue where other programs (and in the past, GCC's DW_AT_producer) likes to embed CFLAGS into the output. Suggested options:

  1. strip out -Zremap-path-prefix away from that SVH calculation - but other programs that also embed RUSTFLAGS or similar, would also need similar logic
  2. support the BUILD_PATH_PREFIX_MAP envvar or something similar
  3. have cargo support a constant flag (whose value does not depend on build-path) that figures out the right -Zremap-path-prefix for rustc so they don't need to appear in RUSTFLAGS. you'd also need to implement (1) but at least the caveat (about other programs) doesn't apply.

@michaelwoerister
Copy link
Member

michaelwoerister commented Feb 12, 2018

#48162 addresses (1), which is kind of orthogonal to (2) and (3) because the SVH will always be part of the output binary and it must change if the result of the mapping changes.

kennytm added a commit to kennytm/rust that referenced this issue Feb 14, 2018
…omatsakis

Handle path prefix mapping in a more stable way when computing the crate hash

This hopefully fixes issue rust-lang#48019.

cc @luser @infinity0
@luser
Copy link
Contributor Author

luser commented Feb 20, 2018

With the latest Rust nightly:

$ rustc +nightly --version
rustc 1.25.0-nightly (27a046e93 2018-02-18)

which includes the fix from #48162, the outputs from my test script are still not the same:

$ diffoscope /tmp/snippet-build-{1,2}/debug/deps/libitoa-13dadf23f55f5c52.rlib
--- /tmp/snippet-build-1/debug/deps/libitoa-13dadf23f55f5c52.rlib
+++ /tmp/snippet-build-2/debug/deps/libitoa-13dadf23f55f5c52.rlib
├── file list
│ @@ -2,14 +2,14 @@
│  ----------   0        0        0        0 1970-01-01 00:00:00.000000 //
│  ?rw-r--r--   0        0        0     8888 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa0.rcgu.o
│  ?rw-r--r--   0        0        0     6072 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa1.rcgu.o
│  ?rw-r--r--   0        0        0     4168 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa2.rcgu.o
│  ?rw-r--r--   0        0        0    69312 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa3.rcgu.o
│  ?rw-r--r--   0        0        0     4368 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa4.rcgu.o
│  ?rw-r--r--   0        0        0     5936 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa5.rcgu.o
│ -?rw-r--r--   0        0        0    24170 1970-01-01 00:00:00.000000 rust.metadata.bin
│ +?rw-r--r--   0        0        0    24171 1970-01-01 00:00:00.000000 rust.metadata.bin
│  ?rw-r--r--   0        0        0     2578 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa0.rcgu.bc.z
│  ?rw-r--r--   0        0        0     3200 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa1.rcgu.bc.z
│  ?rw-r--r--   0        0        0     2236 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa2.rcgu.bc.z
│  ?rw-r--r--   0        0        0    17086 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa3.rcgu.bc.z
│  ?rw-r--r--   0        0        0     2348 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa4.rcgu.bc.z
│  ?rw-r--r--   0        0        0     2918 1970-01-01 00:00:00.000000 itoa-13dadf23f55f5c52.itoa5.rcgu.bc.z
├── rust.metadata.bin
│ @@ -1501,11 +1501,11 @@
│  00005dc0: 5854 0000 fe54 0000 1855 0000 c055 0000  XT...T...U...U..
│  00005dd0: da55 0000 8356 0000 9d56 0000 4657 0000  .U...V...V..FW..
│  00005de0: 6057 0000 0958 0000 2358 0000 cc58 0000  `W...X..#X...X..
│  00005df0: e658 0000 8f59 0000 a959 0000 525a 0000  .X...Y...Y..RZ..
│  00005e00: 6c5a 0000 155b 0000 2f5b 0000 d85b 0000  lZ...[../[...[..
│  00005e10: f25b 0000 9b5c 0000 0469 746f 6118 7838  .[...\...itoa.x8
│  00005e20: 365f 3634 2d75 6e6b 6e6f 776e 2d6c 696e  6_64-unknown-lin
│ -00005e30: 7578 2d67 6e75 d4e0 dc83 d683 979e 4011  ux-gnu........@.
│ -00005e40: 3466 e780 3d23 b56a e700 93e9 e23c dc00  4f..=#.j.....<..
│ -00005e50: 0000 0000 0ad2 bb01 0000 0000 02c0 0184  ................
│ -00005e60: 0302 b212 0a06 5adf a101                 ......Z...
│ +00005e30: 7578 2d67 6e75 81e8 88b6 97cf a0dd e301  ux-gnu..........
│ +00005e40: 1134 66e7 803d 23b5 6ae7 0093 e9e2 3cdc  .4f..=#.j.....<.
│ +00005e50: 0000 0000 000a d2bb 0100 0000 0002 c001  ................
│ +00005e60: 8403 02b2 120a 065a dfa1 01              .......Z...

@alexcrichton
Copy link
Member

I think we may still be hashing -L arguments?

$ rustc +nightly -L dependency=/bar1 /dev/null --crate-type rlib && sha1sum libnull.rlib
b7aae8f50e259300dd62275d8d1de55e1bc13cdb  libnull.rlib
$ rustc +nightly -L dependency=/bar2 /dev/null --crate-type rlib && sha1sum libnull.rlib     
f91e3c3a28aef26cc06ad789339822b22e0f5121  libnull.rlib

@michaelwoerister
Copy link
Member

Yes, it seems we do:

libs: Vec<(String, Option<String>, Option<cstore::NativeLibraryKind>)> [TRACKED],

Good catch! Not sure if we have to hash those at this point. The tracking system should catch resulting changes later on, if we don't have any holes in the tracking. This needs some review.

@jsgf
Copy link
Contributor

jsgf commented Feb 22, 2018

Edit: never mind - seems fixed now that I've rebased to master.

I found that (after implementing it) --remap-path-prefix still affects the SVH, even if the mapping does nothing:

rustc --remap-path-prefix=lhgkjsdhgf=uxyciuhxsnd --crate-type rlib foo.rs

will generate different metadata hashes if I change the mapping, even though it doesn't affect anything and nothing else changes.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 1, 2018
Currently rustc isn't always the best at producing deterministic builds of a
crate when the source directory of a crate is changed. This is happening due to
what appears two different sources:

* First the `-L` paths passed to rustc are hashed into the crate hash. These
  paths through Cargo are typically absolute paths that can vary if the build
  directory changes.

* Next the paths passed to `--extern` are also hashed which like `-L` can change
  if the build directory changes.

This commit fixes these two sources of nondeterminism by ensuring that avoiding
tracking the hashes of these arguments on the command line. For `-L` paths
they're either related to loading crates (whose hashes are tracked elsewhere) or
native librarise used in the linking phase (which isn't incremental). The
`--extern` paths are similar in that they're related to crate resolution which
is already tracked independently of the command line arguments.

Closes rust-lang#48019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-metadata Area: Crate metadata C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants