
Dump results of analysis phase as CSV (DXR support in Rust) #13222

Merged: 1 commit merged into rust-lang:master on Jun 13, 2014

Conversation

@nrc (Member) commented Mar 31, 2014

Adds a -Z flag, save-analysis, which runs after the analysis phase of the compiler and saves a bunch of info into a CSV file for the crate. This is designed to work with the DXR code browser, but it is frontend-independent; that is, this info should be useful for all kinds of code browsers, IDEs, or other tools.

I need to squash commits before landing (there will probably be a fair few to come); please ignore them for now and just comment on the changes.

// option. This file may not be copied, modified, or distributed
// except according to those terms.

// Output a CSV file containing the output from rustc's analysis. The data is
A reviewer (Member) commented on this diff:

This could be a doc comment with //!.

@huonw (Member) commented Mar 31, 2014

(Do you happen to have a DXR instance displaying the output of this?)

@nrc (Member, Author) commented Mar 31, 2014

@huonw I'm working on it :-) Need to land the changes to DXR and then set up the instance, hopefully later this week, all being well. I hope I can get it up soon so you can appreciate the joy this patch will bring :-)

@alexcrichton (Member) commented:

It seems unfortunate to bake this support directly into the compiler itself. Most usage of rustc won't be looking for DXR support.

A suggestion by @thestinger on IRC was to put this support in a separate tool inside the rust repo, if maintenance outside of the repo is too difficult.

@nrc (Member, Author) commented Mar 31, 2014

I fear I have been selling this badly. Please think of this not as DXR support, but as something like dumping the JSON representation of the AST, only with semantic info too (and in CSV format, not JSON). There is nothing actually DXR-specific here, and if there is by mistake, then I should change it. DXR is just one way to use this info.

That said, I wouldn't mind moving this into a different crate if you think that is a better place for it.

@nrc changed the title from "DXR support in Rust" to "Dump results of analysis phase as CSV (DXR support in Rust)" on Mar 31, 2014
@brson (Contributor) commented Apr 1, 2014

What relationship does this have to the analysis already included in the crate metadata?

@nikomatsakis (Contributor) commented:

I'd love to see this support land somewhere. I don't have a strong opinion about whether it lands in rustc or as an external tool. In principle, it'd be nice if rustc included a nice API for accessing this sort of data in a reliable way -- but that's a long way off. I guess having semantic tools like dxr and rustdoc in tree at least gives us a clear idea of what the requirements are.

@nrc (Member, Author) commented Apr 1, 2014

@brson - there is some overlap; if I knew then what I know now, I would have leveraged some of the metadata code for this. But it is not duplication: in particular, this info covers expressions in function bodies, which metadata misses, and it has more detail about intermediate forms. E.g., in foo::bar::baz, metadata only contains info about baz, whereas here we have info on all three idents (although not as much as I would like).

@nikomatsakis Just to clarify, most of DXR is out of tree (in the DXR repo); what is here is just a dump of analysis info in CSV format.

@brson (Contributor) commented Apr 1, 2014

This type of thing is fine with me, I think, as long as it doesn't impose much maintenance burden.

@brson (Contributor) commented Apr 12, 2014

Let's move this forward. It's very self-contained. No big deal.

@nick29581 I agree with Huon about the doc comment. Can you convert it? The issue with match is very subjective.

With those 'rebase' commits I'm guessing the intermediate commits are broken. The history here looks a bit messy anyway, so can you rewrite it into one or a few commits?

// Return all non-empty prefixes of a path.
// For each prefix, we return the span for the last segment in the prefix and
// a str representation of the entire prefix.
fn process_path_prefixes(&self, path: &ast::Path) -> ~[(Span, ~str)] {
A reviewer (Member) commented on this diff:

These ~[] should all be Vec.
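
For orientation, here is a stand-alone sketch (in present-day Rust, not the PR's actual code) of what `process_path_prefixes` computes once the reviewer's `Vec` suggestion is applied: for a path like `foo::bar::baz` it yields `foo`, `foo::bar`, and `foo::bar::baz`, each paired with the span of its last segment. The function name `path_prefixes` and the placeholder `Span` type are illustrative assumptions.

```rust
// Standalone illustration of what `process_path_prefixes` computes (not the
// PR's actual implementation). `Span` here is a placeholder type.
#[derive(Clone, Copy, Debug)]
struct Span;

fn path_prefixes(segments: &[(&str, Span)]) -> Vec<(Span, String)> {
    let mut prefixes = Vec::new();
    let mut text = String::new();
    for (ident, span) in segments {
        if !text.is_empty() {
            text.push_str("::");
        }
        text.push_str(ident);
        // Pair the textual prefix with the span of its last segment.
        prefixes.push((*span, text.clone()));
    }
    prefixes
}

fn main() {
    let segs = [("foo", Span), ("bar", Span), ("baz", Span)];
    for (_, prefix) in path_prefixes(&segs) {
        println!("{prefix}"); // foo, foo::bar, foo::bar::baz
    }
}
```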


match fs::mkdir_recursive(&root_path, io::UserRWX) {
    Err(_) => {
        sess.bug(format!("Could not create directory {}",
A reviewer (Member) commented on this diff:

I think this is just an error (not a bug), since it's almost certainly a problem on the user's side. Also, this should print the error too:

Err(e) => sess.err(format!("Could not create directory {}: {}",
                           root_path.display(), e).as_slice())

@huonw (Member) commented May 25, 2014

I'm in favour of this, and I think the patch can be landed (modulo a few final nits) and then iteratively improved/extended/refactored if necessary, especially since it's so self-contained. However, it's quite large so someone else might have another suggestion for how to handle it.

I would be super-interested in seeing a regularly updated version of this hosted somewhere (part of the nightly build? I guess that adds yet another dependency (dxr) to the whole process. :( ). Presumably that would be assisted by targets in the makefile too, e.g. imitating make check:

make source-analysis
make source-analysis-stage2
make source-analysis-stage2-std
make source-analysis-stage2-rustc
# etc.

(seems fine to postpone.)

@nrc (Member, Author) commented May 27, 2014

@huonw the plan is that Mozilla (web eng) will host an index of Rust and Servo which will be re-indexed daily. Going to take another few weeks, though. Do you also want to be able to download the raw data? We could probably arrange that.

Given that indexing is pretty much orthogonal to the make target, is there any advantage to having a make target for this as well as a -Z flag?

@nrc (Member, Author) commented May 27, 2014

Rebased and addressed @huonw's comments.

@huonw, @brson r? (@brson - @huonw wanted a second opinion on landing this, see #13222 (comment))

@huonw (Member) commented May 27, 2014

Awesome!

I was thinking the make target would just be whatever the rules are for re-indexing, i.e. running rustc -Z save-analysis on each crate. This means that the rule naturally tracks compiler changes & library additions/removals (because the Makefiles have full knowledge of all crates) and that others can easily get the analysis if they wish.

(I personally have no particular use for the raw data, but I imagine some people might. shrug)

r=me assuming there's agreement for landing this.

@nrc (Member, Author) commented May 27, 2014

@huonw couldn't you just use RUSTFLAGS env var for that? For the DXR index run, I just do a normal make with RUSTFLAGS_STAGE2 += -Zsave-analysis

@huonw (Member) commented May 27, 2014

Oh, I guess so... seems a little bit of a hack, but I suppose it works fine.

@nrc (Member, Author) commented Jun 1, 2014

I don't know what this means: recipe for target `i686-pc-mingw32/stage1/bin/rustlib/i686-pc-mingw32/lib/stamp.rustc' failed

There are no errors I can see in the output, so I don't see why this is failing - is this just Windows build stupidity?

@huonw (Member) commented Jun 1, 2014

Weird. Let's just retry...

@klutzy (Contributor) commented Jun 1, 2014

(my win32 machine compiled this pull request well)

@alexcrichton (Member) commented:

An error like that normally means that the Windows bots segfaulted while building. If you push to try, you'll get a similar environment which should have the same results, and I can help you see what the stack trace is at the time of the segfault.

@alexcrichton (Member) commented:

It looks like #14604 has suffered a similar build problem, which I am currently investigating. It looks like it also requires rustc itself to be built without optimizations.

@alexcrichton (Member) commented:

I looked into the failure cause of #14604, and it appears to fail because of OOM. Perhaps this patch is triggering a large number of allocations?

@alexcrichton (Member) commented:

Note that I have un-gated the builder that this PR failed on (#14791), so a rebase will likely allow this to land.

Adds the option -Zsave-analysis which will dump the results of syntax and type checking into CSV files. These can be interpreted by tools such as DXR to provide semantic information about Rust programs for code search, cross-reference, etc.

Authored by Nick Cameron and Peter Elmers (@pelmers; including enums, type parameters/generics).
bors added a commit that referenced this pull request Jun 13, 2014
@bors merged commit 984e9af into rust-lang:master on Jun 13, 2014
@pelmers (Contributor) commented Jun 13, 2014

\o/ impressive work!

notriddle pushed a commit to notriddle/rust that referenced this pull request Sep 20, 2022
Remove redundant 'resolve_obligations_as_possible' call

Hi! I was looking for a "good first issue" and saw this one: rust-lang/rust-analyzer#7542. I like searching for performance improvements, so I wanted to try to find something useful there.

There are two tests in integrated_benchmarks.rs; I looked at 'integrated_highlighting_benchmark' (not the one discussed in the issue above).

Profile from that test looks like this:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 358.45ms
initial: 9.60s
change: 13.96µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  273ms - highlight
      143ms - infer:wait @ per_query_memory_usage
          143ms - infer_query
                0   - crate_def_map:wait (3165 calls)
                4ms - deref_by_trait (967 calls)
               96ms - resolve_obligations_as_possible (22106 calls)
                0   - trait_solve::wait (2068 calls)
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       36ms - classify_name (19 calls)
       19ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        4ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       45ms - infer:wait (37 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 275.23ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```
22106 calls to `resolve_obligations_as_possible` seem like the main issue there.

One thing I noticed (and fixed in this PR) is that `InferenceContext::resolve_ty_shallow` first calls `resolve_obligations_as_possible`, and then calls `InferenceTable::resolve_ty_shallow`. But `InferenceTable::resolve_ty_shallow` [inside](https://github.com/rust-lang/rust-analyzer/blob/2e9f1204ca01c3e20898d4a67c8b84899d394a88/crates/hir-ty/src/infer/unify.rs#L372) again calls `resolve_obligations_as_possible`.

`resolve_obligations_as_possible` contains a while loop that runs until it can't find any more helpful information. Calling the function a second time therefore does nothing, so one of the calls can be safely removed.
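
To make the call chain concrete, here is a minimal stand-alone sketch of the pattern described above; `Table` and `Context` are invented stand-ins for `InferenceTable` and `InferenceContext`, and the method bodies are elided.

```rust
// Simplified sketch of the redundancy described above (stand-in types, not
// rust-analyzer's real code): the outer method resolves obligations and then
// delegates to an inner method that immediately resolves them again.
struct Table;

impl Table {
    fn resolve_obligations_as_possible(&mut self) {
        // ... fixed-point loop over pending obligations ...
    }

    fn resolve_ty_shallow(&mut self) {
        // The inner method already starts by resolving obligations.
        self.resolve_obligations_as_possible();
        // ... shallow type resolution ...
    }
}

struct Context {
    table: Table,
}

impl Context {
    fn resolve_ty_shallow(&mut self) {
        // Redundant: this call is repeated inside `Table::resolve_ty_shallow`
        // below, and a second run right after the first finds nothing new.
        self.table.resolve_obligations_as_possible();
        self.table.resolve_ty_shallow();
    }
}

fn main() {
    let mut cx = Context { table: Table };
    cx.resolve_ty_shallow();
}
```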

`InferenceContext::resolve_ty_shallow` is actually quite a hot place, and after fixing it, the total number of `resolve_obligations_as_possible` calls in this test is reduced to 15516 (from 22106). The "after change" time also improves from ~270ms to ~240ms, which is not a huge win, but still something measurable.

Same profile after PR:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 339.86ms
initial: 9.28s
change: 10.69µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  236ms - highlight
      110ms - infer:wait @ per_query_memory_usage
          110ms - infer_query
                0   - crate_def_map:wait (3165 calls)
                4ms - deref_by_trait (967 calls)
               64ms - resolve_obligations_as_possible (15516 calls)
                0   - trait_solve::wait (2068 calls)
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       34ms - classify_name (19 calls)
       18ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        3ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       45ms - infer:wait (37 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 238.15ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```

The performance of this test could be further improved, but at the cost of making the code more complicated, so I wanted to check whether such a change is desirable before sending another PR.

`resolve_obligations_as_possible` is actually called a lot of times even when no new information has been provided. As I understand it, `resolve_obligations_as_possible` can only do something useful if some variables/values have been unified since the last check. We can store a boolean flag inside `InferenceTable` which indicates whether `try_unify` was called after the last `resolve_obligations_as_possible`. If it wasn't, we can safely skip calling `resolve_obligations_as_possible` again.
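
A minimal sketch of that "dirty flag" idea follows; this is not rust-analyzer's actual code, the field name `new_unifications` is invented for illustration, and the unification and obligation machinery is elided.

```rust
// Simplified sketch of the flag described above; not rust-analyzer's real code.
struct InferenceTable {
    /// Set by `try_unify`, cleared by `resolve_obligations_as_possible`.
    new_unifications: bool,
    // ... unification state and pending obligations elided ...
}

impl InferenceTable {
    fn try_unify(&mut self /* , lhs, rhs */) {
        // ... attempt unification ...
        self.new_unifications = true;
    }

    fn resolve_obligations_as_possible(&mut self) {
        if !self.new_unifications {
            // Nothing was unified since the last run, so the fixed-point
            // loop below cannot make progress; skip it entirely.
            return;
        }
        self.new_unifications = false;
        // ... loop over pending obligations until no further progress ...
    }
}

fn main() {
    let mut table = InferenceTable { new_unifications: false };
    table.resolve_obligations_as_possible(); // skipped: nothing new to do
    table.try_unify();
    table.resolve_obligations_as_possible(); // runs the resolution loop once
}
```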

I tested this change locally, and it reduces the number of `resolve_obligations_as_possible` calls to several thousand (it is not shown in the profile anymore, so I don't know the exact number), and the total time is reduced to ~180ms. Here is a generated profile:
```
$ RUN_SLOW_BENCHES=1 cargo test --release --package rust-analyzer --lib -- integrated_benchmarks::integrated_highlighting_benchmark --exact --nocapture
    Finished release [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/rust_analyzer-a80ca6bb8f877458)

running 1 test
workspace loading: 349.92ms
initial: 8.56s
change: 11.32µs
cpu profiling is disabled, uncomment `default = [ "cpu_profiler" ]` in Cargo.toml to enable.
  175ms - highlight
       21ms - Semantics::analyze_impl (18 calls)
        0   - SourceBinder::to_module_def (20 calls)
       33ms - classify_name (19 calls)
       17ms - classify_name_ref (308 calls)
        0   - crate_def_map:wait (461 calls)
        3ms - descend_into_macros (628 calls)
        0   - generic_params_query (4 calls)
        0   - impl_data_with_diagnostics_query (1 calls)
       97ms - infer:wait (38 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - source_file_to_def (1 calls)
        0   - trait_solve::wait (42 calls)
after change: 177.04ms
test integrated_benchmarks::integrated_highlighting_benchmark ... ok
```
Let me know if adding a new bool field seems like a reasonable tradeoff, so I can send a PR.
flip1995 pushed a commit to flip1995/rust that referenced this pull request Aug 8, 2024

Use a deterministic number of digits in rustc_tools_util commit hashes

Using `git rev-parse --short` in rustc_tools_util causes nondeterministic compilation of projects that use `setup_version_info!` and `get_version_info!` when built from the exact same source code and git commit. The number of digits printed by `--short` is sensitive to how many other branches and tags in the repository have been fetched so far, what other commits have been worked on in other branches, how recently you had run `git gc`, platform-specific variation in git's default configuration, and platform differences in the sequence of steps performed by the release pipeline. Someone can compile a tool from a particular commit, switch branches to work on a different commit (or simply do a git fetch), go back to the first commit and be unable to reproduce the binary that was built from it previously.

Currently, variation in short commit hashes causes Clippy version strings to be out of sync between different targets. On x86_64-unknown-linux-gnu:

```console
$ clippy-driver +1.80.0 --version
clippy 0.1.80 (0514789 2024-07-21)
```

Whereas on aarch64-apple-darwin:

```console
$ clippy-driver +1.80.0 --version
clippy 0.1.80 (0514789 2024-07-21)
```
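
As an aside, here is a minimal sketch of the general approach the commit title describes, not the actual rustc_tools_util code: take the full hash from `git rev-parse HEAD` and truncate it to a fixed width yourself, so the length never depends on local repository state. The helper name `short_commit_hash` and the width of 10 are assumptions for illustration.

```rust
use std::process::Command;

// Illustrative sketch only: obtain a commit hash with a fixed number of
// digits by truncating the full `git rev-parse HEAD` output, instead of
// relying on `git rev-parse --short`, whose length depends on repository
// state. `short_commit_hash` and the width 10 are hypothetical choices.
fn short_commit_hash(len: usize) -> Option<String> {
    let out = Command::new("git").args(["rev-parse", "HEAD"]).output().ok()?;
    if !out.status.success() {
        return None;
    }
    let full = String::from_utf8(out.stdout).ok()?;
    Some(full.trim().chars().take(len).collect())
}

fn main() {
    match short_commit_hash(10) {
        Some(hash) => println!("commit: {hash}"),
        None => println!("could not determine commit hash"),
    }
}
```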

---

changelog: none