Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ideas on getting information about borrow stacks during execution #2322

Merged
merged 4 commits into from
Oct 19, 2022

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Jul 4, 2022

From time to time people ask what some borrow stack looks like in some code. I just know that I am terrible at doing Stacked Borrows by hand, so I always toss together something like this.

I know that Miri has logging, but I've never found it particularly useful because there's just too much output. Also I personally don't think about exactly what the state of a borrow stack is, but this seems to be something that newcomers to Stacked Borrows always want.

Update: This has been sitting as S-waiting-on-author for a long time. I bring it out from time to time to explain Stacked Borrows to people, and just now @JakobDegen said

Can we please merge that btw? It's such a valuable teaching tool
Interfaces can be fixed later

I'm inclined to trust Jake's judgement here.

@RalfJung RalfJung added S-blocked-on-rust Status: Blocked on landing a Rust PR and removed S-blocked-on-rust Status: Blocked on landing a Rust PR labels Jul 4, 2022
Comment on lines +372 to +432
"miri_print_stacks" => {
let [id] = this.check_shim(abi, Abi::Rust, link_name, args)?;
let id = this.read_scalar(id)?.to_u64()?;
if let Some(id) = std::num::NonZeroU64::new(id) {
this.print_stacks(AllocId(id))?;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this separate (and on an alloc id) so we can avoid adding more stacked borrows to things when printing the stack?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. As it is, you usually end up perturbing the stacked borrows runtime a bit to pass the pointer to miri_get_alloc_id, but at least you don't perturb it on every call.

Some people have suggested that I invent some magic Miri macro that can avoid this instead of extern functions, but I have no idea how to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no no, I like it. Should document that though, so people don't just miri_print_stacks(miri_get_alloc_id(&x.y)) everywhere

Copy link
Member Author

@saethlin saethlin Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but where?

This whole workflow where end users declare extern "Rust" functions is strange to say the least. Would it make sense to have something on crates.io that at least provides safe wrappers around these? Having a crate for these would definitely ease documenting them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have something on crates.io that at least provides safe wrappers around these? Having a crate for these would definitely ease documenting them.

yea, I was wondering about that, too. Such a crate is complicated to tie to a miri version, so I'm not sure publishing it will make things easier for people if we ever change anything in that crate. But it may be better than just having documentation in markdown for these magic extern functions. idk, maybe start out with docs in this repo? It's not like such a crate would be on crates.io or godbolt, so people may have to resort to locally using extern anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but where?

The README has a section on these magic extern functions; they need to at least be documented there.

And then we should have a chapter in the Miri Book about how to debug Stacked Borrows things. Which book, you ask? Yes that is indeed the point.^^

yea, I was wondering about that, too. Such a crate is complicated to tie to a miri version, so I'm not sure publishing it will make things easier for people if we ever change anything in that crate.

Yes we should have such a create! This interface does not change very often, and when it does we anyway need to do some clever updating thing like we did when changing the backtrace API. So we can add a new API to Miri, ship that, then update the crate, ship that, then maybe remove the old thing from Miri.

Which reminds me, all these magic functions that Miri exposes should have some kind of flag/version number parameter, to make them extensible in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation is still missing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I missed this...

Do we want to add a version mangle to these new functions?

The functions in the README aren't in alphabetical order 😅 so I guess I'll just stick the new ones at the bottom?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add the version to everything in one go when we figure out how we want to distribute these functions to users

@RalfJung
Copy link
Member

RalfJung commented Jul 6, 2022

Could you show some output of this? I think it will be mostly meaningless, since those tag IDs are just numbers...

So, while I would love to make SB more "visible", I am not convinced just dumping the internal state is very useful. It would be amazing if the stack could be printed in terms of variable names (the variables that hold those pointers) rather than IDs, but that seems nearly impossible to do. Also Priroda could be a great tool for exploring how borrow stacks evolve.

@saethlin
Copy link
Member Author

saethlin commented Jul 6, 2022

Someone had a specific question about what the borrow stacks in this program looked like:

let mut x = 0i32;
let uniq = &mut x;
let rbrw1 = &mut *uniq;
let rbrw2 = &mut *rbrw1;
println!("{}", uniq);

So I coded up this:

extern "Rust" {
    fn miri_get_alloc_id(_: *const ()) -> u64;
    fn miri_print_stacks(_: u64);
}

fn main() {
    let mut x = 0i32;
    let id = unsafe { miri_get_alloc_id(&x as *const i32 as *const ()) };
    unsafe { miri_print_stacks(id) };
    let uniq = &mut x;
    unsafe { miri_print_stacks(id) };
    let rbrw1 = &mut *uniq;
    unsafe { miri_print_stacks(id) };
    let rbrw2 = &mut *rbrw1;
    unsafe { miri_print_stacks(id) };
    println!("{}", uniq);
    unsafe { miri_print_stacks(id) };
}

Which prints this:

0..4: [Unique for <3231>] [SharedReadOnly for <3232>] [SharedReadOnly for <3233>]
0..4: [Unique for <3231>] [Unique for <3234>]
0..4: [Unique for <3231>] [Unique for <3234>] [Unique for <3235>]
0..4: [Unique for <3231>] [Unique for <3234>] [Unique for <3235>] [Unique for <3236>]
0
0..4: [Unique for <3231>] [Unique for <3234>] [Disabled for <3235>] [Disabled for <3236>] [SharedReadOnly for <4360>] [SharedReadOnly for <4361>] [SharedReadOnly for <4365> (call 1235)] [SharedReadOnly for <4367>] [SharedReadOnly for <4370> (call 1236)]

So yes I think the numbers are mostly meaningless, but if the problem with visual noise is the tag numbers, we could just print the permissions. Or normalize to whatever the next tag ID is upon entering main.

I don't know about understanding what the tags are, but previously I had implemented another variation on this which did printing upon retag, not when the user asks for it. The advantage to that is that I could print the local span along with type info from the retag. Its looked like this:

        let x = &mut [0, 1, 2];
        miri_begin_dump();
        x.len();
        &mut x[0];
        miri_end_dump();
note: currently here
  --> src/main.rs:12:9
   |
12 |         x.len();
   |         ^^^^^^^

[i32; 3]: 
    0..12: [[Unique for <2392>], [Unique for <2393>], [SharedReadOnly for <2394>]]
[i32]: 
    0..12: [[Unique for <2392>], [Unique for <2393>], [SharedReadOnly for <2394>], [SharedReadOnly for <2395>]]
usize: 
    0..8: [[Unique for <2396>], [Unique for <2397> (call 770)]]
[i32]: 
    0..12: [[Unique for <2392>], [Unique for <2393>], [SharedReadOnly for <2394>], [SharedReadOnly for <2395>], [SharedReadOnly for <2398> (call 770)]]
[i32]: 
    0..12: [[Unique for <2392>], [Unique for <2393>], [SharedReadOnly for <2394>], [SharedReadOnly for <2395>], [SharedReadOnly for <2398> (call 770)], [SharedReadOnly for <untagged>]]
note: currently here
  --> src/main.rs:13:9
   |
13 |         &mut x[0];
   |         ^^^^^^^^^

i32: 
    0..4: [[Unique for <2392>], [Unique for <2393>], [Unique for <2400>]]
    4..12: [[Unique for <2392>], [Unique for <2393>], [SharedReadOnly for <2394>], [SharedReadOnly for <2395>], [SharedReadOnly for <2398> (call 770)], [SharedReadOnly for <untagged>]]

@saethlin
Copy link
Member Author

saethlin commented Jul 6, 2022

I think there are two use cases for such a tool:

  1. People learning about Stacked Borrows, who are trying to do it in their head for simple programs and want to see if they're doing it correctly.
  2. People looking at a program which Miri does not reject and they feel like Miri should reject. When Miri rejects a program, people tend to figure things out pretty quickly from the diagnostics, but when you don't get an error it's hard to understand much of anything.

@saethlin
Copy link
Member Author

saethlin commented Jul 6, 2022

Also while I agree that the numbers aren't very good, people type out messages into chat with the syntax above already, numbers and all. Even in the absence of a tool like this, people are using its nomenclature and syntax, probably because you have elsewhere.

Also this is a draft because I have many of the same reservations that you do. I don't want to sound like I'm arguing, perhaps the behavior of newcomers defies both our expectations. On the one hand I really don't want people to be doing Stacked Borrows in their head in other to check newcomers doing Stacked Borrows in their heads when we could have Miri just tell them what the answer is, if newcomers want to see if they have it right. But also I would like people to not type or messages in chat with the above syntax for a borrow stack. Surely there is a better way to learn.

@RalfJung
Copy link
Member

RalfJung commented Jul 6, 2022

Ah, that's actually not as bad as I thought. You can basically tell which ptr has which ID by seeing when they are added.

I have been doing Stacked Borrows in my head to explain why something doesn't throw an error, and I hated it, so I'm all for ways of making that simpler.^^

@RalfJung RalfJung added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Jul 11, 2022
@bors
Copy link
Contributor

bors commented Jul 14, 2022

☔ The latest upstream changes (presumably #2369) made this pull request unmergeable. Please resolve the merge conflicts.

@T-Dark0
Copy link

T-Dark0 commented Jul 22, 2022

Hmm, I was just linked to this PR after a discussion in the community discord, and I have some questions about the formatting

  • Why is every item in the borrow stacks enclosed in its own pair of square parens? In other words, why is it [Unique for <3231>] [SharedReadOnly for <3232>] [SharedReadOnly for <3233>] and not, say, [Unique for <3231>, SharedReadOnly for <3232>, SharedReadOnly for <3233>]? When I first saw the format, I thought all of those items were single-item stacks
  • I feel like the syntax could be made less verbose by using something like SharedReadOnly(3232) or SharedReadOnly: 3232. This is probably outside the scope of this PR, but the for and the angle brackets are quite long.
  • It would definitely be nice if the numeric tags could be replaced by the expression that created them. That, or if at least there was a "glossary" of tags to expressions
  • The fact calling miri_get_alloc_id pushes two SROs is understandable, but annoying. The idea of wrapping miri's magic symbols into a crate has been suggested earlier. If this is done, could it "synchronize" with miri_get_alloc_id's implementation, so that the method has the side effect of popping two items from the relevant borrow stacks? (the idea being that the crate would call it by pushing two items for the &foo as *const _ as *const (), which would then be immediately removed)

@RalfJung
Copy link
Member

I'm open to adjusting how items are printed (this was entirely ad-hoc).

However, I am not happy about the idea of having some magic function mess with the structure of the borrow stacks, by just popping the topmost two items... that sounds rather fragile.

@saethlin saethlin force-pushed the stack-inspection-tools branch 2 times, most recently from 30c1b56 to 2f05c46 Compare July 22, 2022 21:39
@saethlin
Copy link
Member Author

saethlin commented Aug 7, 2022

@Darksonn is currently asking me to do SB in my head so here you go

@oli-obk
Copy link
Contributor

oli-obk commented Aug 29, 2022

idk how useful this is considering we'd need to heavily filter ids out of it, but could you add a test to see how problematic it is to test these functions?

@saethlin
Copy link
Member Author

Hm it occurs to me that this might look strange in the presence of a tag GC... if nothing else we should very clearly advise people to turn the GC off.

@saethlin saethlin added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Oct 14, 2022
@saethlin saethlin marked this pull request as ready for review October 14, 2022 23:57
@JakobDegen
Copy link
Contributor

Discussed on discord briefly.

Fwiw I would still like to see this merged based on the fact that it works right now and it's often really useful. If it has to go away later because it's incompatible with something that's fine.

That being said, this is probably not the right long term solution. Ben briefly brought up the idea of a debugger (maybe with a vaguely gdb-like interface), which sounds like it has much fewer problems and much more opportunities. It's a lot of work though, and no idea if anyone has the time. All the more reasons to start up T-opsem and a WG-ucg-teachability to get people interested...

@RalfJung
Copy link
Member

added S-waiting-on-review and removed S-waiting-on-author

marked this pull request as ready for review 6 hours ago

These things don't generate notifications, so please don't expect me to notice them -- please always write a comment as well. (Or just use rustbot ready.)

Also CI is not happy.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 15, 2022

a debugger (maybe with a vaguely gdb-like interface)

Maybe we need to revive priroda. If we can compile it to wasm we can make it a playground with no server needed

@saethlin
Copy link
Member Author

please don't expect me to notice them

I don't :)
Perhaps I'm too cautious about sending you notifications. For the most part I figure people will get to my PRs/issues whenever they feel like it, and if I want attention I'll mention people.

CI isn't happy because I didn't heed Oli's warning 😂

I agree priroda might be a good thing to get rolling again. Would it be a lot of work?

@RalfJung
Copy link
Member

RalfJung commented Oct 16, 2022

For the most part I figure people will get to my PRs/issues whenever they feel like it, and if I want attention I'll mention people.

I won't usually get to any PRs that don't have a pending notification in my inbox. The notifications are my 'work queue', and notifications are basically the only process to have things enter that queue. So it's either send some kind of notification that the PR is ready, or just never have me even notice it except by coincidence.

You can say in the notification that this is low-priority and then I will treat it like that. :)

@oli-obk
Copy link
Contributor

oli-obk commented Oct 18, 2022

I agree priroda might be a good thing to get rolling again. Would it be a lot of work?

I haven't checked, but I think a clean rewrite would be best... but first I want to try running miri in wasm. The requirement for a server is a real downside, along with it breaking all the time. Maybe if I come up with a clean new version we can put it into the miri repo to keep it from breaking. I'll open an issue about it so we don't have to stay off topic here

@oli-obk
Copy link
Contributor

oli-obk commented Oct 19, 2022

@bors r+

@bors
Copy link
Contributor

bors commented Oct 19, 2022

📌 Commit 3d83f53 has been approved by oli-obk

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Oct 19, 2022

⌛ Testing commit 3d83f53 with merge da29c7a...

@bors
Copy link
Contributor

bors commented Oct 19, 2022

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing da29c7a to master...

@bors bors merged commit da29c7a into rust-lang:master Oct 19, 2022
@bors bors mentioned this pull request Oct 19, 2022
/// borrow stacks in an allocation. The format of what this emits is unstable and may change at any time.
/// In particular, users should be aware that Miri will periodically attempt to garbage collect the
/// contents of all stacks. Callers of this function may wish to pass `-Zmiri-tag-gc=0` to disable the GC.
fn miri_print_stacks(alloc_id: u64);
Copy link
Member

@RalfJung RalfJung Oct 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rename this function. 'stacks' sounds like it is referring to the call stacks.

We should also make it clear that this is extremely unstable -- not only the format can change, the entire function can be removed from Miri any time or have its signature changed.


/// Miri-provided extern function to get the internal unique identifier for the allocation that a pointer
/// points to. This is only useful as an input to `miri_print_stacks`, and it is a separate call because
/// getting a pointer to an allocation at runtime can change the borrow stacks in the allocation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation should describe what happens when the ptr does not have an AllocId. provenance could be None or Wildcard, what happens then?

@@ -0,0 +1,5 @@
0..1: [ SharedReadWrite<TAG> ]
Copy link
Member

@RalfJung RalfJung Oct 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a comment that left is the bottom of the stack? Hm, but getting this each time is annoying... but the miri_print_stacks docs should say that, at least.

Also please add a test that involves an unknown bottom, to see how that prints.

bors added a commit that referenced this pull request Oct 26, 2022
Improve miri_print_borrow_stacks

Per post-merge review on #2322

* `miri_print_stacks` renamed to `miri_print_borrow_stacks`
* A bit more details in docs, clarified how unstable these functions are meant to be
* Print an `unknown_bottom` if one exists

Open question: Currently `miri_get_alloc_id` gets the expected `AllocId` for `Wildcard` pointers, but for pointers with no provenance, the function reports UB and halts the interpreter. That's definitely wrong. But what _should_ we do? Is it reasonable to check if the pointer has `None` provenance and try to get an `AllocId` for its address? That still leaves us with a failure path, which in this case might be best-handled as an ICE? I'm just not sure that changing the return type of `miri_get_alloc_id` to `Option` is a win because it complicates all normal uses of this.
RalfJung pushed a commit to RalfJung/rust that referenced this pull request Oct 26, 2022
…fJung

Improve miri_print_borrow_stacks

Per post-merge review on rust-lang/miri#2322

* `miri_print_stacks` renamed to `miri_print_borrow_stacks`
* A bit more details in docs, clarified how unstable these functions are meant to be
* Print an `unknown_bottom` if one exists

Open question: Currently `miri_get_alloc_id` gets the expected `AllocId` for `Wildcard` pointers, but for pointers with no provenance, the function reports UB and halts the interpreter. That's definitely wrong. But what _should_ we do? Is it reasonable to check if the pointer has `None` provenance and try to get an `AllocId` for its address? That still leaves us with a failure path, which in this case might be best-handled as an ICE? I'm just not sure that changing the return type of `miri_get_alloc_id` to `Option` is a win because it complicates all normal uses of this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Waiting for a review to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants