Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EXPERIMENT] Ownership analysis [DO NOT MERGE] #124756

Closed
wants to merge 11 commits into from

Conversation

xFrednet
Copy link
Member

@xFrednet xFrednet commented May 5, 2024

TL;DR: I want to collect data about ownership in Rust's ecosystem. I modified Clippy to extract things from MIR. This PR is intended for a crater run, but should definitely not be merged.


Roses are red,
Violets are blue,
Can I borrow,
A pattern from you?

So, what is this project all about? Well, last year I was looking for a topic for my master's thesis. I had the incredible luck that the programming languages group at my university is not only a collection of incredibly cool people but also has some interest in Rust. It just so happens that @amandasystems is part of the group.

We then spent some time looking for a thesis project in the domain of Rust's memory ownership model, and finally settled on the idea to analyze how the memory ownership model is used in practice.

For Polonius, it could be interesting to know how many bodies contain borrows and if the changes from the NLL RFC are actually used, etc.

Implementation

Disclaimer, this PR contains some of the most convoluted and spaghettified code I've ever written. Do not use this as an example for anything! I'm pretty sure that the insufficiencies of this PR single-handedly increases the compile time by 100%. It's honestly a miracle that the whole thing doesn't ICE on every function.

So at the beginning, I decided to use MIR as that is the IR that the borrow check uses, and it contains all the borrows and moves which might not be explicit in the HIR. While all nice and good it turns out that MIR also gets writ of information which I would have wanted for my analysis.

To be clear, MIR is an excellent tool for what it was designed to do, but its loss of information has been a struggle for this project specifically.

I would like to collect way more metadata, more complex patterns, and data how named references are used, but the deadline for my thesis is coming up. We decided to limit the scope to named owned variables and see how the analysis goes. If I find some interesting things there is potential to extend this work.

Run

I decided to implement the analysis on Clippy for the shorter compile times and familiar codebase. This branch disables all other lints by default to safe on some runtime. The crater run with Clippy from master might also expose some ICEs which could be interesting for us.

The results of this will be aggregated and published as part of my thesis. The time imposed limitations of this implementation probably mean that the results have to be taken with a metric ton of salt, but they should be sufficient to pass my thesis and know if more work in this area would be interesting.

I don't know if there is much more too to say. Let's hope:

  • The CI passes
  • This run goes smoothly
  • Results in some interesting data
  • Teaches me how to implement the whole thing better the next time.

r? @ghost

Alexendoo and others added 9 commits May 1, 2024 11:35
`assigning_clones`: add empty line to doc

changelog: none

This PR adds, for consistency reasons, an empty line to the example in the doc of the `assigning_clones` lint.
…1995

Fix `FormatArgs` storage when `-Zthreads` > 1

Fixes rust-lang#11886

The initial way I thought of was a little gross so I never opened a PR for it, I thought of a nicer way today that no longer involves any `thread_local`s or `static`s

`rustc_data_strucutres::sync::{Lrc, OnceLock}` implement `DynSend` + `DynSync` so we can pass them to the lint passes that need the storage

changelog: none

r? `@flip1995`
…e-init, r=dswij

Don't lint assigning_clones on nested late init locals

Fixes rust-lang#12741

changelog: none
More testing or something

Duck

Checking more bbs

Checking more bbs

Some progress IIRC

Handle more teminators

Diving into RValue evaluation

I have no idea what I'm doing

It doesn't crash I guess?

Finding the first pattern

Refactoring: Extract pattern from automata to simplify states

Remove old value usage based pattern detection

Refactoring: Better Automata structure

More automator nodes :D

More states and small progress

Detecting conditional assignments

Event posting refactoring

Adding test output and testing temp borrows

Working on field borrows and cleanup :D

Different errors yay

Use `mir::Place` for events

More patterns and more states, just need some pats :3

Getting the first fricking region

Collect dependent types for return tys

Somehow handle deref calls 😭

Cleanup AnonRef state thing by using an info struct wrapper

General Cleanup

Working the return state machine

Testing borrows in HIR format...

Working on something I guess

Having a direction to run against walls

Control flow information :D

CFG with ranges..

CFG: resonable downgrade

CFG: Start

Caching return Relations

I love refactoring in Rust :D

Detecting first returns

Handling more returns

Tracking assignments better

Collecting local information

Refactor metadata collection for MIR

The first return patterns

Cleanup console output with envs

Fix `LocalInfo` for arguments

Some refactorings again

Check unused for owned values

Complaining about MIR again

Why is there a no-op Drop????

Determine a nice traversal order

Probably progress??

Detecting manual drops and moves into functions

Fix ICE for item names

Remove old automata based implementation

Dogfood my code (52 Errors...)

Detecting temp borrows

Abort on feature use

Handling assignments better

Unify arguments and variables

Remove RC from ownership patterns

Better anon ref tracking

Stash that thing

Better function call relation calculation

Track relations between mut inputs as well :D

Correct `has_projection()` usage

Detecting some patterns in function params

Detecting two phase borrows

Owned `StateInfo` refactoring

Update todos

Generalize visitors for analyzers

Tracking all assignments

Dataflow analysis

Dataflow analysis with `&mut` loans

Dataflow refactoring, this only considers locals

Owned: Detect `ModMutShadowUnmut`

Reorganize UI tests

Include more type information in `VarInfo`

Better track return types

Refactorings for borrow tracking

Allow nested two phase borrows

Identify named borrows for owned values better

Correct owned value stat tracking

Fill body pats based on stats

Pause `Return` pattern for owned values

Modeling the state as a list

Detecting the first patterns with state merging

Detecting conditional moves and drops

Mark more tests as passing

Better detect overwrites for owned values

Improve Drop impl info

Remove `Unused` detection due to inconsistency

Update docs and move test to pass

Refactored `state.borrows`

Handle double refs correctly

Dealing with ref-refs and renaming temp to arg borows

Add `Moved` as a base pattern

Handle part moves

Better separation between borrow and part borrows

Track possibly missed relations

Closures

Add simple closure ty

Snapshot

Ctor things and stuff

Remove lint level gate

Avoiding `rand` crashes 🎉

Aggregate stats for bodies

Jsonfy analysis output

Update todos

Improving traversal construction and running regex :D

Type cleanupg

Feeding the dog
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 5, 2024
@rust-log-analyzer

This comment has been minimized.

@xFrednet
Copy link
Member Author

xFrednet commented May 5, 2024

It's green :D

Now let's see bors destroy all my hopes and dreams. /s JK love you bors

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request May 5, 2024
[EXPERIMENT] Ownership analysis [DO NOT MERGE]

TL;DR: I want to collect data about ownership in Rust's ecosystem. I modified Clippy to extract things from MIR. This PR is intended for a crater run, but should definitely not be merged.

---

*Roses are red,*
*Violets are blue,*
*Can I borrow,*
*A pattern from you?*

So, what is this project all about? Well, last year I was looking for a topic for my master's thesis. I had the incredible luck that the programming languages group at my university is not only a collection of incredibly cool people but also has some interest in Rust. It just so happens that `@amandasystems` is part of the group.

We then spent some time looking for a thesis project in the domain of Rust's memory ownership model, and finally settled on the idea to analyze how the memory ownership model is used in practice.

For Polonius, it could be interesting to know how many bodies contain borrows and if the changes from the NLL RFC are actually used, etc.

### Implementation

Disclaimer, this PR contains some of the most convoluted and spaghettified code I've ever written. Do not use this as an example for anything! I'm pretty sure that the insufficiencies of this PR single-handedly increases the compile time by 100%. It's honestly a miracle that the whole thing doesn't ICE on every function.

So at the beginning, I decided to use MIR as that is the IR that the borrow check uses, and it contains all the borrows and moves which might not be explicit in the HIR. While all nice and good it turns out that MIR also gets writ of information which I would have wanted for my analysis.

To be clear, MIR is an excellent tool for what it was designed to do, but its loss of information has been a struggle for this project specifically.

I would like to collect way more metadata, more complex patterns, and data how named references are used, but the deadline for my thesis is coming up. We decided to limit the scope to named owned variables and see how the analysis goes. If I find some interesting things there is potential to extend this work.

### Run

I decided to implement the analysis on Clippy for the shorter compile times and familiar codebase. This branch disables all other lints by default to safe on some runtime. The crater run with Clippy from master might also expose some ICEs which could be interesting for us.

The results of this will be aggregated and published as part of my thesis. The time imposed limitations of this implementation probably mean that the results have to be taken with a metric ton of salt, but they should be sufficient to pass my thesis and know if more work in this area would be interesting.

I don't know if there is much more too to say. Let's hope:
* The CI passes
* This run goes smoothly
* Results in some interesting data
* Teaches me how to implement the whole thing better the next time.

---

r? `@ghost`
@bors
Copy link
Contributor

bors commented May 5, 2024

⌛ Trying commit 813c97a with merge 541a4da...

@bors
Copy link
Contributor

bors commented May 5, 2024

☀️ Try build successful - checks-actions
Build commit: 541a4da (541a4da3b58309ad64ca0d6b33aabfba94b445bb)

@Mark-Simulacrum
Copy link
Member

@craterbot run mode=clippy

@craterbot
Copy link
Collaborator

👌 Experiment pr-124756 created and queued.
🤖 Automatically detected try build 541a4da
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 5, 2024
@Mark-Simulacrum
Copy link
Member

@craterbot abort

Need to add clippy to try CI.

@craterbot
Copy link
Collaborator

🗑️ Experiment pr-124756 deleted!

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels May 5, 2024
@xFrednet
Copy link
Member Author

xFrednet commented May 5, 2024

@bors try

@bors
Copy link
Contributor

bors commented May 5, 2024

⌛ Trying commit 4d7e11d with merge a353969...

bors added a commit to rust-lang-ci/rust that referenced this pull request May 5, 2024
[EXPERIMENT] Ownership analysis [DO NOT MERGE]

TL;DR: I want to collect data about ownership in Rust's ecosystem. I modified Clippy to extract things from MIR. This PR is intended for a crater run, but should definitely not be merged.

---

*Roses are red,*
*Violets are blue,*
*Can I borrow,*
*A pattern from you?*

So, what is this project all about? Well, last year I was looking for a topic for my master's thesis. I had the incredible luck that the programming languages group at my university is not only a collection of incredibly cool people but also has some interest in Rust. It just so happens that `@amandasystems` is part of the group.

We then spent some time looking for a thesis project in the domain of Rust's memory ownership model, and finally settled on the idea to analyze how the memory ownership model is used in practice.

For Polonius, it could be interesting to know how many bodies contain borrows and if the changes from the NLL RFC are actually used, etc.

### Implementation

Disclaimer, this PR contains some of the most convoluted and spaghettified code I've ever written. Do not use this as an example for anything! I'm pretty sure that the insufficiencies of this PR single-handedly increases the compile time by 100%. It's honestly a miracle that the whole thing doesn't ICE on every function.

So at the beginning, I decided to use MIR as that is the IR that the borrow check uses, and it contains all the borrows and moves which might not be explicit in the HIR. While all nice and good it turns out that MIR also gets writ of information which I would have wanted for my analysis.

To be clear, MIR is an excellent tool for what it was designed to do, but its loss of information has been a struggle for this project specifically.

I would like to collect way more metadata, more complex patterns, and data how named references are used, but the deadline for my thesis is coming up. We decided to limit the scope to named owned variables and see how the analysis goes. If I find some interesting things there is potential to extend this work.

### Run

I decided to implement the analysis on Clippy for the shorter compile times and familiar codebase. This branch disables all other lints by default to safe on some runtime. The crater run with Clippy from master might also expose some ICEs which could be interesting for us.

The results of this will be aggregated and published as part of my thesis. The time imposed limitations of this implementation probably mean that the results have to be taken with a metric ton of salt, but they should be sufficient to pass my thesis and know if more work in this area would be interesting.

I don't know if there is much more too to say. Let's hope:
* The CI passes
* This run goes smoothly
* Results in some interesting data
* Teaches me how to implement the whole thing better the next time.

---

r? `@ghost`
@bors
Copy link
Contributor

bors commented May 5, 2024

☀️ Try build successful - checks-actions
Build commit: a353969 (a353969eafa00eade807d5bd90455d3bc3e4a489)

@Mark-Simulacrum
Copy link
Member

@craterbot run mode=clippy

@craterbot
Copy link
Collaborator

👌 Experiment pr-124756 created and queued.
🤖 Automatically detected try build a353969
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 5, 2024
@Mark-Simulacrum
Copy link
Member

@craterbot p=1

@craterbot
Copy link
Collaborator

📝 Configuration of the pr-124756 experiment changed.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@xFrednet
Copy link
Member Author

xFrednet commented May 12, 2024

Thank you @Mark-Simulacrum, now it will run before the beta-1.79-2 test, should be increase the prio of that one to make sure that it runs first or is it not time critical?

@craterbot
Copy link
Collaborator

🚧 Experiment pr-124756 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🎉 Experiment pr-124756 is completed!
📊 30230 regressed and 17032 fixed (446250 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels May 14, 2024
@xFrednet
Copy link
Member Author

I'm closing this since the experiment is done running. Thank you for kicking off crater!

Now I'll need to take some time to look at the data. :)

@xFrednet xFrednet closed this May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants