Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Linking rewrite #7027

Merged
merged 30 commits into from
Feb 12, 2024
Merged

feat: Linking rewrite #7027

merged 30 commits into from
Feb 12, 2024

Conversation

klkvr
Copy link
Member

@klkvr klkvr commented Feb 6, 2024

Motivation

The way current linking implementation works is by linking each individual artifact in an isolated way. For each artifact post_link hook is getting called with the following data (PostLinkInput).

  • id of the linked artifact
  • contract containing linked contract artifact
  • known_contracts which seems to be just a mutable reference getting passed around
  • dependencies - vector of libraries that needs to be deployed for the given artifact

The dependencies vector contents is explained here:

/// A library may show up more than once, but it should *always* have the same nonce and address
/// with respect to the single `assert_dependencies` call. There should be no gaps in the nonce
/// otherwise, i.e. whenever a new dependency is encountered, the nonce should be a single
/// increment larger than the previous largest nonce.

The thing which highlights the downside of this approach is that for different artifact dependencies may include different libraries with different nonces. This is expected behavior but it results in verifiaction issues (#6506).

Solution

Considering that current approach does not really make much sense, I've decided to implement new linling algorithm to be less complex and linking just one target artifact. If we need to link more artifacts, it can just be invoked several times as it's done in tests right now.

I believe that one of the most sensitive places of the linker is conversion between ArtifactId and solc library format (src/Lib.sol:Lib) as it's related to paths. Even though we strip paths everywhere I think there might be some edge cases here and would like to do some testing in that direction.

Closes #6506
Closes #5014

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new linling algorithm to be less complex and linking just one target artifact. If we need to link more artifacts, it can just be invoked several times as it's done in tests right now.

this approach is so much better
looks very promising already

let (new_libraries, predeploy_libraries): (Vec<_>, Vec<_>) =
run_dependencies.into_iter().unzip();
pub fn link(
&self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need &self here or can this also be a standalone function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really need &self here, however I believe that highlevel_known_contracts and Libraries logic is specific to scripts so we can make it a standalone function but still in scope of the scripts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if we can give this a different name e.g link_script for clarity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Evalir you mean separate it from ScriptArgs and rename?

Copy link
Member

@Evalir Evalir Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klkvr personally don't mind either separating it or keeping it on ScriptArgs—just think the fn name can be more specific 😄

If we can make this a standalone fn outside ScriptArgs I think it would be nice, as it shows this is explictly a script-related helper.

Comment on lines 139 to 145
let mut new_libraries = Libraries { libs: BTreeMap::new() };
for (id, address) in libs_to_deploy.iter().chain(predeployed_libs.iter()) {
new_libraries
.libs
.entry(id.source.clone())
.or_default()
.insert(id.name.split('.').next().unwrap().to_owned(), address.to_string());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we need to deploy libraries, it might be nice to do it with create2 (if we don't already). Using create2 when a hardcoded create2 deployer is available on the chain seems preferable because then libs can be automatically reused across projects without needing to redeploy (see also: #4810)

Copy link
Member Author

@klkvr klkvr Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a nice idea! I was also thinking about it in scope of #6215 and #5375. I think it should be much easier to implement with less complex linker implementation, but imo it makes more sense as a separate PR as it'd be a new feature

Copy link
Member

@Evalir Evalir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, some comments

crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/multi_runner.rs Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
let (new_libraries, predeploy_libraries): (Vec<_>, Vec<_>) =
run_dependencies.into_iter().unzip();
pub fn link(
&self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if we can give this a different name e.g link_script for clarity?

mattsse pushed a commit to foundry-rs/compilers that referenced this pull request Feb 8, 2024
Adds helper method to strip path prefixes from `Libraries` object, I'll
need this for foundry-rs/foundry#7027
@klkvr klkvr force-pushed the klkvr/link-rewrite branch from 6cf6caa to f900581 Compare February 8, 2024 13:20
@klkvr
Copy link
Member Author

klkvr commented Feb 8, 2024

Linking is a bit more complex than it has to be because of the way we handle libraries.

Before passing sources into solc, we are stripping paths to start from project root (otherwise, keep global path) and once compiler finishes, we are joining all paths to start from the project root again.

The issue here is that linkReferences in contracts artifacts are never updated and still contain relative paths. Because of that, when matching libraries, we are operating on relative paths. Thus, any linking calling function should firstly ensure to strip paths for both user-provided libraries and contract artifacts

/// Links given artifact with either given library addresses or address computed from sender and
/// nonce.
///
/// Current linking implementation assumes that all link_references keys are matching the format in
/// which sources appear in [ArtifactId] keys.
///
/// You should make sure to strip paths via `with_stripped_file_prefixes` for both
/// `libraries` and `contracts` *before* invoking linker.
///
/// This function will always link target contract completely, however, in case of paths format
/// mismatch it may not recognize some of the predeployed libraries, leading to more deployments
/// than needed.

@klkvr
Copy link
Member Author

klkvr commented Feb 8, 2024

UPD: I've updated linker logic to accept root path and strip all paths while matching link_references to artifacts, that way we now only assume that libraries provided into linker have correct paths without remappings

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits,

test coverage looks good.

ptal @Evalir @DaniPopes

crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/link.rs Outdated Show resolved Hide resolved
crates/forge/src/multi_runner.rs Outdated Show resolved Hide resolved
@klkvr klkvr marked this pull request as ready for review February 8, 2024 17:49
@klkvr klkvr requested a review from DaniPopes as a code owner February 8, 2024 17:49
Copy link
Member

@Evalir Evalir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the code is looking great to me bar mattsse's nits. I think what we want is probably manual testing with big projects (Maple, spark protocol) that historically have had problems with our linker. I can gather a few of these closed issues so we can test.

I also think that, for next steps, we should consider extracting the core linking code into a new crate, and maybe exposing it through a struct Linker, instead of having multiple loose fns, some public, some private?

Cargo.toml Outdated Show resolved Hide resolved
crates/forge/bin/cmd/script/build.rs Show resolved Hide resolved
crates/forge/bin/cmd/script/build.rs Show resolved Hide resolved
@klkvr
Copy link
Member Author

klkvr commented Feb 8, 2024

I also think that, for next steps, we should consider extracting the core linking code into a new crate, and maybe exposing it through a struct Linker, instead of having multiple loose fns, some public, some private?

Yep, it makes sense, basically Linker needs just Project and ProjectCompileOutput to be initialized and can be made much clearner than current functions

@klkvr
Copy link
Member Author

klkvr commented Feb 9, 2024

@Evalir @mattsse so I've refactored the code to have linking logic in Linker struct, it also allowed to simplify linking logic in script a little bit.

I've tested it on Aave-like codebase with several external libraries and fould couple bugs with how we collect known_contracts for scripts and tests, all of those should now be fixed, but I'll test for some more time.

Regarding link_script_target fn: what is does is links script target via link_with_address_or_nonce and then collects highlevel_known_contracts - set of contracts with fully-linked non-empty bytecode which is used for verification and traces decoding. I think that it's pretty specific and it makes sense to leave it in script for now as it has become much shorter after refactoring and should be readable

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last suggestions

lgtm

pending @DaniPopes

any additional issues this might close?

Comment on lines 318 to 320
.deployed_bytecode
.and_then(|d_bcode| d_bcode.bytecode)
.and_then(|bcode| bcode.object.into_bytes())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified with .get_deployed_bytecode_bytes().cloned()

we could also clean this up on foundry-compilers so that this function no longer returns Cow, not useful anyway

Comment on lines 301 to 303
let Some(bytecode) = linked_contract
.bytecode
.and_then(|b| b.object.into_bytes())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also linked_contract.get_bytecode_bytes()

@klkvr
Copy link
Member Author

klkvr commented Feb 9, 2024

@mattsse this should close #6506, not sure if it's a duplicate of something

@klkvr
Copy link
Member Author

klkvr commented Feb 9, 2024

Also closes #5014, added test for it

@mattsse mattsse changed the title [WIP] Linking rewrite feat: Linking rewrite Feb 9, 2024
Copy link
Member

@onbjerg onbjerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks better, although I wonder if we should get rid of eyre in the linker. imo eyre is more for CLI stuff, this should be more strongly typed. wdyt?

crates/forge/src/link.rs Outdated Show resolved Hide resolved
// user-provided paths to be able to match them correctly.
let mut libraries = libraries.with_stripped_file_prefixes(self.root.as_path());

let mut needed_libraries = BTreeSet::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was wondering if this was safe, because you can have a library that itself is unlinked, so it's not deployable until all those unlinked references are resolved.

However at second thought, this is safe, because we know the addresses in such a scenario ahead of time, as long as it is deployed in the correct order.

I feel like we should add a note about that, wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it's safe because libraries doesn't use its dependencies during deployment

there is already a note about the deployment order, do you think it should be extended?

/// The order in which they appear in the vector is the order in which they should be deployed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it's prob fine

@Evalir Evalir self-requested a review February 9, 2024 17:24
Co-authored-by: Bjerg <onbjerg@users.noreply.github.com>
@klkvr
Copy link
Member Author

klkvr commented Feb 9, 2024

imo eyre is more for CLI stuff, this should be more strongly typed

currently linker errors are more like panics and getting a error there probably means that the code invoking it didn't sanitize inputs normally, so it's not something we would need to catch

but we might be adding more features to linker (like that one) and for them we might have broader errors spectrum, so yeah it probably makes sense to add errors enum

Copy link
Member

@DaniPopes DaniPopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lg tm

crates/forge/src/multi_runner.rs Outdated Show resolved Hide resolved
@mattsse
Copy link
Member

mattsse commented Feb 12, 2024

only rustfmt, missing then send it @klkvr 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants