Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move lock: in-place dependency graph updates #16788

Merged
merged 2 commits into from
Mar 21, 2024
Merged

Conversation

rvantonder
Copy link
Contributor

@rvantonder rvantonder commented Mar 21, 2024

Description

The behavior of write_to_lock in dependency_graph.rs is to create a completely new Move.lock each time dependencies change. It means that any orthogonal information in the Move.lock gets wiped / reset as well (toolchain-info, and up-and-coming address management). There are scenarios where we can repopulate this information, like we've done up to now (like toolchain-info). But it's untenable going forward and difficult to reason about. For example, with automated address management, we must preserve published environment information across CLI invocations in the lock file, we can't recreate that per build invocation. So: This PR is meant to be semantics preserving modulo formatting, but with a separation of concerns in the Move.lock-update-API. In follow up PRs this separation will be actually used.

This PR makes it so that write_to_lock in dependency graph code only populates the Move.lock with the relevant state it ought to, leaving everything else in the lock file alone. This is in contrast to the current "wipe everything, populate dependencies, and expect orthogonal information to be populated afterwards on every invocation". This applies to the following TOML values in Move.lock:

  • manifest_digest
  • deps_digest
  • dependencies (for root package)
  • dev-dependencies (for root package)
  • [[move.packages]] array of table

Because the format of the above terms have been hand-rolled, it is really (really) difficult to preserve the exact formatting.

I was able to preserve the formatting of [[move.packages]] by essentially letting us write that term as we did before, but to a string, instead of the lock file. That string is then parsed into a toml_edit value (which preserves the formatting) and persisted in the Move.lock.

The part that is just too much effort to try and preserve is the \n newline before dependencies and dev-dependencies in the [move] table section. Unlike [[move.packages]], which is a complete TOML term that we can parse from a string and preserve formatting, the \n is essentially not part of the dependencies term--we can't capture it like that. We can capture it by parsing in the whole [move] section, but this will mean merging / removing version and any other things that may be added to that section in future when we merge it with an existing lock file. Not worth it.

So: the output of the Move.lock will change here and not have those leading \ns. In general I think it is much better to stick with the automated output of the toml API than handrolling the formatting.

Test Plan

Updated tests. Lock file re-generation may remove empty lines in Move.lock and is a user-visible change, though not semantically meaningful.


If your changes are not user-facing and do not break anything, you can skip the following section. Otherwise, please briefly describe what has changed under the Release Notes section.

Type of Change (Check all that apply)

  • protocol change
  • user-visible impact
  • breaking change for a client SDKs
  • breaking change for FNs (FN binary must upgrade)
  • breaking change for validators or node operators (must upgrade binaries)
  • breaking change for on-chain data layout
  • necessitate either a data wipe or data migration

Release notes

The write_to_lock process has been adjusted to only update relevant parts of the Move.lock file instead of wiping and recreating the file with every invocation. You might notice some format changes, but you shouldn't manually edit the file.

Copy link

vercel bot commented Mar 21, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
explorer ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm
multisig-toolkit ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm
mysten-ui ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm
sui-core ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm
sui-kiosk ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm
sui-typescript-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 21, 2024 8:28pm

Comment on lines +203 to +210
pub fn update_dependency_graph(
file: &mut LockFile,
manifest_digest: String,
deps_digest: String,
dependencies: Option<toml_edit::Value>,
dev_dependencies: Option<toml_edit::Value>,
packages: Option<ArrayOfTables>,
) -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates the relevant TOML values for the dependency graph in file without affecting any other part of the contents.

Currently, file passed here is always generated from scratch (as before). But in upcoming changes, file will be an existing lock file, e.g., containing automated address management info, that will now remain intact when the dependency graph is updated via this function.

Comment on lines 1237 to 1246
if !writer.is_empty() {
let toml = writer.parse::<toml_edit::Document>()?;
if let Some(value) = toml.get("dependencies").and_then(|v| v.as_value()) {
dependencies = Some(value.clone());
}
if let Some(value) = toml.get("dev-dependencies").and_then(|v| v.as_value()) {
dev_dependencies = Some(value.clone());
}
packages = toml["move"]["package"].as_array_of_tables().cloned();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This takes the dependency graph info as a string (in writer) and parses it with toml_edit to obtain TOML values. Those TOML values preserve the hand-rolled formatting for the most part, which is the reason for this dance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these indexing operations for ["move"] and ["package"] fail? (Do we have a test for that?)

Copy link
Contributor Author

@rvantonder rvantonder Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good call out. There's an invariant here in the string construction before this code: if writer is a non-empty string, it means dependencies or dev-dependencies exist. If either of these exist, [[move.package]] exists.

In get_graph, parsing dependencies or dev-dependencies without [[move.package]] fails, so there's no way to trigger this with an incomplete Move.lock.

I have converted it to a safe access anyway. So that, if the invariant is broken we'll see snapshot tests reflect the issue, rather than a index panic.

Comment on lines +1248 to +1255
use std::io::Seek;
let mut lock = LockFile::new(
install_dir,
self.manifest_digest.clone(),
self.deps_digest.clone(),
)?;
lock.flush()?;
lock.rewind()?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still unconditionally create a Move.lock from scratch as before. But this time, we will update that Move.lock with the update_dependency_graph function and edit in-place (right after this comment).

We rewind the lock file cursor so that it's ready to be read by the update_dependency_graph function. This is how all the update_* functions are set up to work in schema.rs (i.e., they expect a file handle they can read a lock string from).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to just work with strings (or some other in-memory representation) everywhere instead of the file that you need to keep rewinding, then at the end, you can put it back into the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like something like that, but it's tricky. In effect we'd need to thread that string along wherever we'd like to avoid a write to the file, which is intimidating--mostly because the control flow is tricky, and the points at which various things care to commit or write to the file is spread out over that control flow (meaning the callers will then read/write anyway).

I'm happy to try rework this later, but consistent updates to the lock file path this way abstracts away from the callers and avoids passing a string around. Because one thing I think doesn't work is to have, e.g., dependency_graph or any other component work with strings be tempted to decide "ok since I have a string I care about, I'm writing to the lock file right here right now!". If we can have updates go through a predictable interface (lockfile schema update* functions) then we can find/reason about stateful effects to the lock (with find-references).

Again I may revisit and have these return strings, it might make sense, though there's also a chance that change will clutter up callers with needing to write to disk at various places.

Copy link
Member

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I agree that this seems like the right thing to do. Left some questions -- the one about the potential for the indexing operations failing is worth looking at but otherwise nothing blocking!

Comment on lines 1237 to 1246
if !writer.is_empty() {
let toml = writer.parse::<toml_edit::Document>()?;
if let Some(value) = toml.get("dependencies").and_then(|v| v.as_value()) {
dependencies = Some(value.clone());
}
if let Some(value) = toml.get("dev-dependencies").and_then(|v| v.as_value()) {
dev_dependencies = Some(value.clone());
}
packages = toml["move"]["package"].as_array_of_tables().cloned();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these indexing operations for ["move"] and ["package"] fail? (Do we have a test for that?)

self.deps_digest.clone(),
)?;
let mut writer = BufWriter::new(&*lock);
use fmt::Write;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would personally just put the trait use statements at the top-level (also applies to Seek below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure! In this case, there is already a io::Write at the top that conflicts. I'm not sure there's an elegant way to overcome that?

Comment on lines +1248 to +1255
use std::io::Seek;
let mut lock = LockFile::new(
install_dir,
self.manifest_digest.clone(),
self.deps_digest.clone(),
)?;
lock.flush()?;
lock.rewind()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to just work with strings (or some other in-memory representation) everywhere instead of the file that you need to keep rewinding, then at the end, you can put it back into the file?

@rvantonder rvantonder merged commit 3cdb1f3 into main Mar 21, 2024
41 checks passed
@rvantonder rvantonder deleted the rvt/xyz-toml-edit-dep branch March 21, 2024 20:59
leecchh pushed a commit that referenced this pull request Mar 27, 2024
## Description 

This PR makes it so that `write_to_lock` in dependency graph code only
populates the `Move.lock` with the relevant state it ought to, leaving
everything else in the lock file alone. This is in contrast to the
current "wipe everything, populate dependencies, and expect orthogonal
information to be populated afterwards on every invocation". This
applies to the following TOML values in `Move.lock`:

- `manifest_digest`
- `deps_digest`
- `dependencies` (for root package)
- `dev-dependencies` (for root package)
- `[[move.packages]]` array of table

## Test Plan 

Updated tests. Lock file re-generation may remove empty lines in
`Move.lock` and is a user-visible change, though not semantically
meaningful.

---
If your changes are not user-facing and do not break anything, you can
skip the following section. Otherwise, please briefly describe what has
changed under the Release Notes section.

### Type of Change (Check all that apply)

- [ ] protocol change
- [x] user-visible impact
- [ ] breaking change for a client SDKs
- [ ] breaking change for FNs (FN binary must upgrade)
- [ ] breaking change for validators or node operators (must upgrade
binaries)
- [ ] breaking change for on-chain data layout
- [ ] necessitate either a data wipe or data migration

### Release notes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants