Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add file locks to prevent git checkout races #3484

Closed
wants to merge 29 commits into from
Closed

Conversation

eightfilms
Copy link
Contributor

@eightfilms eightfilms commented Dec 1, 2022

Closes #2603
Related #919

Introduces Filesystem abstraction as well as locking. Mostly inspired by cargo's flock but using file-lock instead.

EDIT: We've abandoned the attempt to introduce Filesystem abstraction (see comment below). This PR will introduce locks only, and instead of file-lock, we will use fd-lock.

Similar to cargo we open a .forc-lock file at the root of the directory we want to lock - this is unlocked on Drop.

Tested this on a fork of fuels-rs in my personal repo here. This is using the same setup in the issue raised here: #2603

On first glance in terms of speed, it doesn't seem to offer much improvement over the time spent building contracts when ran with -n 1, at least from just glancing at the duration of the build step in the fuels-rs repo. Improvements in speed aside, introducing fd-lock at least seems to allow concurrent builds.

@eightfilms eightfilms added the forc label Dec 1, 2022
@eightfilms eightfilms self-assigned this Dec 1, 2022
@eightfilms eightfilms marked this pull request as ready for review December 1, 2022 12:08
@eightfilms eightfilms requested a review from a team December 1, 2022 12:08
Copy link
Contributor

@mitchmindtree mitchmindtree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this @bingcicle!

I'm not sure the Filesystem abstraction is actually providing us much extra safety here, as its trivial for future contributions to ignore the lock and read/write files as they see fit, and Rust doesn't give us any other way to guard against I/O 🤔

That said, I'm also not sure we really need that level of safety at the moment, as our point of contention appears to be in a single place - git checkouts.

What are your thoughts on removing forc-fs (and the Filesystem abstraction), and simply using fslock or fd-lock directly within fetch_git? We should still link your commit with forc-fs to #919 as it might make for a good foundation if we need a more sophisticated Filesystem abstraction in the future.

description = "Filesystem abstraction for Forc"

[dependencies]
file-lock = "2.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what prompted the choice to use file-lock here? It looks like there are a few alternatives that have a lot more use like fslock or fd-lock - maybe it's worth switching to one of these more established options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I saw fslock but not fd-lock. file-lock seemed at the time to me was more mature than fslock, but it seems like fd-lock is a better alternative since its from a core maintainer. Will re-write using that.

let path = git_commit_path(name, &pinned.source.repo, &pinned.commit_hash);
// Checkout the pinned hash to the path.
with_tmp_git_repo(fetch_id, name, &pinned.source, |repo| {
// Change HEAD to point to the pinned commit.
let id = git2::Oid::from_str(&pinned.commit_hash)?;
repo.set_head_detached(id)?;

let _lock = path.open_rw(".forc-lock")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a const.

@eightfilms
Copy link
Contributor Author

eightfilms commented Dec 2, 2022

I'm not sure the Filesystem abstraction is actually providing us much extra safety here, as its trivial for future contributions to ignore the lock and read/write files as they see fit, and Rust doesn't give us any other way to guard against I/O

Yeah, while looking into locks it seems like advisory locks are only as strong as the maintainers choose to abide by them. We would have to enforce using forc-fs and extending that over using the standard fs. From looking through the code that forc-fs could replace there weren't much places that would benefit from abstraction as well (for now).

What are your thoughts on removing forc-fs (and the Filesystem abstraction), and simply using fslock or fd-lock directly within fetch_git? We should still link your commit with forc-fs to #919 as it might make for a good foundation if we need a more sophisticated Filesystem abstraction in the future.

This sounds good, I'll edit the original PR content to close only #2603 then and link it to #919 instead.

@eightfilms eightfilms changed the title introduce filesystem abstraction in the form of forc-fs add file locks to prevent git checkout races Dec 2, 2022
fs::OpenOptions::new()
.write(true)
.create(true)
.open(path.join(FORC_FILE_LOCK_NAME))?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this expression into a re-usable function? E.g.

/// Given the path to a directory, lock the directory using a file lock.
///
/// The directory is "unlocked" when the returned `RwLock` is dropped.
///
/// This was introduced to avoid racing on the creation of git checkout directories.
fn lock_dir(dir_path: &Path) -> io::Result<RwLock<File>> {
    let file = fs::OpenOptions::new()
        .write(true)
        .create(true)
        .open(path.join(FORC_FILE_LOCK_NAME))?;
    Ok(RwLock::new(file))
}

);
let _ = lock.write()?;

let _ = std::fs::remove_dir_all(&repo_dir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, if we remove the directory here, don't we also remove our lock file?

Copy link
Contributor Author

@eightfilms eightfilms Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably convert this to a draft - I've realized now that in my original PR it was incorrectly implemented as well. Also, file-lock is based on fnctl() which is different from fd-lock which is based on flock(). For whatever reason initializing the lock at the same line did not work for fd-lock - still looking into this and trying to understand how flock() compares against fnctl().

This was me just moving the lock around to see which would work

.open(path.join(FORC_FILE_LOCK_NAME))?,
);
let _ = lock.write()?;
let _ = std::fs::remove_dir_all(&path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here?

@eightfilms eightfilms marked this pull request as draft December 5, 2022 03:18
mitchmindtree added a commit that referenced this pull request Dec 14, 2022
This introduces an advisory lock around the re-usable git checkout
repo directories.

Closes #2603, #3484.
mitchmindtree added a commit that referenced this pull request Dec 14, 2022
This introduces an advisory lock around the re-usable git checkout
repo directories.

Closes #2603, #3484.
@eightfilms
Copy link
Contributor Author

Closing this in favor of @mitchmindtree 's #3592.

@eightfilms eightfilms closed this Dec 15, 2022
@eightfilms eightfilms deleted the bing/forc-fs branch December 15, 2022 09:31
mitchmindtree added a commit that referenced this pull request Dec 15, 2022
This introduces an advisory lock around the re-usable git checkout repo
directories.

Closes #2603, #3484.

Unfortunately, it's a little tricky to reliably test this from this
project repository as doing so requires creating a test project with one
or more `git` dependencies. We can't reliably add git dependencies (that
point to external repos) to projects within this repo as the dependency
would necessarily be at least one commit of forc behind, and as a result
it's likely that these dependencies would break at some point in the
future, e.g. next time forc makes some breaking change.

@segfault-magnet, @hal3e would you mind testing this branch in your
use-case mentioned in #2603 to make sure we've covered the issue?

Edit: I've managed to test this locally by adding some long `sleep`s
immediately after acquiring the lock guard, and then spinning up
multiple processes of `forc` locally to build a toy project with
multiple git dependencies. All appears to work fine, with each process
blocking and waiting its turn for access to the checkout repo.

Co-authored-by: bing <bingcicle@proton.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running multiple instances of forc concurrently can (very occasionally) cause a git checkout race
2 participants