Use mutex to avoid concurrent btrfs operations #43
Merged
Might help with problems such as this:

```
[11030132.006555] INFO: task ocluster-worker:602217 blocked for more than 120 seconds.
[11030132.015596] Not tainted 5.4.0-40-generic #44-Ubuntu
[11030132.022547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11030132.032061] ocluster-worker D 0 602217 1 0x00004000
[11030132.032069] Call Trace:
[11030132.032092] __schedule+0x2e3/0x740
[11030132.032106] ? __switch_to_asm+0x40/0x70
[11030132.032116] ? __switch_to_asm+0x34/0x70
[11030132.032126] schedule+0x42/0xb0
[11030132.032130] schedule_preempt_disabled+0xe/0x10
[11030132.032132] __mutex_lock.isra.0+0x182/0x4f0
[11030132.032142] ? try_to_del_timer_sync+0x54/0x80
[11030132.032145] __mutex_lock_slowpath+0x13/0x20
[11030132.032148] mutex_lock+0x2e/0x40
[11030132.032199] btrfs_start_delalloc_roots+0x60/0x280 [btrfs]
[11030132.032238] flush_space+0x5dd/0x740 [btrfs]
[11030132.032281] ? lock_extent_buffer_for_io+0x370/0x370 [btrfs]
[11030132.032325] ? __clear_extent_bit+0x201/0x4a0 [btrfs]
[11030132.032372] priority_reclaim_metadata_space.isra.0+0x18b/0x220 [btrfs]
[11030132.032429] ? can_overcommit.part.0+0x5f/0xc0 [btrfs]
[11030132.032466] btrfs_reserve_metadata_bytes+0x578/0x950 [btrfs]
[11030132.032501] ? btrfs_truncate_inode_items+0x35e/0xdb0 [btrfs]
[11030132.032505] ? __mutex_lock.isra.0+0x429/0x4f0
[11030132.032557] ? __btrfs_block_rsv_release+0x1c1/0x300 [btrfs]
[11030132.032595] btrfs_block_rsv_refill+0x7d/0xa0 [btrfs]
[11030132.032628] evict_refill_and_join+0x39/0xd0 [btrfs]
[11030132.032670] btrfs_evict_inode+0x417/0x4c0 [btrfs]
[11030132.032689] evict+0xd2/0x1b0
[11030132.032698] iput+0x148/0x210
[11030132.032708] dentry_unlink_inode+0xc6/0x110
[11030132.032720] d_delete+0x76/0x80
[11030132.032727] vfs_rmdir+0x179/0x1a0
[11030132.032732] do_rmdir+0x18c/0x1c0
[11030132.032736] __x64_sys_rmdir+0x17/0x20
[11030132.032744] do_syscall_64+0x57/0x190
[11030132.032747] entry_SYSCALL_64_after_hwframe+0x44/0xa9
```
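For readers unfamiliar with the approach named in this PR's title, here is a minimal sketch of serialising btrfs commands behind a single mutex. It is written in Python for illustration and is not the code from this PR; the names `_btrfs_lock` and `run_btrfs`, and the injectable `runner` parameter, are hypothetical.

```python
import subprocess
import threading

# Hypothetical process-wide lock: at most one btrfs command runs at a time.
_btrfs_lock = threading.Lock()


def run_btrfs(args, runner=subprocess.run):
    """Run `btrfs <args>` while holding the lock.

    `runner` defaults to subprocess.run; it is a parameter only so the
    sketch can be exercised without a real btrfs filesystem.
    """
    with _btrfs_lock:
        # Other callers block here until the current command has finished.
        return runner(["btrfs", *args], check=True)
```

Callers in different threads can then invoke, for example, `run_btrfs(["subvolume", "delete", path])` without ever overlapping. Note that a lock like this only serialises commands issued by one process; it does nothing about other filesystem activity on the same volume.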
Didn't fix it:
talex5 added a commit to talex5/opam-repository that referenced this pull request on Dec 30, 2020
CHANGES:

- Add support for nested / multi-stage builds (@talex5 ocurrent/obuilder#48 ocurrent/obuilder#49).
  This allows you to use a large build environment to create a binary and then copy that into a smaller runtime environment. It's also useful to get better caching if two things can change independently (e.g. you want to build your software and also a linting tool, and be able to update either without rebuilding the other).
- Add healthcheck feature (@talex5 ocurrent/obuilder#52).
  - Checks that Docker is running.
  - Does a test build using busybox.
- Clean up left-over runc containers on restart (@talex5 ocurrent/obuilder#53).
  If btrfs crashes and makes the filesystem read-only then after rebooting there will be stale runc directories. New jobs with the same IDs would then fail.
- Remove dependency on dockerfile (@talex5 ocurrent/obuilder#51).
  This also allows us more control over the formatting (e.g. putting a blank line between stages in multi-stage builds).
- Record log output from docker pull (@talex5 ocurrent/obuilder#46).
  Otherwise, it's not obvious why we've stopped at a pull step, or what is happening.
- Improve formatting of OBuilder specs (@talex5 ocurrent/obuilder#45).
- Use seccomp policy to avoid unnecessary sync operations (@talex5 ocurrent/obuilder#44).
  Sync operations are really slow on btrfs. They're also pointless, since if the computer crashes while we're doing a build then we'll just throw it away and start again anyway. Use a seccomp policy that causes all sync operations to "fail", with errno 0 ("success"). On my machine, this reduces the time to `apt-get install -y shared-mime-info` from 18.5s to 4.7s. Use `--fast-sync` to enable the new behaviour (it requires runc 1.0.0-rc92).
- Use a mutex to avoid concurrent btrfs operations (@talex5 ocurrent/obuilder#43).
  Btrfs deadlocks enough as it is. Don't stress it further by trying to do two things at once.

Internal changes:

- Improve handling of file redirections (@talex5 ocurrent/obuilder#46).
  Instead of making the caller do all the work of closing the file descriptors safely, add an `FD_move_safely` mode.
- Travis tests: ensure apt cache is up-to-date (@talex5 ocurrent/obuilder#50).
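The seccomp item in the changelog above describes making sync operations "fail" with errno 0. As a rough illustration, the sketch below builds a seccomp section in the shape used by the OCI runtime spec (`defaultAction`, `syscalls`, `errnoRet`). The exact syscall list and the function name `fast_sync_seccomp` are assumptions for this example, not taken from OBuilder.

```python
import json

# Assumed list of sync-style syscalls to neutralise; the set actually
# used by OBuilder may differ.
SYNC_SYSCALLS = [
    "fsync", "fdatasync", "msync", "sync", "syncfs", "sync_file_range",
]


def fast_sync_seccomp():
    """Return a seccomp section that allows everything except the sync
    syscalls, which return immediately with errno 0 (reported as success)."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": SYNC_SYSCALLS,
                "action": "SCMP_ACT_ERRNO",
                "errnoRet": 0,  # the call is skipped but appears to succeed
            }
        ],
    }


if __name__ == "__main__":
    # In an OCI runtime config this section sits under the "linux" key.
    print(json.dumps({"linux": {"seccomp": fast_sync_seccomp()}}, indent=2))
```

The trade-off is the one the changelog states: data is no longer flushed on request, which is acceptable only because an interrupted build is discarded and restarted anyway.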