
Let zfs mount all tolerate in-progress mounts #8881

Merged: 1 commit merged into openzfs:master on Jun 22, 2019

Conversation

don-brady (Contributor)

Motivation and Context

The zfs-mount service can unexpectedly fail to start when zfs encounters a mount that is in progress. This service uses zfs mount -a, which has a window between the time it checks whether a dataset is mounted and the time the actual mount (via the mount.zfs binary) occurs.
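
In code terms, the window looks roughly like this (an illustrative sketch, not the actual libzfs source; exec_mount_zfs() is a hypothetical stand-in for the code path that forks the mount.zfs binary):

    /* Sketch of the check-then-mount race in `zfs mount -a`. */
    if (!zfs_is_mounted(zhp, NULL)) {        /* check: not mounted yet */
            /* window: a concurrent `zfs mount` can mount it here */
            error = exec_mount_zfs(zhp);     /* act: forks mount.zfs */
    }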

Description

A simple way to demonstrate this in-progress mount window:

$ for i in {1..20}; do sudo zfs create dozer/ds$i; done
$ sudo zfs unmount dozer/ds20
$ sudo zfs mount -a & sudo zfs mount -a &
[1] 8568
[2] 8569
$
filesystem 'dozer/ds20' is already mounted
cannot mount 'dozer/ds20': mountpoint or dataset is busy
[1]-  Exit 1                  sudo cmd/zfs/zfs mount -a
[2]+  Done                    sudo cmd/zfs/zfs mount -a

The suggested solution is to check whether the failure was due to an in-progress mount and, if so, not treat it as an error.
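
A minimal sketch of that idea, with hypothetical helper names (do_mount() standing in for the code that performs the mount, zfs_is_mounted() for the libzfs mounted-state check) and assuming the failed mount leaves EBUSY in errno:

    /* Sketch only: tolerate losing the race to a concurrent mount. */
    error = do_mount(zhp, mntpt, flags);
    if (error != 0 && errno == EBUSY && zfs_is_mounted(zhp, NULL)) {
            /* Another process completed the mount first; not an error. */
            error = 0;
    }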

How Has This Been Tested?

Manually tested the simple case described above to confirm zfs mount -a is working as intended.

Tested with a suite of automated tests that was routinely encountering this error. A journalctl audit confirms that the zfs-mount service no longer fails to start when it encounters an in-progress mount.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Performance enhancement (non-breaking change which improves efficiency)
  • [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

Signed-off-by: Don Brady <don.brady@delphix.com>
don-brady added the Status: Code Review Needed (Ready for review and testing) label on Jun 11, 2019
codecov bot commented Jun 11, 2019

Codecov Report

Merging #8881 into master will increase coverage by 0.1%.
The diff coverage is 50%.


@@            Coverage Diff            @@
##           master    #8881     +/-   ##
=========================================
+ Coverage   78.65%   78.76%   +0.1%     
=========================================
  Files         382      382             
  Lines      117791   117785      -6     
=========================================
+ Hits        92652    92771    +119     
+ Misses      25139    25014    -125
Flag     Coverage Δ
#kernel  79.32% <ø> (+0.06%) ⬆️
#user    67.49% <50%> (+0.44%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b873825...461db88.

behlendorf (Contributor) left a comment

I think including this additional resiliency makes good sense. But I'd like to better understand why you were seeing concurrent mounts at all: zfs mount -a should only be run once, by the zfs-mount service. Is it conflicting with an unrelated zfs mount in your environment?

behlendorf added the Status: Accepted (Ready to integrate: reviewed, tested) label and removed the Status: Code Review Needed (Ready for review and testing) label on Jun 12, 2019
behlendorf (Contributor)

As I understand it, the reason we're seeing these racing mounts is that both the zfs-mount.target and zfs-share.target are allowed to execute concurrently after the import. This is likely more of an issue with the relatively recent addition of parallel mounting.

@aerusso I was hoping to get your thoughts on the best way to handle this from a systemd perspective. We used to run the zfs-share.target as After=zfs-mount.service, but it was converted to Wants=zfs-mount.service in commit 80b4852. Does it make sense to consider converting it back, or will this break some other use case?
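
For reference, a hedged illustration of the ordering question (a unit-file fragment, not the exact shipped zfs-share unit): Wants= alone pulls in the mount service but imposes no startup ordering, so both units can run concurrently; an After= directive additionally serializes sharing behind mounting.

    [Unit]
    Description=ZFS share service (illustrative fragment)
    # Wants= pulls in zfs-mount.service but permits concurrent startup
    Wants=zfs-mount.service
    # After= orders this unit to start only once zfs-mount.service is done
    After=zfs-mount.service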

behlendorf merged commit a9cd8bf into openzfs:master on Jun 22, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 13, 2019
The zfs-mount service can unexpectedly fail to start when zfs
encounters a mount that is in progress. This service uses
zfs mount -a, which has a window between the time it checks if
the dataset was mounted and when the actual mount (via mount.zfs
binary) occurs.

The reason for the racing mounts is that both zfs-mount.target
and zfs-share.target are allowed to execute concurrently after
the import.  This is more of an issue with the relatively recent
addition of parallel mounting, and we should consider serializing
the mount and share targets.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes openzfs#8881
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 22, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 23, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 17, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
tonyhutter pushed a commit that referenced this pull request Sep 26, 2019
behlendorf added a commit to behlendorf/zfs that referenced this pull request May 25, 2020
behlendorf added a commit to behlendorf/zfs that referenced this pull request May 26, 2020
behlendorf added a commit that referenced this pull request May 26, 2020
This reverts commit a9cd8bf which introduced a segfault when running
`zfs mount -a` multiple times when there are mountpoints which are
not empty.  This segfault is now seen frequently by the CI after
the mount code was updated to directly call mount(2).

The original reason this logic was added is described in #8881.
Since then the systemd `zfs-share.target` has been updated to run
"After" the `zfs-mount.service`, which should avoid this issue.

Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9560
Closes #10364
as-com pushed a commit to as-com/zfs that referenced this pull request Jun 20, 2020 (cherry picked from commit d1b84da)
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
Labels
Status: Accepted (Ready to integrate: reviewed, tested)

4 participants