-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multipath-tools 0.10.1 #17
base: master
Are you sure you want to change the base?
Conversation
The @v1 releases are deprecated. https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/ Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
We can't use "container:" any more because upload-artifact@v4 doesn't work on Debian Jessie. Signed-off-by: Martin Wilck <mwilck@suse.com>
Running "make" in a container and "make clean" outside doesn't work (access right issues). So just use separate jobs. Signed-off-by: Martin Wilck <mwilck@suse.com>
Avoid "text file busy" error on GitHub. dmevents-test: Text file busy Makefile:74: recipe for target 'dmevents.out' failed Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
The ABI should never change on a stable branch. This workflow asserts that. Signed-off-by: Martin Wilck <mwilck@suse.com>
dm_get_maps() traverses the entire list of dm maps. We shouldn't give up just because probing a single map failed. Fixes: bf3a4ad ("libmultipath: simplify dm_get_maps()") Signed-off-by: Martin Wilck <mwilck@suse.com>
We import DM_COLDPLUG_SUSPENDED in all code flows below mpath_coldplug_end. Clarify this in the code. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
Since b22c273 ("11-dm-mpath.rules: Don't force activation while device is suspended"), we've handled the case where a device is suspended while an uevent is processed (e.g. because multipathd is reloading the map again at the same time). But we were missing the case where The device had never been initialized before. If .MPATH_DEVICE_READY_OLD was empty, we'd jump to scan_import without setting MPATH_DEVICE_READY to 0. This can cause a device not to be fully activated at boot time, because in follow-up uevents we'd assume that the device had already been set up. Treat the case in which an uevent is processed for a previously not fully set-up, suspended device like other situations where we set MPATH_DEVICE_READY to 0. Fixes: b22c273 ("11-dm-mpath.rules: Don't force activation while device is suspended") Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
Our code is always setting MPATH_UNCHANGED and DM_ACTIVATION in pairs. While DM_ACTIVATION is a global DM property, MPATH_UNCHANGED is owned by us. Just set MPATH_UNCHANGED, and adapt DM_ACTIVATION when necessary just in one place. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
We set .DM_NOSCAN above where we set DM_UDEV_DISABLE_OTHER_RULES_FLAG, too. If the latter isn't set, .DM_NOSCAN can't be set. Skip the redundant test. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
When multipath reloads a device or fails or restores a path, the udev rules disable LVM scanning, but since .DM_NOSCAN isn't set, blkid is still run on the device. When multipath devices that are set to queue_if_no_path lose all their paths at close to the same time, udev workers can hang trying to run blkid. The blkid results shouldn't change when multipathd is adding, removing, failing or reinstating paths, aside from avoiding hanging udev processes, we're skipping unnecessary work. Hence, set .DM_NOSCAN if MPATH_UNCHANGED is set, to avoid blkid from being called in 13-dm.rules. Suggested-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
If pp->dev_loss is DEV_LOSS_TMO_UNSET and min_dev_loss is 0 (which is the case if no_path_retry is NO_PATH_RETRY_FAIL or NO_PATH_RETRY_UNDEF), we will set pp->dev_loss to 0, which is wrong. Fix it. Fixes: 058b5f5 ("libmultipath: fix dev_loss_tmo even if not set in configuration") Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
If reload_and_sync_map() removes the multipath device, deferred_failback_tick() needs to decrement the counter so that it doesn't skip the following device. Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reported by coverity: "i--" may cause an underflow, which will again cause an overflow when the loop continues. Use a signed int for loops like this to make coverity happy. Signed-off-by: Martin Wilck <mwilck@suse.com>
Avoid the following error with clang 19: msort.c:268:27: error: cast from '__compar_fn_t' (aka 'int (*)(const void *, const void *)') to '__compar_d_fn_t' (aka 'int (*)(const void *, const void *, void *)') converts to incompatible function type [-Werror,-Wcast-function-type-mismatch] 268 | return msort_r (b, n, s, (__compar_d_fn_t)cmp, NULL); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Looks good to me. |
Actually, I do have some thoughts. First, I'm not sure we need to update News.md for the stable branch releases. You're certainly welcome to, and I can see how it would save distro maintainers some time if they wanted an overview of the changes. But I think simply bumping the version number for each pull request should be all that's really necessary, if you want to skip some extra work. Also, looking at this release makes it obvious that it's not always going to be easy to come up with a good "Fixes" commit trailer for some of the fixes. Without these, it won't be as easy as I hoped for people to be able to check if fix is applicable to a previous stable branch. Although I expect that if a fix is only applicable in the latest stable branch, then it should be pretty easy to come up |
Updating NEWS.md is a minor effort, because I just copy/paste the relevant parts from the major release. The fact that I did it here doesn't mean I am committing myself to do it always in the future, but I think it wouldn't cost me much time. About the "Fixes" trailers, I am not sure I understand your remark. Did you expect me to add "Fixes:" trailers to the bug fix commits that I'd identified? I opted to cherry-pick the 0.11.0 commits unchanged. IMO, if I add "Fixes:" here, I should do it in 0.11.0 as well (meaning that I need to rebase my "queue" branch, but I guess I'll have to do that anyway when I include the currently pending fixes). In the future, we should agree to add "Fixes:" trailers to bug fix patches sent to dm-devel in the first place, but we haven't done so in the past (at least these trailers haven't been mandatory, we've been adding them rather sporadically). Or were you referring to the other commits (like the github workflow commits) that don't fix actual bugs at all? |
No. I was expecting just what you did. We shouldn't be unnecessarily mucking with commit messages for the stable branch. I was just noticing that a number of the actual bug fixes didn't include "Fixes" trailers, and for some of them it's not even obvious what you would put there. They are (or at least could be) fixing issues that were introduced in multiple commits or issues that have existed long enough that the code has been refactored multiple times since the bug was introduced. The code may have originally been correct, and then you would need to track down the code change that caused it to stop being the correct thing to do. I was hoping that downstream users who wanted to stick with a stable branch that we are no longer maintaining could just look through all of subsequent stable branches and pull out the fixes applicable to their branch. The easiest way to do that would be to look at the Fixes trailers, and see if their branch includes the referenced commit. But like I said, it's not always obvious what you would put there. I agree that we should probably make a better effort to label bugfixes with a Fixes trailer when they are initially committed, but it can probably be a best effort kind of thing. If the bug has existed for years and the code has been tweaked and refactored multiple times since the initial bug was introduced, I don't think we should require git spelunking to find the initial breaking commit. But in that case, the bug has been around for years so finding the initial commit isn't that important. I suppose we could helpfully add something
To indicate that the bug was been there since at least 0.6.4. For commits that fix bugs introduced in multiple commits, we can probably just get by with a best effort to find the oldest breaking commit and mention that there are others.
Nope. Things like those don't need any trailers. |
@bmarzins discussion about future improvements and fixes tags aside, am I assuming correctly that you would ack this PR if I made it on opensvc?
Note that since 0.10.0, I have tried to point out in NEWS.md in which previous version a given bug had been introduced. I did not put the commit IDs there though because NEWS.md should also be readable by non-technical users. I haven't spent this effort for earlier releases, but still the "Bug fixes" section in NEWS.md should help downstream users. |
Yeah. My thoughts were just thoughts, not objections to the PR. |
c287753
to
f23bced
Compare
I've pushed the bugfix part of my systemd watchdog patch. I hope this is fine for you. The 0.11.0 solution is not suitable for the stable branch IMO. |
WATCHDOG_USEC may be set to 0, which means that the watchdog is disabled in systemd. Fixes: 9366cfb ("multipathd: Implement systemd watchdog integration") Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
f23bced
to
c435868
Compare
On systems with LVM volumes, "multipath -ll" will spit out lots of messages like libmp_mapinfo: map vg-lv0 has multiple targets which is irritating. Reduce the log level of these messages to 3, as they are harmless most of the time. This is a backport of e8949c2 ("libmultipath: reduce log level of libmp_mapinfo() messages") from the master branch. We can't apply exactly the same fix because the stable branch is missing 8c772d3 ("libmultipath: check DM UUID earlier in libmp_mapinfo__"). Signed-off-by: Martin Wilck <mwilck@suse.com>
de4efb7
to
b2642d2
Compare
_find_controllers() needs to free the udev device if it doesn't get added to a path. Otherwise it can leak memory whenever check_foreign() is called, causing multipathd's memory usage to continually grow. Fixes: 7b47762 ("libmultipath: nvme: fix path detection for kernel 4.16") Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Reviewed-by: Martin Wilck <mwilck@suse.com> (cherry picked from commit 5a3d334)
In function io_err_stat_handle_pathfail(), path->io_err_dis_reinstate_time is set to 0 to enqueue path to io error check as soon as possible. But multipathd can not do it within marginal_path_err_recheck_gap_time seconds after power-on, because curr_time is less than marginal_path_err_recheck_gap_time. To handle the early marginal path, we can enqueue path when io_err_dis_reinstate_time is 0. Signed-off-by: chenrenhui <chenrenhui1@huawei.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> Reviewed-by: Martin Wilck <mwilck@suse.com> > (cherry picked from commit a1e3cf2)
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Signed-off-by: Martin Wilck <mwilck@suse.com>
multipath-tools 0.10.1, 2024/11
This is the first bug fix release on the
stable-0.10.y
branch. It containsbug fixes from 0.11.0, and some CI-related fixes.
Bug fixes
of device mapper devices, in particular devices with multiple DM targets.
The problem was introduced in 0.10.0.
Fixes #102.
activated during boot if a cold plug uevent is processed for a previously
not configured multipath map while this map was suspended. This problem existed
since 0.9.8.
no_path_retry fail
and no settingfor
dev_loss_tmo
might get thedev_loss_tmo
set to 0, causing thedevice to be deleted immediately in the event of a transport disruption.
This bug was introduced in 0.9.6.
(
failback
value > 0 inmultipath.conf
), some maps might fail back laterthan configured. The problem existed since 0.9.6.
WATCHDOG_USEC
environment variable had the value "0", which means that thewatchdog is simply disabled. This (minor) problem existed since 0.4.9.
0.7.8.
the io error check for a recently failed path to be delayed. This bug
existed since 0.7.4.