MTL-1811 Incorrect FSopt Fails to Mount Partitions in SLES15SP4 #38

Merged 1 commit on Jun 22, 2022

@rustydb rustydb commented Jun 22, 2022

Summary and Scope

Issue Type
  • Bugfix Pull Request

allocsize=13107 is invalid. It has been ignored (perhaps discarded) by xfsprogs in SLES15SP2 and SLES15SP3, but in SLES15SP4 this value prevents XFS from mounting.

SP3:

ncn-w003:~ # rpm -q --whatprovides /sbin/mkfs.xfs
xfsprogs-4.15.0-4.52.1.x86_64

SP4:

ncn-w001:~ # rpm -q --whatprovides /sbin/mkfs.xfs
xfsprogs-5.13.0-150400.1.9.x86_64

Somewhere between xfsprogs-4.15 and 5.13, enforcement of allocsize became stricter. It is not known whether our bad value was honored at all by previous versions of xfsprogs, but we know that with our current xfsprogs for SP4 it causes a fatal error.

This PR changes the value to 131072, a valid value which is 2^17.
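
The power-of-2 requirement can be checked before a value ever reaches a mount. The following is a minimal sketch, not code from this PR: the function names `is_power_of_two`, `get_allocsize`, and `check_fsopts` are hypothetical, the option string `noatime,allocsize=...` is only an illustration, and the helpers assume allocsize is given as a bare byte count with no `k`/`m`/`g` suffix.

```shell
# Hypothetical helpers for illustration; they are not part of this PR.

# Succeed (exit 0) only when $1 is a positive decimal integer that is a
# power of 2. Leading zeros are rejected so the arithmetic below is never
# parsed as octal.
is_power_of_two() {
    local n="$1"
    case "$n" in
        ''|0*|*[!0-9]*) return 1 ;;
    esac
    # A power of 2 has exactly one bit set, so n & (n - 1) is 0.
    [ $(( n & (n - 1) )) -eq 0 ]
}

# Print the allocsize value from a comma-separated option string.
# Returns 1 when the string has no allocsize option.
get_allocsize() {
    local rest="$1," opt
    while [ -n "$rest" ]; do
        opt="${rest%%,*}"     # first option in the remaining string
        rest="${rest#*,}"     # drop that option and its comma
        case "$opt" in
            allocsize=*)
                printf '%s\n' "${opt#allocsize=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Succeed when the option string has no allocsize, or a power-of-2 one.
# Assumption: the value is a bare byte count (no k/m/g suffix).
check_fsopts() {
    local size
    size="$(get_allocsize "$1")" || return 0
    is_power_of_two "$size"
}

# Compare the old and new values from this PR.
for size in 13107 131072; do
    if is_power_of_two "$size"; then
        echo "$size: power of 2"
    else
        echo "$size: not a power of 2"
    fi
done
```

With these helpers, `check_fsopts 'noatime,allocsize=13107'` fails and `check_fsopts 'noatime,allocsize=131072'` succeeds. 13107 sits between 2^13 (8192) and 2^14 (16384) and is odd, so it cannot be a power of 2.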

Prerequisites

  • I have included documentation in my PR (or it is not required)
  • I tested this on internal system (x) (if yes, please include results or a description of the test)

Idempotency

Risks and Mitigations

What is less risky, or more risky now - or if your mod fails is there a new risk?

@rustydb rustydb requested a review from a team as a code owner June 22, 2022 01:50
This option must be set to a power of 2, but for some time our allocsize
optimization has not been a power of 2. Somehow this incorrect option
has gone unnoticed, and XFS mounts have been working fine in both
SLES15SP2 and SLES15SP3. However, in SLES15SP4 our luck has run out:
this incorrect value (incorrect because it isn't a power of 2) prevents
XFS from mounting.

It would seem that this value was added incorrectly a long time ago and
went unnoticed. Most XFS optimization guides suggest 131072(k), and our
value is missing that trailing "2". 131072 is 2 to the power of 17 (in
other words, log2(131072) = 17), whereas log2(13107) is not an integer.

rustydb commented Jun 22, 2022

Confirmed to work: all k8s NCNs on redbull booted successfully with their partitions after baking this RPM into their images.

@heemstra heemstra left a comment

Seems fine, but it'd be great to see comments here explaining why we need each of these options, and in particular why this size for this option.

rustydb commented Jun 22, 2022

> Seems fine, but it'd be great to see comments here explaining why we need each of these options, and in particular why this size for this option.

It was a general XFS optimization. I'll do more digging and add details to the JIRA.

@rustydb rustydb merged commit 1068a6b into main Jun 22, 2022
@rustydb rustydb deleted the MTL-1811 branch June 22, 2022 14:09