Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix accounting error for pending sync IO operations in zpool iostat #15478

Merged
merged 1 commit into from
Nov 7, 2023

Conversation

MigeljanImeri
Copy link
Contributor

@MigeljanImeri MigeljanImeri commented Nov 1, 2023

Motivation and Context

Currently, when using zpool iostat, the pending sync IO operations are not shown properly, rather either a 1 or a 0 is displayed. This is due to vdev_queue_class_length checking whether the list is empty, rather then checking the length of the the list. This issue was noticed when doing performance evaluations with the O_DIRECT PR (#10018), as all IO's are issued with sync priority through the VDEV queues.

Description

I added a counter variable to vdev_queue_t to keep track of the pending sync IO operations and changed vdev_queue_class_length to now reference this variable.

How Has This Been Tested?

The changes were tested by using fio to generate sync IO operations. Without these changes the pending sync IO operations showed either 1 or 0, while with the changes the pending sync IO operations showed as expected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin
Copy link
Member

amotin commented Nov 2, 2023

While it is kind of nice from point of being consistent with AVL-trees, lists are used in tons of places, there it would be a waste of 8 bytes per list, plus a dirty cache line for most of modifications, and all of it just for (yes, useful, but) cosmetic issue. I wonder how often that stat is required and whether it would be cheaper to count the elements (may be not, just a thought).

I could feel some better if in process we'd also remove list_size from field there, used on Linux and FreeBSD only for a single assertion (I wonder if it was used on Illumos by debugger). And/or if we use it for something else, may be we could balance ARC and dbuf cache evictions using it or something like that.

@bwatkinson
Copy link
Contributor

While it is kind of nice from point of being consistent with AVL-trees, lists are used in tons of places, there it would be a waste of 8 bytes per list, plus a dirty cache line for most of modifications, and all of it just for (yes, useful, but) cosmetic issue. I wonder how often that stat is required and whether it would be cheaper to count the elements (may be not, just a thought).

I could feel some better if in process we'd also remove list_size from field there, used on Linux and FreeBSD only for a single assertion (I wonder if it was used on Illumos by debugger). And/or if we use it for something else, may be we could balance ARC and dbuf cache evictions using it or something like that.

Part of the reason for this PR was doing investigations with small recordsize performance with the O_DIRECT PR (#10018). I agree that an unfortunate side effect of this would be inflating such a commonly used data structure by 8 bytes. It would make more sense if the length could also be leveraged in other parts of the code. Another alternative is this counter could just as easily be placed in the vdev_queue_t struct and handled entirely in vdev_queue.c. Do you see that as a preferred option?

@behlendorf
Copy link
Contributor

Another alternative is this counter could just as easily be placed in the vdev_queue_t struct and handled entirely in vdev_queue.c. Do you see that as a preferred option?

Placing the length in the vdev_queue_t seems preferable to me in this particular instance to keep the list_t as lean and lightweight as possible.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Nov 2, 2023
@MigeljanImeri
Copy link
Contributor Author

Another alternative is this counter could just as easily be placed in the vdev_queue_t struct and handled entirely in vdev_queue.c. Do you see that as a preferred option?

Placing the length in the vdev_queue_t seems preferable to me in this particular instance to keep the list_t as lean and lightweight as possible.

I have moved the counter to vdev_queue_t and have also changed the type to ulong_t.

@MigeljanImeri MigeljanImeri changed the title Fix accounting error for pending sync reads in zpool iostat Fix accounting error for pending sync IO operations in zpool iostat Nov 2, 2023
@amotin
Copy link
Member

amotin commented Nov 2, 2023

I'd be OK about it, just we do not need a counter for every priority level, only for those that use lists. We could fit it into current size would we do:

diff --git a/sys/contrib/openzfs/include/sys/vdev_impl.h b/sys/contrib/openzfs/include/sys/vdev_impl.h
index ad9dc3aefd8e..f2cc4d7e53f2 100644
--- a/sys/contrib/openzfs/include/sys/vdev_impl.h
+++ b/sys/contrib/openzfs/include/sys/vdev_impl.h
@@ -131,7 +131,10 @@ typedef const struct vdev_ops {
  * Virtual device properties
  */
 typedef union vdev_queue_class {
-       list_t          vqc_list;
+       struct {
+               ulong_t         vqc_list_numnodes;
+               list_t          vqc_list;
+       };
        avl_tree_t      vqc_tree;
 } vdev_queue_class_t;
 

, combined with list_size removal I mentioned.

@MigeljanImeri
Copy link
Contributor Author

I'd be OK about it, just we do not need a counter for every priority level, only for those that use lists. We could fit it into current size would we do:

diff --git a/sys/contrib/openzfs/include/sys/vdev_impl.h b/sys/contrib/openzfs/include/sys/vdev_impl.h
index ad9dc3aefd8e..f2cc4d7e53f2 100644
--- a/sys/contrib/openzfs/include/sys/vdev_impl.h
+++ b/sys/contrib/openzfs/include/sys/vdev_impl.h
@@ -131,7 +131,10 @@ typedef const struct vdev_ops {
  * Virtual device properties
  */
 typedef union vdev_queue_class {
-       list_t          vqc_list;
+       struct {
+               ulong_t         vqc_list_numnodes;
+               list_t          vqc_list;
+       };
        avl_tree_t      vqc_tree;
 } vdev_queue_class_t;
 

, combined with list_size removal I mentioned.

Okay, I see that makes sense. I have gone ahead and made those changes.

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but I didn't mean list_size removal in the same commit. It could be separate for readability.

Besides similar struct avl_tree we could possibly leave it in user-space for some libzfs ABI compatibility. I haven't checked, but I guess it may be used somewhere in public structures. Or we could remove them both at one time.

Currently vdev_queue_class_length is responsible for checking how long
the queue length is, however, it doesn't check the length when a list
is used, rather it just returns whether it is empty or not. To fix this
I added a counter variable to vdev_queue_class to keep track of the sync
IO ops, and changed vdev_queue_class_length to reference this variable
instead.

Signed-off-by: MigeljanImeri <ImeriMigel@gmail.com>
@MigeljanImeri
Copy link
Contributor Author

Okay, I see I've removed the list_size removal from the commit.

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will you open a separate PR for that, or you'd like me to?

@MigeljanImeri
Copy link
Contributor Author

Thanks. Will you open a separate PR for that, or you'd like me to?

I can open a separate PR, should I keep list_size in the user space list or remove it from both?

@amotin
Copy link
Member

amotin commented Nov 7, 2023

I can open a separate PR, should I keep list_size in the user space list or remove it from both?

I've tried to search ZFS libraries for lists use and found only implementation in libspl, nothing else. But just to be safer I think we could do it as two commits in one PR, one only for kernel and one for user-space. This way the first could be merged into 2.2, while the second remain in master till 2.3. @behlendorf what do you think?

@behlendorf
Copy link
Contributor

Agreed. Let's keep these changes in two separate commits to make sure we don't break the ABI. We might have code which embeds a list_t in some public structure. Although on initial grep I don't see anything obvious.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Nov 7, 2023
@behlendorf behlendorf merged commit 2a154b8 into openzfs:master Nov 7, 2023
24 of 25 checks passed
ixhamza pushed a commit to truenas/zfs that referenced this pull request Nov 20, 2023
Currently vdev_queue_class_length is responsible for checking how long
the queue length is, however, it doesn't check the length when a list
is used, rather it just returns whether it is empty or not. To fix this
I added a counter variable to vdev_queue_class to keep track of the sync
IO ops, and changed vdev_queue_class_length to reference this variable
instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <ImeriMigel@gmail.com>
Closes openzfs#15478
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
Currently vdev_queue_class_length is responsible for checking how long
the queue length is, however, it doesn't check the length when a list
is used, rather it just returns whether it is empty or not. To fix this
I added a counter variable to vdev_queue_class to keep track of the sync
IO ops, and changed vdev_queue_class_length to reference this variable
instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <ImeriMigel@gmail.com>
Closes openzfs#15478
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 18, 2024
Removed the list_size struct member as it was only used in a single assertion,
as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 18, 2024
Removed the list_size struct member as it was only used in a single assertion,
as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 24, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 24, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 25, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
MigeljanImeri added a commit to MigeljanImeri/zfs that referenced this pull request Jan 25, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
behlendorf pushed a commit that referenced this pull request Jan 26, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR #15478.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
Closes #15812
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 13, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
Closes openzfs#15812
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 13, 2024
Removed the list_size struct member as it was only used in a single
assertion, as mentioned in PR openzfs#15478.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: MigeljanImeri <imerimigel@gmail.com>
Closes openzfs#15812
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants