Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing ABD struct allocation for FreeBSD #10431

Merged
merged 1 commit into from
Jun 16, 2020

Conversation

bwatkinson
Copy link
Contributor

In the event we are allocating a gang ABD in FreeBSD we are passsing 0
to abd_alloc_struct(); however, this led to an allocation of ABD scatter
with 0 chunks. This left the gang ABD allocation 24 bytes smaller than
it should have been.

Signed-off-by: Brian Atkinson batkinson@lanl.gov

Fix ABD allocation bug in FreeBSD for allocating a gang ABD.

Motivation and Context

Currently if a gang ABD is allocated in FreeBSD the allocation size (abd_size) was incorrect.

Description

When a gang ABD is allocated the size passed into abd_alloc_struct() is 0. This leads to an ABD allocation based on a 0 chunk count scatter ABD. This is incorrect and leads 24 unallocated bytes in the case of gang ABD's. We want the allocation to be based on the ABD struct size with the union accounted for for gang ABDs. FreeBSD was rounding this up to 32 (using a power of two), but Matt Macy was able to capture this issue with his async work. We now do an allocation in abd_struct_alloc() base on the MAX of the chunkcnt or the size of an ADB struct.

How Has This Been Tested?

Ran zloop.sh (30 mins) and ZTS tests.

FreeBSD 12.1 STABLE

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

@mattmacy
Copy link
Contributor

mattmacy commented Jun 9, 2020

A bit of clarification. This issue is never hit on FreeBSD under normal circumstances because kernel malloc will round up to the nearest power of two for sub page allocations (and thus 192 bytes actually translated to a 256 byte allocation). There is a purify / ElectricFence like debugging facility called memguard which will put new allocations size offset from the end of the page and, not map the adjacent pages, and then unmap the page that the allocation was on when the object was freed. This means that with memguard a 192 byte allocation is at page boundary + 3904, so when abd gang initialization tried to initailize the list_t at the end of the abd_t it would touch beyond the end of the mapped page. It would be unsurprising if OSX didn't do a similar rounding up and thus hit this issue.

@lundman
Copy link
Contributor

lundman commented Jun 9, 2020

Sure did;

SPL: buffer freed to wrong cache
SPL: buffer was allocated from kmem_cache_64
SPL: caller attempting free to kmem_cache_56

openzfsonosx/openzfs@2305003

Thanks for looking into this, I will give this commit a test without mine.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jun 9, 2020
@lundman
Copy link
Contributor

lundman commented Jun 10, 2020

I applied this with my abd_orig_size member, just to get some faster test result, here is the diff

abd patch

diff --git a/module/os/macos/zfs/abd_os.c b/module/os/macos/zfs/abd_os.c
index 7fc4b9ab2..9d321da74 100644
--- a/module/os/macos/zfs/abd_os.c
+++ b/module/os/macos/zfs/abd_os.c
@@ -191,11 +191,18 @@ abd_t *
 abd_alloc_struct(size_t size)
 {
        size_t chunkcnt = abd_chunkcnt_for_bytes(size);
-       size_t abd_size = offsetof(abd_t,
-           abd_u.abd_scatter.abd_chunks[chunkcnt]);
-       abd_t *abd = kmem_alloc(MAX(abd_size, sizeof(abd_t)), KM_PUSHPAGE);
+       /*
+        * In the event we are allocating a gang ABD, the size passed in
+        * will be 0. We must make sure to set abd_size to the size of an
+        * ABD struct as opposed to an ABD scatter with 0 chunks. The gang
+        * ABD struct allocation accounts for an additional 24 bytes as
+        * over a scatter ABD with 0 chunks.
+        */
+       size_t abd_size = MAX(sizeof (abd_t),
+               offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]));
+       abd_t *abd = kmem_alloc(abd_size, KM_PUSHPAGE);
        ASSERT3P(abd, !=, NULL);
-       abd->abd_orig_size = MAX(abd_size, sizeof(abd_t));
+       abd->abd_orig_size = abd_size;
        list_link_init(&abd->abd_gang_link);
        mutex_init(&abd->abd_mtx, NULL, MUTEX_DEFAULT, NULL);
        ABDSTAT_INCR(abdstat_struct_size, abd_size);
@@ -207,11 +214,13 @@ void
 abd_free_struct(abd_t *abd)
 {
        size_t chunkcnt = abd_is_linear(abd) ? 0 : abd_scatter_chunkcnt(abd);
-       int size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]);
+       int size = MAX(sizeof (abd_t),
+               offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]));
        mutex_destroy(&abd->abd_mtx);
        ASSERT(!list_link_active(&abd->abd_gang_link));

-       kmem_free(abd, abd->abd_orig_size);
+       VERIFY3U(size, ==, abd->abd_orig_size);
+       kmem_free(abd, size);
        ABDSTAT_INCR(abdstat_struct_size, -size);
 }

Running zpool_upgrade tests, which triggers it every time, I get:

panic(cpu 0): VERIFY3(size == abd->abd_orig_size) failed (120 == 128)

So it does not fix everything for macOS.

@lundman
Copy link
Contributor

lundman commented Jun 10, 2020

I applied the same size math to abd_alloc_scatter_offset_chunkcnt() but did not change the outcome.

@lundman
Copy link
Contributor

lundman commented Jun 10, 2020

If it helps:

abd_alloc_scatter_offset_chunkcnt: abd 0xffffff90ca886080 chunkcnt 4 abd_size 128
abd_free_struct: abd 0xffffff90ca886080 ->abd_size 8192 chunkcnt 2 size 120
VERIFY3(size == abd->abd_orig_size) failed (120 == 128)

@bwatkinson
Copy link
Contributor Author

bwatkinson commented Jun 10, 2020

If it helps:

abd_alloc_scatter_offset_chunkcnt: abd 0xffffff90ca886080 chunkcnt 4 abd_size 128
abd_free_struct: abd 0xffffff90ca886080 ->abd_size 8192 chunkcnt 2 size 120
VERIFY3(size == abd->abd_orig_size) failed (120 == 128)

What is strange to be about this is somehow on OSX the ABD's abd_size seems to be changing for a scatter ABD... I was hoping this patch might help you out OSX as well. Is there any major differences in the OSX abd_get_offset_scatter() implementation? The interesting thing is the point of this patch is handle the event of abd_struct_alloc() being passed 0 for the size. It should not affect anything outside of gang/linear ABD allocations.

Also, I wound up pushing up a change to abd_free_struct() after you modified your code. I needed to make sure a gang ABD sets chunkcnt to 0.

In the event we are allocating a gang ABD in FreeBSD we are passsing 0
to abd_alloc_struct(); however, this led to an allocation of ABD scatter
with 0 chunks. This left the gang ABD allocation 24 bytes smaller than
it should have been.

Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
@lundman
Copy link
Contributor

lundman commented Jun 10, 2020

I copied over the freebsd abd_os.c - deleted the -SYSCTL_DECL lines, added the checks for abd_orig_size.

abd_get_offset_scatter() is in global abd.c and untouched by me.

From the logs I pasted, it seems that abd_alloc_scatter_offset_chunkcnt() is called with chunkcnt = 4 but the abd_free_struct() calculates size using chunkcnt 2.

@codecov
Copy link

codecov bot commented Jun 10, 2020

Codecov Report

Merging #10431 into master will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #10431   +/-   ##
=======================================
  Coverage   79.54%   79.55%           
=======================================
  Files         391      391           
  Lines      123866   123866           
=======================================
+ Hits        98531    98543   +12     
+ Misses      25335    25323   -12     
Flag Coverage Δ
#kernel 79.95% <ø> (+0.05%) ⬆️
#user 65.93% <ø> (+0.22%) ⬆️
Impacted Files Coverage Δ
lib/libzutil/zutil_pool.c 38.63% <0.00%> (-54.55%) ⬇️
module/lua/lmem.c 83.33% <0.00%> (-4.17%) ⬇️
module/zfs/zio_compress.c 89.74% <0.00%> (-2.57%) ⬇️
module/icp/api/kcf_mac.c 36.57% <0.00%> (-2.29%) ⬇️
cmd/zed/agents/fmd_api.c 88.61% <0.00%> (-1.78%) ⬇️
module/zfs/zil.c 91.90% <0.00%> (-1.22%) ⬇️
cmd/zed/agents/zfs_diagnosis.c 77.55% <0.00%> (-1.17%) ⬇️
module/zcommon/zfs_uio.c 87.75% <0.00%> (-1.03%) ⬇️
cmd/zdb/zdb.c 81.84% <0.00%> (-0.86%) ⬇️
module/zfs/vdev_mirror.c 95.14% <0.00%> (-0.75%) ⬇️
... and 49 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bcb7f0...e740557. Read the comment docs.

@bwatkinson
Copy link
Contributor Author

I copied over the freebsd abd_os.c - deleted the -SYSCTL_DECL lines, added the checks for abd_orig_size.

abd_get_offset_scatter() is in global abd.c and untouched by me.

From the logs I pasted, it seems that abd_alloc_scatter_offset_chunkcnt() is called with chunkcnt = 4 but the abd_free_struct() calculates size using chunkcnt 2.

So each of the abd_os.c files has their own definition of abd_get_offset_scatter() and this is not defined in the common abd.c file. I had to create abd_alloc_scatter_offset_cunkcnt() for FreeBSD due to the math for calculating chunkcnt in abd_get_offset_scatter(). My inclination is there might be an issue when calculating the chunkcnt in OSX in abd_get_offset_scatter()? This is definitely strange though.

@mattmacy
Copy link
Contributor

@bwatkinson maybe we can go ahead and fix the FreeBSD bug and file a PR for the OSX one?

@bwatkinson
Copy link
Contributor Author

@bwatkinson maybe we can go ahead and fix the FreeBSD bug and file a PR for the OSX one?

@mattmacy I agree. This fixes the FreeBSD issue as it was intended to do.

@lundman, it seems that the issue you are seeing in OSX has to do with scatter ABDs, so this PR is not related to that issue. We can create another PR for that, and I can assist you in tracing down the issue in OSX separate from this PR.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 16, 2020
@behlendorf behlendorf merged commit 0a03495 into openzfs:master Jun 16, 2020
@lundman
Copy link
Contributor

lundman commented Jun 16, 2020

That sounds great.

lundman referenced this pull request in openzfsonosx/openzfs Jun 19, 2020
In the event we are allocating a gang ABD in FreeBSD we are passing 0
to abd_alloc_struct(); however, this led to an allocation of ABD scatter
with 0 chunks. This left the gang ABD allocation 24 bytes smaller than
it should have been.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #10431
@bwatkinson bwatkinson deleted the abd_freebsd_alloc branch June 25, 2020 15:52
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
In the event we are allocating a gang ABD in FreeBSD we are passing 0
to abd_alloc_struct(); however, this led to an allocation of ABD scatter
with 0 chunks. This left the gang ABD allocation 24 bytes smaller than
it should have been.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes openzfs#10431
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
In the event we are allocating a gang ABD in FreeBSD we are passing 0
to abd_alloc_struct(); however, this led to an allocation of ABD scatter
with 0 chunks. This left the gang ABD allocation 24 bytes smaller than
it should have been.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes openzfs#10431
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants