HIGHMEM violation by kmap in zfs_uiomove_bvec_impl #15668
Comments
Nice find. At a glance, this looks serious and might explain some other bug reports.
Thanks, though it's really the OSS team who "caught it" with their most recent tightening up of memory access in the kernel. So far I haven't come up with "anything smarter" than ensuring that macro tests the sizes, at least in debug builds, so we can find out which callers are doing this. I don't know if there's any safer/smarter way than chunking those
For some more context, the kmap_* API only maps a single page at a time, but the offsets being used for some accesses to those maps exceeded the page boundary. On x64 and others that don't use HIGHMEM, you simply get back the lowmem address, so as long as the memory is physically contiguous, things work even if it's wrong per the API. On i386 and other archs making use of HIGHMEM, it would instead be accessing some other kmap and use the wrong data for reading/writing.
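To make the single-page constraint concrete, here is a minimal userspace sketch of a copy that is chunked at page boundaries. It is not the ZFS or grsecurity fix: `map_one_page` and `copy_from_pages` are hypothetical names, the page array stands in for separately mapped highmem pages, and a 4 KiB page size is assumed.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((size_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical stand-in for mapping exactly one page (as kmap_* does). */
static unsigned char *map_one_page(unsigned char **pages, size_t idx)
{
    return pages[idx];
}

/*
 * Copy len bytes starting at byte offset off into dst. The offset may
 * exceed PAGE_SIZE; the pages need not be contiguous. Each iteration
 * touches only the page it has mapped.
 */
static void copy_from_pages(void *dst, unsigned char **pages,
                            size_t off, size_t len)
{
    unsigned char *out = dst;

    while (len > 0) {
        size_t idx = off >> PAGE_SHIFT;      /* page the offset lands in */
        size_t in_page = off & ~PAGE_MASK;   /* offset inside that page */
        size_t chunk = PAGE_SIZE - in_page;  /* bytes before the boundary */

        if (chunk > len)
            chunk = len;

        memcpy(out, map_one_page(pages, idx) + in_page, chunk);
        out += chunk;
        off += chunk;
        len -= chunk;
    }
}
```

Because every `memcpy` stops at the end of the mapped page, the result does not depend on the pages being adjacent in memory, which is the property that a lowmem alias hides on x86_64.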
That explains why things are not horribly broken on amd64, despite this making things look horribly broken to me.
I've been looking into this a bit today. It's not been easy to confirm. @sempervictus, can you tell me more about this "hacked together in a few minutes fix"? Since I don't have any hardware that would naturally run into this (or I'd know about it!) and I don't have access to the grsec patches, any change I put in is going to be guesswork. I've got a small patch that adds asserts to catch uses of mapped areas that run past a single page, and they are trivially easy to trip, which suggests to me that they're not really representative of the problem. I'd be surprised if it was that easy, because there aren't widespread reports of corruption. Unless it's actually "wrong" by the API but also basically never happens in practice. Any and all info appreciated.
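For illustration only, a check of the kind described might look like the sketch below. The names `span_within_page` and `ASSERT_SPAN_WITHIN_PAGE` are hypothetical and not taken from the patch mentioned above; a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((size_t)1 << PAGE_SHIFT)

/*
 * Hypothetical helper: true only if the byte range [off, off + len)
 * lies entirely inside the single page that was mapped.
 */
static bool span_within_page(size_t off, size_t len)
{
    /* Written to avoid overflow in off + len. */
    return off < PAGE_SIZE && len <= PAGE_SIZE - off;
}

/* Hypothetical debug-only wrapper; compiled out when NDEBUG is defined. */
#define ASSERT_SPAN_WITHIN_PAGE(off, len) \
    assert(span_within_page((off), (len)))
```

A caller would place `ASSERT_SPAN_WITHIN_PAGE(off, len)` immediately before touching the mapped address, so a debug build reports which call sites pass an offset or length that crosses the page boundary.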
It's possible that if you convert the ZFS aliases for kmap_atomic / kunmap_atomic to kmap_local_page / kunmap_local and then enable CONFIG_DEBUG_KMAP_LOCAL, CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP, and CONFIG_DEBUG_HIGHMEM, the issue will then be reproducible with a vanilla kernel. The second option in particular will force the map similar to real highmem systems, so the lowmem alias can't mask the issue. BTW, the relevant documentation is in Documentation/mm/highmem.rst.
@bspengler-oss thanks, I'll give that a try sometime. Quick check looks like it all requires
Yeah, this sort of thing is why I said "hard to confirm but probably", because "page" is an overloaded term. But I believe you all, so it's good enough for me at this point.
System information
Describe the problem you're observing
Upstream Linux doesn't guard memory nearly as well as Grsecurity, "allowing" `kmap`ing `bv_offset + skip` > `PAGE_SIZE` under x86_64; this will likely crash under x86, however. The "hacked together in a few minutes fix" (by much better hackers) currently keeping my system alive changes the `kmap` and `memcpy`s to use `PAGE_MASK` magic to ensure we don't violate the boundary, but there should probably be a zfs-correct solution to this situation, as we're occasionally performing very low level memory operations "illegally" in kernels without enforcement for the constraint, which will eventually occur (and may have impacts on other architectures).

Given that this bug was found in UIO memory management/movement code, I'm curious as to whether it poses any risk to data integrity if actually tripping a fault (and not crashing but losing an IO or segment thereof) under other runtime conditions, since the read-back could similarly fail to `kmap` incorrectly.

Describe how to reproduce the problem
`kmap` constraint for `PAGE_SIZE`d map targets (or write an assertion into the `zfs_kmap*` macro to do so).
`send`/`recv` on the same system with heavily fragmented ZVOLs (seems to be what tripped mine) or run a bunch of import/export passes for a pool containing such ZVOLs.

Include any warning/errors/backtraces from the system logs