Attempted to advance past end of bvec iter #12041
The same just happened with up-to-date master (zfs-9999 & zfs-kmod-9999 ebuilds on Gentoo) and a 5.12.4-based kernel on a ZFS root:
zfs-kmod-9999:
according to:
This sounds pretty serious in terms of filesystem usage; maybe raise the severity of the issue?
No ZFS filesystem encryption in use; the only pool mounted was the root pool.
This doesn't happen on a 5.11.21-based kernel, by the way, and there are no related error messages in the dmesg output. So it seems to occur only on 5.12 (or beyond).
@risto42 you're trying to build ZFS 2.0.4 against an unsupported kernel version, btw:
https://github.com/openzfs/zfs/releases/tag/zfs-2.0.4 One might assume that 5.12 is supported according to those changes, but if I recall correctly there were additional changes for 5.12 in the master branch.
Normally some patches for 5.12 would have been pulled in...
Thanks for the link! For reference: those are the patches, so it seems additional changes are needed.
Also reported here (#11967 (comment) ff.) by several people, including myself. Seems harmless to me so far, but should still be fixed.
Happening here as well with NFS4 server-side copy:
Nothing encrypted here either. 2.1.0_rc5 on kernel 5.12.4.
I have the same issue. It seems that wine triggers this consistently. It doesn't happen on 5.11.
It's independent of wine usage. I'm running ZFS on my root partition, it was the only zpool mounted during bootup with 5.12.4, and I still got it. Did you get several occurrences of that message in dmesg?
No, only once per boot. But I can use the array for hours (copy, delete, snapshots, scrub) without error, and then seconds after starting wine (actually it's Proton with a specific big prefix with a game) it occurs. I might have missed an occurrence (I thought it didn't always happen, i.e. I ran the same game and didn't notice this), but now it seems to be a 100% chance. I'm not actively running 5.12 now because of this, only testing, because I think in one instance after this and a reboot I had a data rollback: new data written after triggering this was gone, but the pool wasn't corrupted and a scrub completed without errors. After testing this situation again, though, the new data was written.
I'm seeing it fairly quickly after booting:
Not sure what triggers it yet, nor whether there are any other side effects.
It happened to me today as well, and when I check the journal I see it 21 times since 5 May. Looks like it happens to me almost once a day. I have not noticed any negative side effects yet.
This is happening for me on the latest git tip, currently 0989d79, on 5.12. Further debugging would likely take instrumenting this code somehow (maybe bpftrace can do it these days) to examine the difference between what ZFS is asking for in the advance and how much the bvec iter actually says it contains, and then determining what's happening to create that difference.
Agreed, I had the same assumption. There is one commit directly mentioning iov_iter_advance() related to iter_read: 9ac535e "Remove iov_iter_advance() from iter_read". I wonder if it's potentially related to this message and behavior in any way.
I think that this is probably exactly the write side of 9ac535e, aka pull #11378. I added instrumentation, and it constantly logs messages from the first side of the condition (my usual test is to build Go from source):
I haven't seen a message from the second case yet. I think this means that something in the code is advancing the iterator after copy_from_iter() has already done so.
I noticed this for the first time today and went back through the logs looking for it as well: one instance back on the 19th, when I would have still been running 5.12.3, and three in the last two days running on 5.12.4. This is consistent with once per boot, as I really only reboot for kernel upgrades, and if I cross-reference the journalctl errors with the boot times they match up pretty closely. Checking my laptop, which runs the same kernel+zfs builds, the same error has been occurring unnoticed there as well, starting back on the 8th, at which point it would have been running 5.12.3 + 2.1.0-rc4. (Distribution: Ubuntu)
The Linux 5.11 kernel is EOL already (5.11.22), so from a security standpoint it would be nice to be able to upgrade to 5.12 kernels with the knowledge and certainty that there are no data-jeopardising issues lying somewhere. (The ZFS test runs on master didn't show any issues so far, I assume, so it should be safe, with a bit of a weird lingering after-taste.)
Easy way to reliably trigger this:
It seems like this should have gone with openzfs#11378.

Closes: openzfs#12041
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
I wonder if the change from #12155, "Remove iov_iter_advance() for iter_write", would output "GOOD zpl_iter_write: wrote [...]".
It seems like this should have gone with openzfs#11378. The additional iter advance is incorrect, as copy_from_iter() has already done the right thing.

Closes: openzfs#12041
Suggested-by: @siebenmann
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
@rincebrain your patch seems to solve this for me.
Confirmed: no more occurrences of the error message/warning in the kernel logs.
I appreciate the confirmation, but this was absolutely all @siebenmann's research; I just did the couple of line deletions he suggested.
Thanks go to all involved: @rincebrain, @siebenmann and @kernelOfTruth. I have tested this on kernel 5.12.8 and today on 5.13-rc4, and both seem to work fine without warnings. I also tried the usual applications, file operations and a scrub; everything went fine. I think this could be merged.
The additional iter advance is incorrect, as copy_from_iter() has already done the right thing. This will result in the following warning being printed to the console as of the 5.12 kernel:

    Attempted to advance past end of bvec iter

This change should have been included with #11378 when a similar change was made on the read side.

Suggested-by: @siebenmann
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Issue #11378
Closes #12041
Closes #12155
(cherry picked from commit 3f81aba)
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
For what it's worth, emerge --sync triggers this reliably for me :-) which is probably why some Gentoo users noticed this. Was there ever any risk of corruption, or was it harmless?
Happened twice today while copying large files.
You mean despite the fix being applied?
Which version of the kernel module are you using? I haven't seen this so far, up to 5.12.8.
Just got this with
Could this have caused a ZFS-8000-8A? I had one file marked as corrupted. The devices in the pool all had 0 errors, reading the entire file did not result in any I/O errors, and I believe the file was not corrupt at all (not certain on this). I removed the file, but the error state remained until I ran a scrub. I don't know what would have happened if I had run the scrub with the file still present. The only error I found in the kernel logs is the one from this issue. I am not familiar enough with ZFS to know whether the described situation can arise under normal conditions, but it does not make sense to me.
System information

Type | Version/Name
--- | ---
Distribution Name | Archlinux
Distribution Version |
Linux Kernel | 5.12.3 (Linux version 5.12.3-arch1-1 (linux@archlinux) (gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Wed, 12 May 2021 17:54:18 +0000)
Architecture | x86_64
ZFS Version | 2.0.4-1
SPL Version | 2.0.4-1
Describe the problem you're observing
After updating, now that zfs-dkms/zfs-utils 2.0.4-2 can build on 5.12.x, I noticed the traceback below in my logs.
Describe how to reproduce the problem
Not sure.
Include any warning/errors/backtraces from the system logs