Possible copy_file_range issue with OpenZFS 2.2.3 and Kernel 6.8-rc5 #15930

Closed
satmandu opened this issue Feb 24, 2024 · 9 comments · Fixed by #15931
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@satmandu
Contributor

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Ubuntu |
| Distribution Version | Mantic/23.10 |
| Kernel Version | 6.8.0-rc5 |
| Architecture | x86_64 |
| OpenZFS Version | 2.2.3 |

Describe the problem you're observing

I'm getting a copy_file_range error when copying text files, but not binary files, using a Ruby command that calls copy_file_range.

I've opened ruby/fileutils#118, but this might be a ZFS issue, since kernel changes ought not to affect userspace.

Describe how to reproduce the problem

This occurs in a Docker container. Here is an example container in which I can reproduce the error, running on an x86_64 host with a 6.8-rc5 kernel and OpenZFS 2.2.3:

docker run --platform linux/amd64 --rm --net=host  -v $(pwd):/output -h $(hostname)-x86_64 -it satmandu/crewbuild:nocturne-x86_64.m90

Example commands inside that container that cause the issue:

chronos@mbp113-x86_64 /usr/local/lib/crew/packages $ cd /tmp
chronos@mbp113-x86_64 /tmp $ echo 'test' > test
chronos@mbp113-x86_64 /tmp $ irb
irb(main):001> require 'fileutils'
=> false
irb(main):002> FileUtils.install 'test', 'dira/'
/usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2280:in `copy_stream': Operation not supported - copy_file_range (Errno::ENOTSUP)
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2280:in `block (2 levels) in copy_file'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2279:in `open'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2279:in `block in copy_file'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2278:in `open'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2278:in `copy_file'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:1078:in `copy_file'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:1629:in `block in install'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2469:in `block in fu_each_src_dest'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2485:in `fu_each_src_dest0'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:2467:in `fu_each_src_dest'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/fileutils-1.7.2/lib/fileutils.rb:1623:in `install'
        from (irb):2:in `<main>'
        from <internal:kernel>:187:in `loop'
        from /usr/local/lib64/ruby/gems/3.3.0/gems/irb-1.11.0/exe/irb:9:in `<top (required)>'
        from /usr/local/bin/irb:25:in `load'
        from /usr/local/bin/irb:25:in `<main>'
irb(main):003>  

This problem goes away if I boot into kernel 6.7.5.

Include any warning/errors/backtraces from the system logs

No other logs show this issue. It doesn't occur if I copy a non-text file.
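To check whether the failure is specific to Ruby, the same syscall can be exercised directly. A minimal sketch, assuming Python 3.8+ on Linux (where `os.copy_file_range` exists); `probe_copy_file_range` and the file names are hypothetical and not part of the original report:

```python
import errno
import os
import tempfile


def probe_copy_file_range(directory):
    """Try one copy_file_range() call between two small files in `directory`.

    Returns None on success, or the errno of the failure.
    """
    copy_file_range = getattr(os, "copy_file_range", None)
    if copy_file_range is None:
        # Python older than 3.8 or a non-Linux platform: nothing to probe.
        return errno.ENOSYS

    payload = b"test\n"
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src_path = os.path.join(tmp, "src")
        dst_path = os.path.join(tmp, "dst")
        with open(src_path, "wb") as f:
            f.write(payload)
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            try:
                copy_file_range(src.fileno(), dst.fileno(), len(payload))
            except OSError as e:
                return e.errno
    return None


if __name__ == "__main__":
    # Replace the directory with a path on the dataset under test.
    result = probe_copy_file_range(tempfile.gettempdir())
    if result is None:
        print("copy_file_range: ok")
    else:
        print("copy_file_range failed:", errno.errorcode.get(result, result))
```

The probe only tells you whether one small copy succeeds in the given directory. The thread does not confirm that a file this small triggers the error on an affected system.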

@satmandu satmandu added the Type: Defect Incorrect behavior (e.g. crash, hang) label Feb 24, 2024
@satmandu
Contributor Author

Looks like apfs-dkms on Ubuntu had to change how it handles copy_file_range to get it working on kernel 6.8: https://bugs.launchpad.net/ubuntu/+source/linux-apfs-rw/+bug/2054682

@robn
Member

robn commented Feb 24, 2024

Thanks for reporting this, and thanks for the pointer. Can confirm that torvalds/linux@705bcfcbde38 removed generic_copy_file_range. And indeed, OpenZFS configure notices:

checking whether fops->copy_file_range() is available... yes
checking whether generic_copy_file_range() is available... no
checking whether fops->remap_file_range() is available... yes
checking whether fops->clone_file_range() is available... no
checking whether fops->dedupe_file_range() is available... no

Without it, it assumes it's on a pre-5.3 kernel, and returns ENOTSUPP instead of calling a fallback function, leading to what you're seeing.

The solution is pretty simple: also detect splice_copy_file_range, and use it if necessary. I should have time to do that in the next day or two.
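The selection logic described above can be modelled roughly as follows. This is an illustrative Python model only, not OpenZFS code (the real logic is C guarded by configure results); the function name and its boolean parameters are hypothetical:

```python
def choose_fallback(have_generic_copy_file_range, have_splice_copy_file_range):
    """Pick the kernel helper to use when a plain copy is needed.

    The two booleans stand in for the configure checks quoted above.
    Returns the helper's name, or None when no helper was detected and
    the caller can only return an error to userspace.
    """
    if have_generic_copy_file_range:
        # Detected on kernels that still provide generic_copy_file_range().
        return "generic_copy_file_range"
    if have_splice_copy_file_range:
        # Proposed fix: also detect splice_copy_file_range and use it.
        return "splice_copy_file_range"
    # Neither detected: treated like a pre-5.3 kernel, so no fallback.
    return None


# On 6.8 the generic_ check answers "no", so only the second branch helps.
print(choose_fallback(False, True))
```

Before the fix, only the first check existed in this model, which is why a 6.8 kernel ended up on the no-fallback path.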

@robn
Member

robn commented Feb 24, 2024

I chose to take this distraction instead of cleaning my office 😆

@satmandu
Contributor Author

#15931 does fix my issue when applied to and refreshed against OpenZFS 2.2.3.

@darkbasic

You can hit this very same issue with Proton as well:

ProtonLauncher[96033] INFO: empty STEAM_COMPAT_CLIENT_INSTALL_PATH set to /home/niko/.local/share/Steam
ProtonLauncher[96033] INFO: empty STEAM_COMPAT_DATA_PATH set to /home/niko/.local/share/proton-pfx/0
ProtonLauncher[96033] INFO: directory /home/niko/.local/share/proton-pfx/0 created
ProtonLauncher[96033] INFO: empty DXVK_STATE_CACHE_PATH set to /home/niko/.cache/dxvk-cache-pool
ProtonLauncher[96033] INFO: directory /home/niko/.cache/dxvk-cache-pool created
Proton: Upgrading prefix from None to GE-Proton9-1 (/home/niko/.local/share/proton-pfx/0/)
Traceback (most recent call last):
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 1575, in <module>
    g_session.init_session(sys.argv[1] != "runinprefix")
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 1458, in init_session
    g_compatdata.setup_prefix()
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 790, in setup_prefix
    self.copy_pfx()
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 674, in copy_pfx
    self.pfx_copy(src_file, dst_file)
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 653, in pfx_copy
    try_copyfile(src, dst)
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 253, in try_copyfile
    copyfile(src, dst)
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 233, in copyfile_reflink
    raise e
  File "/usr/share/steam/compatibilitytools.d/proton-ge-custom/proton", line 230, in copyfile_reflink
    bytes_to_copy -= copy_file_range(src.fileno(), dst.fileno(), bytes_to_copy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 95] Operation not supported
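Until a fixed module is installed, an application could sidestep the error by falling back to an ordinary read/write copy. A minimal sketch, assuming Python 3.8+; `copy_with_fallback`, `_kernel_copy` and `FALLBACK_ERRNOS` are hypothetical names, and this is not Proton's code:

```python
import errno
import os
import shutil

# errno values treated as "copy_file_range is unusable here". The choice of
# set is an assumption of this sketch; errno 95 on Linux is EOPNOTSUPP.
FALLBACK_ERRNOS = {errno.EOPNOTSUPP, errno.ENOSYS, errno.EXDEV, errno.EINVAL}


def _kernel_copy(src_path, dst_path):
    """Copy the whole file with os.copy_file_range(); raises OSError on failure."""
    if not hasattr(os, "copy_file_range"):
        raise OSError(errno.ENOSYS, "os.copy_file_range is unavailable")
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        while remaining > 0:
            copied = os.copy_file_range(src.fileno(), dst.fileno(), remaining)
            if copied == 0:
                break  # source ended earlier than its reported size
            remaining -= copied


def copy_with_fallback(src_path, dst_path):
    """Prefer copy_file_range(); fall back to a read/write loop if it fails.

    Returns a short label naming the path that was taken.
    """
    try:
        _kernel_copy(src_path, dst_path)
        return "copy_file_range"
    except OSError as e:
        if e.errno not in FALLBACK_ERRNOS:
            raise
    # Reopening with "wb" truncates any partial output from the first attempt.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return "read/write"
```

The fallback gives up any block cloning or in-kernel copy benefit, so it is only a stopgap for affected kernel and module combinations.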

@BananchickPasha

Is this the same issue? My JetBrains IDE stopped working after updating to 6.8:


        Caused by: java.io.IOException: Cannot save /home/banana/.config/JetBrains/Rider2024.1/options/window.state.xml.
Unable to create a backup file (window.state.xml~).
The file left unchanged.
                at com.intellij.util.io.SafeFileOutputStream.waitForBackup(SafeFileOutputStream.java:144)
                at com.intellij.util.io.SafeFileOutputStream.close(SafeFileOutputStream.java:124)
                at java.base/sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:439)
                at java.base/sun.nio.cs.StreamEncoder.lockedClose(StreamEncoder.java:237)
                at java.base/sun.nio.cs.StreamEncoder.close(StreamEncoder.java:222)
                at java.base/java.io.OutputStreamWriter.close(OutputStreamWriter.java:266)
                at java.base/java.io.BufferedWriter.implClose(BufferedWriter.java:398)
                at java.base/java.io.BufferedWriter.close(BufferedWriter.java:380)
                at kotlin.io.CloseableKt.closeFinally(Closeable.kt:56)
                at com.intellij.configurationStore.StringDataWriter.writeTo(DataWriter.kt:50)
                at com.intellij.configurationStore.DataWriter.writeTo$default(DataWriter.kt:17)
                at com.intellij.configurationStore.DataWriter.writeTo(DataWriter.kt:28)
                at com.intellij.configurationStore.FileBasedStorageKt.writeFile(FileBasedStorage.kt:339)
                ... 25 more
        Caused by: java.io.IOException: Operation not supported
                at java.base/sun.nio.fs.LinuxNativeDispatcher.directCopy0(Native Method)
                at java.base/sun.nio.fs.LinuxFileSystem.directCopy(LinuxFileSystem.java:159)
                at java.base/sun.nio.fs.UnixFileSystem.copyFile(UnixFileSystem.java:682)
                at java.base/sun.nio.fs.UnixFileSystem.copy(UnixFileSystem.java:1060)
                at java.base/sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:300)
                at java.base/java.nio.file.Files.copy(Files.java:1304)
                at com.intellij.util.io.SafeFileOutputStream.backup(SafeFileOutputStream.java:69)
                at com.intellij.util.concurrency.ContextCallable.call(ContextCallable.java:32)
                at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
                at com.intellij.util.concurrency.ContextRunnable.run(ContextRunnable.java:27)
                at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
                at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
                at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:735)
                at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:732)
                at java.base/java.security.AccessController.doPrivileged(AccessController.java:400)
                at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(Executors.java:732)
                at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: Cannot save /home/banana/.config/JetBrains/Rider2024.1/options/usage.statistics.xml.
Unable to create a backup file (usage.statistics.xml~).
The file left unchanged.
        at com.intellij.util.io.SafeFileOutputStream.waitForBackup(SafeFileOutputStream.java:144)
        at com.intellij.util.io.SafeFileOutputStream.close(SafeFileOutputStream.java:124)
        at java.base/sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:439)
        at java.base/sun.nio.cs.StreamEncoder.lockedClose(StreamEncoder.java:237)
        at java.base/sun.nio.cs.StreamEncoder.close(StreamEncoder.java:222)
        at java.base/java.io.OutputStreamWriter.close(OutputStreamWriter.java:266)
        at java.base/java.io.BufferedWriter.implClose(BufferedWriter.java:398)
        at java.base/java.io.BufferedWriter.close(BufferedWriter.java:380)
        at kotlin.io.CloseableKt.closeFinally(Closeable.kt:56)
        at com.intellij.configurationStore.StringDataWriter.writeTo(DataWriter.kt:50)
        at com.intellij.configurationStore.DataWriter.writeTo$default(DataWriter.kt:17)
        at com.intellij.configurationStore.DataWriter.writeTo(DataWriter.kt:28)
        at com.intellij.configurationStore.FileBasedStorageKt.writeFile(FileBasedStorage.kt:339)
        ... 25 more
Caused by: java.io.IOException: Operation not supported
        at java.base/sun.nio.fs.LinuxNativeDispatcher.directCopy0(Native Method)
        at java.base/sun.nio.fs.LinuxFileSystem.directCopy(LinuxFileSystem.java:159)
        at java.base/sun.nio.fs.UnixFileSystem.copyFile(UnixFileSystem.java:682)
        at java.base/sun.nio.fs.UnixFileSystem.copy(UnixFileSystem.java:1060)
        at java.base/sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:300)
        at java.base/java.nio.file.Files.copy(Files.java:1304)
        at com.intellij.util.io.SafeFileOutputStream.backup(SafeFileOutputStream.java:69)
        at com.intellij.util.concurrency.ContextCallable.call(ContextCallable.java:32)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at com.intellij.util.concurrency.ContextRunnable.run(ContextRunnable.java:27)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:735)
        at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:732)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:400)
        at java.base/java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(Executors.java:732)
        at java.base/java.lang.Thread.run(Thread.java:1583)

@robn
Member

robn commented Mar 17, 2024

Looks like it.

@DigitalDJ

DigitalDJ commented Mar 18, 2024

I've just run up against this while trying 24.04 (noble) devel and trying to set up Root on ZFS. 24.04 is due to be released sometime next month, and it currently ships zfs-2.2.2 with kernel 6.8.

Specifically, systemd-sysusers fails to create users with the error:

Failed to backup /etc/{group,passwd}: Operation not supported

A quick strace shows it is indeed copy_file_range throwing the error.

Bit of a pain: any dpkg script that runs systemd-sysusers ends up failing and leaves the package in a broken state :(

@robn
Member

robn commented Mar 18, 2024

You should probably report that to Ubuntu then.

behlendorf pushed a commit that referenced this issue Mar 20, 2024
Linux 6.8 removes generic_copy_file_range(), which had been reduced to a
simple wrapper around splice_copy_file_range(). Detect that function
directly and use it if generic_ is not available.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15930 
Closes #15931
robn added a commit to robn/zfs that referenced this issue Mar 21, 2024
(cherry picked from commit ef08a4d)
behlendorf pushed a commit that referenced this issue Mar 21, 2024
(cherry picked from commit ef08a4d)
stgraber pushed a commit to zabbly/zfs that referenced this issue Mar 28, 2024
(cherry picked from commit ef08a4d)
stgraber pushed a commit to zabbly/zfs that referenced this issue May 1, 2024
(cherry picked from commit ef08a4d)
lundman pushed a commit to openzfsonwindows/openzfs that referenced this issue Sep 4, 2024