bpo-33671: efficient zero-copy for shutil.copy* functions (Linux, OSX and Win) #7160

Merged
merged 114 commits into from
Jun 12, 2018
Changes from 70 commits
Commits
1a72c01
have shutil.copyfileobj use sendfile() if possible
giampaolo May 22, 2018
77c4bfa
refactoring: use ctx manager
giampaolo May 22, 2018
2afa04a
add test with non-regular file obj
giampaolo May 22, 2018
542cd17
emulate case where file size can't be determined
giampaolo May 22, 2018
3520c6c
reference _copyfileobj_sendfile directly
giampaolo May 22, 2018
050a722
add test for offset() at certain position
giampaolo May 22, 2018
c1fd38a
add test for empty file
giampaolo May 22, 2018
2ab6317
add test for non regular file dst
giampaolo May 22, 2018
dacc3b6
small refactoring
giampaolo May 22, 2018
29d5881
leave copyfileobj() alone in order to not introduce any incompatibility
giampaolo May 24, 2018
114c4de
minor refactoring
giampaolo May 24, 2018
501c0dd
remove old test
giampaolo May 24, 2018
41b4506
update docstring
giampaolo May 24, 2018
fdb0973
update docstring; rename exception class
giampaolo May 24, 2018
64d2bc5
detect platforms which only support file to socket zero copy
giampaolo May 24, 2018
3a3c8ef
don't run test on platforms where file-to-file zero copy is not suppo…
giampaolo May 24, 2018
7861737
use tempfiles
giampaolo May 24, 2018
f3eecfd
reset verbosity
giampaolo May 24, 2018
f67ce57
add test for smaller chunks
giampaolo May 24, 2018
d457254
add big file size test
giampaolo May 24, 2018
8eb211d
add comment
giampaolo May 24, 2018
a0fe703
update doc
giampaolo May 24, 2018
7296147
update whatsnew doc
giampaolo May 24, 2018
d0c3bba
update doc
giampaolo May 24, 2018
2cafd80
catch Exception
giampaolo May 24, 2018
bb2a75f
remove unused import
giampaolo May 24, 2018
e5025dc
add test case for error on second sendfile() call
giampaolo May 24, 2018
a36a534
turn docstring into comment
giampaolo May 24, 2018
e9da3fa
add one more test
giampaolo May 24, 2018
9fcc2e7
update comment
giampaolo May 24, 2018
4f32242
add Misc/NEWS entry
giampaolo May 24, 2018
24ad25a
get rid of COPY_BUFSIZE; it belongs to another PR
giampaolo May 25, 2018
24d20e6
update doc
giampaolo May 25, 2018
7b6e576
expose posix._fcopyfile() for OSX
giampaolo May 27, 2018
b82ddc9
Merge branch 'master' into shutil-osx-copyfile
giampaolo May 27, 2018
b62b61e
merge from linux branch
giampaolo May 27, 2018
34e9618
merge from linux branch
giampaolo May 27, 2018
6b20902
expose fcopyfile
giampaolo May 27, 2018
abf3ecb
arg clinic for the win implementation
giampaolo May 28, 2018
91e492c
convert path type to path_t
giampaolo May 28, 2018
e02c69d
expose CopyFileW
giampaolo May 28, 2018
73837e2
fix windows tests
giampaolo May 28, 2018
28be4c1
release GIL
giampaolo May 28, 2018
6c59adf
minor refactoring
giampaolo May 28, 2018
700629d
update doc
giampaolo May 28, 2018
077912e
update comment
giampaolo May 28, 2018
62c6568
update docstrings
giampaolo May 28, 2018
a40a755
rename functions
giampaolo May 28, 2018
7ba0085
rename test classes
giampaolo May 28, 2018
6c96d97
update doc
giampaolo May 28, 2018
80fbe6e
update doc
giampaolo May 28, 2018
fdf4bcb
update docstrings and comments
giampaolo May 28, 2018
185f130
avoid do import nt|posix modules if unnecessary
giampaolo May 28, 2018
c8c98ae
set nt|posix modules to None if not available
giampaolo May 28, 2018
17bb5e6
micro speedup
giampaolo May 28, 2018
d8b9bf9
update description
giampaolo May 28, 2018
b59ac57
add doc note
giampaolo May 28, 2018
8eefce7
use better wording in doc
giampaolo May 29, 2018
4fc8c6b
Merge branch 'master' into shutil-zero-copy
giampaolo May 30, 2018
3048e3d
rename function using 'fastcopy' prefix instead of 'zerocopy'
giampaolo May 30, 2018
11102e1
use :ref: in rst doc
giampaolo May 30, 2018
7545273
change wording in doc
giampaolo May 30, 2018
3261b74
add test to make sure sendfile() doesn't get called aymore in case it…
giampaolo May 30, 2018
51c476d
move CopyFileW in _winapi and actually expose CopyFileExW instead
giampaolo May 30, 2018
729dd23
fix line endings
giampaolo May 30, 2018
1823828
add tests for mode bits
giampaolo May 30, 2018
a9d6a07
add docstring
giampaolo May 30, 2018
e3ce917
remove test file mode class; let's keep it for later when Istart addr…
giampaolo May 30, 2018
f81a0ec
update doc to reflect new changes
giampaolo May 30, 2018
3e7475b
update doc
giampaolo May 30, 2018
05dd3cf
adjust tests on win
giampaolo May 31, 2018
9b54930
fix argument clinic error
giampaolo May 31, 2018
2bec11c
update doc
giampaolo May 31, 2018
c87648f
OSX: expose copyfile(3) instead of fcopyfile(3); also expose flags ar…
giampaolo May 31, 2018
941f740
osx / copyfile: use path_t instead of char
giampaolo May 31, 2018
4d28c12
do not set dst name in the OSError exception in order to remain consi…
giampaolo May 31, 2018
2149b8b
add same file test
giampaolo May 31, 2018
6a02a2a
add test for same file
giampaolo May 31, 2018
2287508
have osx copyfile() pre-emptively check if src and dst are the same, …
giampaolo May 31, 2018
b9da5d5
turn PermissionError into appropriate SameFileError
giampaolo May 31, 2018
c921f46
expose ERROR_SHARING_VIOLATION in order to raise more appropriate Sam…
giampaolo May 31, 2018
bb24490
honour follow_symlinks arg when using CopyFileEx
giampaolo May 31, 2018
fef8b32
update Misc/NEWS
giampaolo May 31, 2018
71be453
expose CreateDirectoryEx mock
giampaolo Jun 5, 2018
6035fe2
change C type
giampaolo Jun 6, 2018
8dc651e
CreateDirectoryExW actual implementation
giampaolo Jun 6, 2018
5d0eada
provide specific makedirs() implementation for win
giampaolo Jun 6, 2018
d67cdc5
Merge branch 'shutil-zero-copy-8' of https://github.com/giampaolo/cpy…
giampaolo Jun 6, 2018
f65c8ae
fix typo
giampaolo Jun 6, 2018
9c4508e
skeleton for SetNamedSecurityInfo
giampaolo Jun 6, 2018
bb1fee6
get security info for src path
giampaolo Jun 6, 2018
566898a
finally set security attrs
giampaolo Jun 6, 2018
f435053
add unit tests
giampaolo Jun 6, 2018
30c9a57
mimick os.makedirs() behavior and raise if dst dir exists
giampaolo Jun 6, 2018
33f362f
set 2 paths for OSError object
giampaolo Jun 6, 2018
e17e729
set 2 paths for OSError object
giampaolo Jun 6, 2018
bc46f75
expand windows test
giampaolo Jun 6, 2018
cabbc02
in case of exception on os.sendfile() set filename and filename2 exce…
giampaolo Jun 6, 2018
d22ee08
set 2 filenames (src, dst) for OSError in case copyfile() fails on OSX
giampaolo Jun 6, 2018
7a08203
update doc
giampaolo Jun 7, 2018
ab284e9
do not use CreateDirectoryEx() in copytree() if source dir is a symli…
giampaolo Jun 7, 2018
ac9479d
use bytearray() and readinto()
giampaolo Jun 7, 2018
fd77a7e
use memoryview() with bytearray()
giampaolo Jun 7, 2018
42a597e
refactoring + introduce a new _fastcopy_binfileobj() fun
giampaolo Jun 8, 2018
5008a8d
remove CopyFileEx and other C wrappers
giampaolo Jun 8, 2018
e89dd20
remove code related to CopyFileEx
giampaolo Jun 8, 2018
c0dc4b8
Recognize binary files in copyfileobj()
giampaolo Jun 8, 2018
29b9730
set 1MB copy bufsize on win; also add a global _COPY_BUFSIZE variable
giampaolo Jun 8, 2018
a1bed32
use ctx manager for memoryview()
giampaolo Jun 8, 2018
d9d27a7
update doc
giampaolo Jun 9, 2018
17bd78b
remove outdated doc
giampaolo Jun 9, 2018
b1d4917
remove last CopyFileEx remnants
giampaolo Jun 9, 2018
5ce94e4
OSX - use fcopyfile(3) instead of copyfile(3)
giampaolo Jun 12, 2018
07bcef5
update doc
giampaolo Jun 12, 2018
57 changes: 55 additions & 2 deletions Doc/library/shutil.rst
@@ -51,7 +51,9 @@ Directory and files operations
.. function:: copyfile(src, dst, *, follow_symlinks=True)

Copy the contents (no metadata) of the file named *src* to a file named
-*dst* and return *dst*. *src* and *dst* are path names given as strings.
+*dst* and return *dst* in the most efficient way possible.
+*src* and *dst* are path names given as strings.

*dst* must be the complete target file name; look at :func:`shutil.copy`
for a copy that accepts a target directory path. If *src* and *dst*
specify the same file, :exc:`SameFileError` is raised.
@@ -74,6 +76,10 @@ Directory and files operations
Raise :exc:`SameFileError` instead of :exc:`Error`. Since the former is
a subclass of the latter, this change is backward compatible.

.. versionchanged:: 3.8
Platform-specific fast-copy syscalls are used internally in order to copy
the file more efficiently. See
:ref:`shutil-platform-dependent-efficient-copy-operations` section.

.. exception:: SameFileError

@@ -163,6 +169,11 @@ Directory and files operations
Added *follow_symlinks* argument.
Now returns path to the newly created file.

.. versionchanged:: 3.8
Platform-specific fast-copy syscalls are used internally in order to copy
the file more efficiently. See
:ref:`shutil-platform-dependent-efficient-copy-operations` section.

.. function:: copy2(src, dst, *, follow_symlinks=True)

Identical to :func:`~shutil.copy` except that :func:`copy2`
@@ -185,6 +196,11 @@ Directory and files operations
file system attributes too (currently Linux only).
Now returns path to the newly created file.

.. versionchanged:: 3.8
Platform-specific fast-copy syscalls are used internally in order to copy
the file more efficiently. See
:ref:`shutil-platform-dependent-efficient-copy-operations` section.

.. function:: ignore_patterns(\*patterns)

This factory function creates a function that can be used as a callable for
@@ -241,6 +257,10 @@ Directory and files operations
Added the *ignore_dangling_symlinks* argument to silence dangling symlink
errors when *symlinks* is false.

.. versionchanged:: 3.8
Platform-specific fast-copy syscalls are used internally in order to copy
the file more efficiently. See
:ref:`shutil-platform-dependent-efficient-copy-operations` section.

.. function:: rmtree(path, ignore_errors=False, onerror=None)

@@ -314,6 +334,11 @@ Directory and files operations
.. versionchanged:: 3.5
Added the *copy_function* keyword argument.

.. versionchanged:: 3.8
Platform-specific fast-copy syscalls are used internally in order to copy
the file more efficiently. See
:ref:`shutil-platform-dependent-efficient-copy-operations` section.

.. function:: disk_usage(path)

Return disk usage statistics about the given path as a :term:`named tuple`
@@ -370,6 +395,29 @@ Directory and files operations
operation. For :func:`copytree`, the exception argument is a list of 3-tuples
(*srcname*, *dstname*, *exception*).

.. _shutil-platform-dependent-efficient-copy-operations:

Platform-dependent efficient copy operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting from Python 3.8 all functions involving a file copy (:func:`copyfile`,
:func:`copy`, :func:`copy2`, :func:`copytree`, and :func:`move`) use
platform-specific "fast-copy" syscalls in order to copy the file more

Is there a difference between fast-copy and zero-copy sys calls? #learning

Contributor Author

Initially I phrased this as "zero-copy" because that's what os.sendfile() actually does under the hood. Then it turned out that Windows does something different (see #7160 (comment)). As such it made more sense to stick with the generic "fast-copy" term. =)
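To make the distinction concrete, here is a minimal sketch (not the PR's code) that contrasts a userspace read()/write() loop with an in-kernel copy through os.sendfile(). The helper names `copy_userspace`, `copy_in_kernel` and `demo` are hypothetical, and the sketch assumes a platform where os.sendfile() accepts a regular file as its destination; anywhere else it simply falls back to the userspace loop.

```python
import os
import tempfile


def copy_userspace(src_path, dst_path, bufsize=16 * 1024):
    """Copy by pulling each chunk into a Python bytes object, then writing it."""
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        while True:
            buf = fsrc.read(bufsize)
            if not buf:
                break
            fdst.write(buf)


def copy_in_kernel(src_path, dst_path, blocksize=2 ** 23):
    """Copy with os.sendfile() so the data never lands in a Python buffer.

    Assumption: file-to-file sendfile() works on this platform. If the call
    is missing or fails, redo the copy with the userspace loop, which
    reopens (and truncates) the destination.
    """
    if hasattr(os, "sendfile"):
        try:
            with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
                offset = 0
                while True:
                    sent = os.sendfile(fdst.fileno(), fsrc.fileno(),
                                       offset, blocksize)
                    if sent == 0:
                        return  # EOF
                    offset += sent
        except OSError:
            pass  # e.g. ENOTSOCK where only socket targets are supported
    copy_userspace(src_path, dst_path)


def demo():
    """Copy a random payload both ways and report whether each copy matches."""
    payload = os.urandom(256 * 1024)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(payload)
        results = {}
        for name, func in (("userspace", copy_userspace),
                           ("in_kernel", copy_in_kernel)):
            dst = os.path.join(tmp, name + ".bin")
            func(src, dst)
            with open(dst, "rb") as f:
                results[name] = f.read() == payload
        return results
```

Both helpers produce identical output; the difference is only where the bytes travel while being copied.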

efficiently (see :issue:`33671`).
"fast-copy" means that the copying operation occurs within the kernel, avoiding
the use of userspace buffers in Python as in "``outfd.write(infd.read())``".

On OSX `fcopyfile`_ is used to copy the file content (not metadata).
On Linux, Solaris and other POSIX platforms
where :func:`os.sendfile` supports copies between 2 regular file descriptors
:func:`os.sendfile` is used.
On Windows `CopyFile`_ is used by all copy functions except :func:`copyfile`.

If the fast-copy operation fails and no data was written to the destination
file, shutil silently falls back to the less efficient
:func:`copyfileobj` function internally.

Contributor

Given @eryksun's comments about how the relevant Windows API works, it may be better to just name the platform specific APIs here, without going into details on exactly how those APIs achieve their performance gains relative to a Python level copy operation.

Contributor Author

Agreed. Done.

.. versionadded:: 3.8

.. _shutil-copytree-example:

@@ -654,6 +702,11 @@ Querying the size of the output terminal

.. versionadded:: 3.3

.. _`CopyFile`:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363851(v=vs.85).aspx

.. _`fcopyfile`:
http://www.manpagez.com/man/3/fcopyfile/

.. _`Other Environment Variables`:
http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html#tag_002_003

12 changes: 11 additions & 1 deletion Doc/whatsnew/3.8.rst
@@ -90,10 +90,20 @@ New Modules
Improved Modules
================


Optimizations
=============

* :func:`shutil.copyfile`, :func:`shutil.copy`, :func:`shutil.copy2`,
:func:`shutil.copytree` and :func:`shutil.move` use platform-specific
"fast-copy" syscalls in order to copy the file more efficiently.
"fast-copy" means that the copying operation occurs within the kernel,
avoiding the use of userspace buffers in Python as in
"``outfd.write(infd.read())``".
The speedup for copying a 512MB file within the same partition is about +26%
on Linux, +50% on OSX and +48% on Windows. Also, far fewer CPU cycles are
consumed.
(Contributed by Giampaolo Rodola' in :issue:`33671`.)

* The default protocol in the :mod:`pickle` module is now Protocol 4,
first introduced in Python 3.4. It offers better performance and smaller
size compared to Protocol 3 available since Python 3.0.
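The speedup figures quoted for the shutil functions above can be sanity-checked with a rough timing harness like the one below. This is only a sketch: `manual_copy`, `time_copy` and `compare` are hypothetical names, the 8 MB default is far smaller than the 512MB file mentioned in the entry, and the numbers depend heavily on the page cache, filesystem and Python version.

```python
import os
import shutil
import tempfile
import time


def manual_copy(src, dst, bufsize=16 * 1024):
    """Baseline: plain read()/write() loop through userspace buffers."""
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        while True:
            buf = fsrc.read(bufsize)
            if not buf:
                break
            fdst.write(buf)


def time_copy(copy_func, src, dst, repeat=3):
    """Return the best wall-clock time over `repeat` runs of copy_func."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        copy_func(src, dst)
        best = min(best, time.perf_counter() - start)
    return best


def compare(size=8 * 1024 * 1024):
    """Time both copy strategies on a temporary file of `size` bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size))
        return {
            "read/write loop": time_copy(manual_copy, src, dst),
            "shutil.copyfile": time_copy(shutil.copyfile, src, dst),
        }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f}s")
```

Taking the best of several runs reduces noise, but a fair comparison would also need a much larger file and some control over caching.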
119 changes: 117 additions & 2 deletions Lib/shutil.py
@@ -42,6 +42,16 @@
except ImportError:
    getgrnam = None

posix = nt = None
if os.name == 'posix':
    import posix
elif os.name == 'nt':
    import nt
    import _winapi

_HAS_SENDFILE = posix and hasattr(os, "sendfile")
_HAS_FCOPYFILE = posix and hasattr(posix, "_fcopyfile")

__all__ = ["copyfileobj", "copyfile", "copymode", "copystat", "copy", "copy2",
"copytree", "move", "rmtree", "Error", "SpecialFileError",
"ExecError", "make_archive", "get_archive_formats",
@@ -72,6 +82,10 @@ class RegistryError(Exception):
    """Raised when a registry operation with the archiving
    and unpacking registries fails"""

class _GiveupOnFastCopy(Exception):
    """Raised as a signal to fall back on using raw read()/write()
    file copy when fast-copy functions fail to do so.
    """

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
@@ -81,6 +95,104 @@ def copyfileobj(fsrc, fdst, length=16*1024):
            break
        fdst.write(buf)

def _fastcopy_osx(fsrc, fdst):
    """Copy 2 regular mmap-like files by using high-performance
    fcopyfile() syscall (OSX only).
    """
    try:
        infd = fsrc.fileno()
        outfd = fdst.fileno()
    except Exception as err:
        raise _GiveupOnFastCopy(err)  # not a regular file

    try:
        posix._fcopyfile(infd, outfd)
    except OSError as err:
        if err.errno in {errno.EINVAL, errno.ENOTSUP}:
            raise _GiveupOnFastCopy(err)
        else:
            raise err from None

def _fastcopy_win(fsrc, fdst):
    """Copy 2 files by using high-performance CopyFileExW (Windows only)."""
    _winapi.CopyFileExW(fsrc, fdst, 0)
def _fastcopy_sendfile(fsrc, fdst):
    """Copy data from one regular mmap-like fd to another by using
    high-performance sendfile() method.
    This should work on Linux >= 2.6.33 and Solaris only.
    """
    global _HAS_SENDFILE
    try:
        infd = fsrc.fileno()
        outfd = fdst.fileno()
    except Exception as err:
        raise _GiveupOnFastCopy(err)  # not a regular file

    # Hopefully the whole file will be copied in a single call.
    # sendfile() is called in a loop 'till EOF is reached (0 return)
    # so a bufsize smaller or bigger than the actual file size
    # should not make any difference, also in case the file content
    # changes while being copied.
    try:
        blocksize = max(os.fstat(infd).st_size, 2 ** 23)  # min 8MB
    except Exception:
        blocksize = 2 ** 27  # 128MB

    offset = 0
    while True:
        try:
            sent = os.sendfile(outfd, infd, offset, blocksize)
        except OSError as err:
            if err.errno == errno.ENOTSOCK:
Contributor

This path will also be hit consistently on FreeBSD and other systems that offer os.sendfile, but only with regular socket targets.

To avoid a performance regression on such platforms, it would be strongly preferred for there to be a build time configure check that told shutil that it shouldn't even try the optimised fast path, and should instead always go directly to copyfileobj.

Rather than a new flag in the OS module, this support could be indicated to the shutil module by exposing an os._sendfile_copy alias for os.sendfile only on platforms where sendfile supports regular file targets.

Contributor Author

> This path will also be hit consistently on FreeBSD and other systems that offer os.sendfile, but only with regular socket targets.

This would only be hit once though, as per _HAS_SENDFILE = False which gets globally set after first failed call.

Contributor

How about in addition to setting the flag, we also do _copyfileobj2 = copyfileobj, to switch off the doomed-to-fail fastpath completely? That way after the first failed attempt, further attempts wouldn't even check the flag, since the "optimised" version would just be an alias for the regular implementation.

Contributor Author

@giampaolo giampaolo May 30, 2018

To be honest I would prefer not to change module objects at runtime like this, even if the function is private. It makes testing and mocking more difficult (because tests rely on the private functions), and the speedup from avoiding a global `if VAR:` runtime check is negligible considering that the copy operations are strictly I/O bound. For example, we would gain more by hoisting the various hasattr checks in the module to globals, and even then a noticeable speedup is unlikely.

                # sendfile() on this platform (probably Linux < 2.6.33)
                # does not support copies between regular files (only
                # sockets).
                _HAS_SENDFILE = False
                raise _GiveupOnFastCopy(err)

            if err.errno == errno.ENOSPC:  # filesystem is full
                raise err from None

            # Give up on first call and if no data was copied.
            if offset == 0 and os.lseek(outfd, 0, os.SEEK_CUR) == 0:
                raise _GiveupOnFastCopy(err)

            raise err from None
        else:
            if sent == 0:
                break  # EOF
            offset += sent

def _fastcopy_fileobj(fsrc, fdst):
    """Copy 2 regular mmap-like fds by using zero-copy sendfile(2)
    (Linux) and fcopyfile(3) (OSX) syscalls.
    In case of error fall back on using plain read()/write() if no
    data was copied.
    """
    # Note: copyfileobj() is left alone in order to not introduce any
    # unexpected breakage. Possible risks by using zero-copy calls
    # in copyfileobj() are:
    # - fdst cannot be open in "a"(ppend) mode
    # - fsrc and fdst may be open in "t"(ext) mode
    # - fsrc may be a BufferedReader (which hides unread data in a buffer),
    #   GzipFile (which decompresses data), HTTPResponse (which decodes
    #   chunks).
    # - possibly others
    if _HAS_SENDFILE:
Contributor

Given a configure check for _HAS_SENDFILE, the initial check for both it and _HAS_FCOPYFILE can be moved up to the module level, and define _copyfileobj_fast differently based on the results. If both flags are false, then _copyfileobj_fast can be aliased directly to copyfileobj. (I believe Mac OS X defines both operations, so it will be important to give _HAS_FCOPYFILE precedence when deciding which implementation of _copyfileobj_fast to use)

The dynamic check to handle the case where a Python built on a newer distro is run against an older kernel (and hence still gets ENOTSOCK at runtime) can then also be handled by setting _copyfileobj_fast = copyfileobj rather than by setting a flag that needs to be checked on every call.

Contributor Author

See my comment above.

        try:
            return _fastcopy_sendfile(fsrc, fdst)
        except _GiveupOnFastCopy:
            pass

    if _HAS_FCOPYFILE:
        try:
            return _fastcopy_osx(fsrc, fdst)
        except _GiveupOnFastCopy:
            pass

    return copyfileobj(fsrc, fdst)
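One of the risks listed in the comment inside `_fastcopy_fileobj` (a BufferedReader hiding unread data) can be observed directly. The sketch below uses a hypothetical helper name and a small temporary file; it compares the position the caller sees with the position of the underlying file descriptor after a one-byte read.

```python
import os
import tempfile


def buffered_vs_fd_position(payload=b"x" * 100):
    """Return (logical position, raw fd position) after reading one byte."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        with open(path, "rb") as f:      # open() in "rb" mode is buffered
            f.read(1)                    # the caller consumes a single byte
            logical = f.tell()           # position as seen through the object
            # Position as seen by the kernel: the buffer was filled with
            # more than the one byte the caller asked for.
            raw = os.lseek(f.fileno(), 0, os.SEEK_CUR)
        return logical, raw
    finally:
        os.remove(path)
```

A descriptor-level copy that starts from the raw position would skip the bytes still sitting in the buffer, which is one reason to leave copyfileobj() on the plain read()/write() path for arbitrary file objects.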

def _samefile(src, dst):
    # Macintosh, Unix.
    if hasattr(os.path, 'samefile'):
@@ -117,9 +229,13 @@ def copyfile(src, dst, *, follow_symlinks=True):
    if not follow_symlinks and os.path.islink(src):
        os.symlink(os.readlink(src), dst)
    else:
        if os.name == 'nt':
Contributor

This cannot be used in shutil.copyfile. My previous comment explains why in detail. It can be used in shutil.copy2, but this makes separate copyfile and copystat calls inconsistent with copy2. I suggested a possible way to address this.
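The detailed reasoning referred to here is not reproduced on this page, but the documented contract it relies on (copyfile copies contents only, copy2 also copies metadata) is easy to check. A minimal sketch with a hypothetical `demo` helper, using a backdated modification time as the metadata to watch:

```python
import os
import shutil
import tempfile


def demo():
    """Compare the mtime left behind by copyfile() and by copy2()."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.txt")
        with open(src, "w") as f:
            f.write("payload")
        # Backdate the source so a preserved timestamp is easy to spot.
        old = 1_000_000_000  # arbitrary moment in 2001
        os.utime(src, (old, old))

        plain = shutil.copyfile(src, os.path.join(tmp, "plain.txt"))
        full = shutil.copy2(src, os.path.join(tmp, "full.txt"))

        with open(plain) as f:
            same_content = f.read() == "payload"
        return {
            "same_content": same_content,
            "src_mtime": os.stat(src).st_mtime,
            "copyfile_mtime": os.stat(plain).st_mtime,
            "copy2_mtime": os.stat(full).st_mtime,
        }
```

Any platform call that also carries timestamps or other attributes across would change what copyfile() is documented to do, whereas it matches copy2() well.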

            _fastcopy_win(src, dst)
            return dst

        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
-               copyfileobj(fsrc, fdst)
+               _fastcopy_fileobj(fsrc, fdst)
    return dst

def copymode(src, dst, *, follow_symlinks=True):
@@ -1015,7 +1131,6 @@ def disk_usage(path):

elif os.name == 'nt':

-   import nt
    __all__.append('disk_usage')
    _ntuple_diskusage = collections.namedtuple('usage', 'total used free')
