gh-129005: Remove copy in `_pyio.FileIO.readall()` #129496

cmaloney · 2025-01-31T04:39:00Z

This aligns the memory usage between _pyio and _io. Both now use the same amount of memory when reading a file.
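For readers unfamiliar with the pattern, here is a minimal sketch of the kind of copy under discussion. It is not the actual `_pyio` source: `readall_with_copy` is a hypothetical helper, and it assumes the pure-Python path accumulates data in a `bytearray` and then converts the result with `bytes(...)`.

```python
import os
import tempfile


def readall_with_copy(fd, chunk_size=64 * 1024):
    """Hypothetical sketch: read a file descriptor to EOF and return bytes."""
    buf = bytearray()  # growable accumulation buffer
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty read signals end of file
            break
        buf += chunk
    # bytes(buf) allocates a second full-size object and copies into it,
    # so the file contents briefly exist twice in memory.
    return bytes(buf)


# Small demonstration on a temporary file.
with tempfile.TemporaryFile() as f:
    f.write(b"x" * 200_000)
    f.flush()
    f.seek(0)  # rewind so the raw descriptor reads from the start
    data = readall_with_copy(f.fileno())
```

For a multi-gigabyte file, as in `test_large_read`, that final conversion is the step that would roughly double peak memory under these assumptions.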

On my Linux dev box, this drops the runtime of `./python -m test -M8g -uall test_largefile -m test_large_read -v` from ~3.3 to ~2.4 seconds.

./python -m test -M8g -uall test_largefile -m test_large_read -v
Checked 112 modules (34 built-in, 77 shared, 1 n/a on linux-x86_64, 0 disabled, 0 missing, 0 failed on import)
== CPython 3.14.0a4+ (heads/pyio_readall_match_mem-dirty:9d23eb4375b, Jan 30 2025, 20:33:11) [Clang 19.1.7 ]
== Linux-6.12.10-arch1-1-x86_64-with-glibc2.40 little-endian
== Python build: debug
== cwd: <workdir>/python/build/build/test_python_worker_486953æ
== CPU count: 32
== encodings: locale=UTF-8 FS=utf-8
== resources: all

Using random seed: 4286629456
0:00:00 load avg: 1.00 Run 1 test sequentially in a single process
0:00:00 load avg: 1.00 [1/1] test_largefile
test_large_read (test.test_largefile.CLargeFileTest.test_large_read) ... 
 ... expected peak memory use: 2.3G
 ... process data size: 2.3G
ok
test_large_read (test.test_largefile.PyLargeFileTest.test_large_read) ... 
 ... expected peak memory use: 2.3G
 ... process data size: 2.3G
 ... process data size: 2.3G
ok

----------------------------------------------------------------------
Ran 2 tests in 2.353s

OK

== Tests result: SUCCESS ==

1 test OK.

Total duration: 2.4 sec
Total tests: run=2 (filtered)
Total test files: run=1/1 (filtered)
Result: SUCCESS

Issue: Reduce copies when reading files in pyio, match behavior of _io #129005


cmaloney · 2025-02-02T02:08:58Z

I think this is the wrong way to approach this. I'm going to work on a zero-copy conversion from bytearray -> bytes instead; I think that solves the problem more generally for code and creates fewer slight API differences that can break things.
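To make the cost of the existing bytearray -> bytes conversion concrete, here is a rough measurement sketch using the standard `tracemalloc` module. The helper names (`peak_bytes`, `buffer_only`, `buffer_then_bytes`) and the buffer size `N` are illustrative assumptions; this does not implement the zero-copy conversion mentioned above, it only shows that `bytes(bytearray)` currently holds two full-size buffers at once.

```python
import tracemalloc

N = 8 * 1024 * 1024  # illustrative buffer size (8 MiB)


def peak_bytes(func):
    """Return the peak traced allocation, in bytes, while func() runs."""
    tracemalloc.start()
    try:
        result = func()  # keep the result alive until the peak is sampled
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


def buffer_only():
    # One allocation of roughly N bytes.
    return bytearray(N)


def buffer_then_bytes():
    # The bytearray stays alive while bytes() allocates and fills its copy,
    # so roughly 2 * N bytes are live at the peak.
    buf = bytearray(N)
    return bytes(buf)


peak_one = peak_bytes(buffer_only)
peak_two = peak_bytes(buffer_then_bytes)
print(f"bytearray only:     {peak_one / N:.2f} x N")
print(f"bytearray -> bytes: {peak_two / N:.2f} x N")
```

A conversion that handed the underlying buffer over to the new `bytes` object would, in principle, keep the second figure close to the first.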

gh-129005: Remove copy in `_pyio.FileIO.readall()`

a881837


bedevere-app bot added the awaiting review label Jan 31, 2025

bedevere-app bot mentioned this pull request Jan 31, 2025

Reduce copies when reading files in pyio, match behavior of _io #129005

Open

cmaloney marked this pull request as draft January 31, 2025 05:49

bedevere-app bot removed the awaiting review label Jan 31, 2025

cmaloney closed this Feb 2, 2025
