Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method "read_chunk" of "BodyPartReader" returns zero bytes before eof #1428

Closed
viotti opened this issue Nov 25, 2016 · 5 comments
Closed

Method "read_chunk" of "BodyPartReader" returns zero bytes before eof #1428

viotti opened this issue Nov 25, 2016 · 5 comments
Labels

Comments

@viotti
Copy link
Contributor

viotti commented Nov 25, 2016

Long story short

I've implemented a multipart file upload handler inspired on code from the docs. My code is truncating part's data. I believe the problem is in the method _read_chunk_from_stream, which is used by read_chunk of BodyPartReader. That method is returning a zero-length bytearray before the part's EOF. This is the pseudo-code.

reader = await request.multipart()
part = await reader.next()
arr = bytearray()

while True:
    chunk = await part.read_chunk()  # 8192 bytes by default.

    if not chunk:
        break

    arr.extend(chunk)

Expected behaviour

The loop ends when all the part's data has been read.

Actual behaviour

The loop ends before the part's data is exhausted, i.e., chunk becomes a zero-length bytearray prematurely.

Steps to reproduce

The code is part of a large web application so it's hard for me to give reproducible steps. But replacing the break condition to if not part._at_eof made the problem go away.

reader = await request.multipart()
part = await reader.next()
arr = bytearray()

while True:
    chunk = await part.read_chunk()  # 8192 bytes by default.

    if not part._at_eof:  # This fixed the problem.
        break

    arr.extend(chunk)

Your environment

Aiohttp 1.1.5
Python 3.5.1 from PSF
macOS Sierra 10.12.1

@viotti
Copy link
Contributor Author

viotti commented Nov 27, 2016

I think I've traced the problem back to lines 335 to 338 of multipart.py.

From revision 6edd825, it seems those lines deal with the case of a small multipart body, but I'm not sure about this. In lines 337 and 338 an empty bytes object is returned explicitly with self._at_eof == False, which is clearly a violation of the self.read protocol: return zero byte only at EOF.

Removing those lines did not affect the outcome of the test battery in my environment.

I can provide a PR if you like.

@asvetlov
Copy link
Member

@kxepal ?

@kxepal
Copy link
Member

kxepal commented Nov 27, 2016

@viotti
Nice found! Indeed _read_chunk_from_stream may return nothing in some cases fooling you about all content read.

While it looks as a good idea to switch to at_eof flag to break the loop, I think it still not good idea to be able to read nothing and keep looping on wasting CPU.

Feel free to submit a patch anyway! The _read_chunk_from_stream starts to be overly complicated thought /:

@asvetlov
Copy link
Member

Fixed by #1431

The fix is released as aiohttp 1.1.6

@lock
Copy link

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there's important points made in this discussion,
please include those exceprts into that new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants