Enforced data size in channels 2.1.7 prevent large file uploads #1240

agateblue · 2019-02-06T11:10:51Z

Channels 2.1.7 was released a few days ago (kudos and thank you for the maintenance effort!) and we noticed a breaking change in our application.

Large file uploads that were working before are now failing, and we understand it's related to this specific commit a1ecd5e by @Zarathustra2

I'm not discussing the rationale behind that commit, because enforcing the max size looks indeed better than before from a security/stability perspective. However, it is a breaking change for our app (and possibly others), because file uploads that were working before are now failing.

Since we were not setting DATA_UPLOAD_MAX_MEMORY_SIZE in our settings file, the default value (2.5 MiB) applies, which is considerably lower than our average uploaded file size.

In theory, we could stick with channels==2.1.7 and set DATA_UPLOAD_MAX_MEMORY_SIZE to match our requirements, but AFAIU, it would apply to the whole payload, and not only to the file size. We'd like to allow large files, but not excessively large POST data.
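For reference, that interim workaround would look roughly like this in a settings module. DATA_UPLOAD_MAX_MEMORY_SIZE is the real Django setting and 2621440 bytes is its default; the 100 MiB figure is only an assumed example value, not a recommendation.

```python
# settings.py (sketch; the chosen limit is an assumed example value)

# Django's default is 2621440 bytes (2.5 MiB). As reported in this issue,
# channels 2.1.7 compares this limit against the whole request body,
# uploaded files included, so it has to cover the largest expected upload.
DATA_UPLOAD_MAX_MEMORY_SIZE = 100 * 1024 * 1024  # 100 MiB, illustrative only
```

The drawback is the one described above: the same ceiling then also applies to ordinary POST data.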

Also, as stated in Django's documentation about DATA_UPLOAD_MAX_MEMORY_SIZE:

"The check is done when accessing request.body or request.POST and is calculated against the total request size excluding any file upload data."

Based on that, I do think there is a bug in the way the check was implemented in channels, because it does not exclude file data.

I can try working on a PR if you agree with my suggested changes:

Document the breaking change in the changelog, so people know what to do when upgrading
Exclude uploaded files when computing the payload size

Let me know your thoughts and I'll start working on a fix :)

carltongibson · 2019-02-06T14:03:03Z

Hi @EliotBerriot. OK, yes, from the description it sounds as if something is amiss. It sounds as if the change (plus your requirements) has unveiled a latent issue.

First step: can you put together a reproduction, in a test case or sample project, so we can see exactly what's going on?

Zarathustra2 · 2019-02-06T15:22:48Z

Hi,

@carltongibson @EliotBerriot first of all, sorry that my commit has brought you trouble.

Unfortunately, my django knowledge is not good enough to fix it by myself but I would be happy to help you since I created the bug.

I think this is also related to #1171. So in a possible solution we could also move the check of DATA_UPLOAD_MAX_MEMORY_SIZE to the read method and provide lazy evaluation as it is originally implemented in Django.
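A minimal sketch of what a lazy check in read could look like, assuming the body is available as a file-like object. SizeLimitedStream is a hypothetical name, and the exception class is a local stand-in for django.core.exceptions.RequestDataTooBig so the snippet runs without Django.

```python
import io


class RequestDataTooBig(Exception):
    """Local stand-in for django.core.exceptions.RequestDataTooBig."""


class SizeLimitedStream:
    """Hypothetical wrapper: counts bytes as they are read and fails lazily."""

    def __init__(self, stream, max_size):
        self._stream = stream
        self._max_size = max_size  # None disables the check
        self._bytes_read = 0

    def read(self, size=-1):
        data = self._stream.read(size)
        self._bytes_read += len(data)
        if self._max_size is not None and self._bytes_read > self._max_size:
            raise RequestDataTooBig("Request body exceeded the configured limit.")
        return data


stream = SizeLimitedStream(io.BytesIO(b"x" * 10), max_size=4)
first = stream.read(4)  # within the limit, no error yet
try:
    stream.read(4)  # pushes the running total past the limit
    too_big = False
except RequestDataTooBig:
    too_big = True
```

Nothing is raised until a caller actually reads past the limit. A real implementation would probably also cap the size passed to the underlying read, since read() with no argument still pulls everything into memory before the check runs.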

carltongibson · 2019-02-06T20:13:41Z

Yep. That makes sense. @Zarathustra2 this isn’t your fault. 🙂 The more developed handling just isn’t there yet. Very happy to see input. (That test case would still be where I’d start...)

agateblue · 2019-02-07T12:45:57Z

"@carltongibson @EliotBerriot first of all, sorry that my commit has brought you trouble."

Please don't be, things break and it's nobody's fault :)

"The more developed handling just isn’t there yet. Very happy to see input. (That test case would still be where I’d start...)"

I'm not sure if I'll find my way around the fix, but I feel confident enough to write a test case, and I'll be working on that this afternoon!

carltongibson · 2019-02-07T12:56:21Z

Awesome. Thanks @EliotBerriot! (It’s much easier from there, since there’s something to play with.)

jpic · 2019-02-07T14:42:40Z

@EliotBerriot did you also try the following suggestion? If not, can you explain why it would not work?

"we could also move the check of DATA_UPLOAD_MAX_MEMORY_SIZE to the read method and provide lazy evaluation as it is originally implemented in Django."

agateblue · 2019-02-07T14:50:31Z

@jpic I didn't, the concrete steps to do that were not really clear to me. Also, I wanted to avoid touching code I did not understand (this is my first contact with the channels codebase).

Since the read() method is currently a member of Django's own http.HttpRequest class, it would also mean either reimplementing it or overriding it, and I'm not sure how it would fix the problem. The current implementation fails fast, which is probably better; the real issue IMHO is that it is not compatible with what Django actually does with regard to file handling.

jpic · 2019-02-07T15:32:41Z

"the concrete steps to do that were not really clear for me"

Same here, and here's what I figured out while searching. You're gonna love this, or I'm completely lost 😂

Turns out that:

your tests are valid,
they also pass without the check that raises in the constructor, introduced by a1ecd5e,
because the test does not catch the raise that comes from the constructor,
see for yourself, after removing the check channels has in the constructor, and removing the pytest.raises:

================================== FAILURES ===================================
_____ RequestTests.test_size_check_ignore_files_but_honor_other_post_data ______

self = <tests.test_http.RequestTests testMethod=test_size_check_ignore_files_but_honor_other_post_data>

    def test_size_check_ignore_files_but_honor_other_post_data(self):
        ....
        with override_settings(DATA_UPLOAD_MAX_MEMORY_SIZE=1):
>           request.POST

tests/test_http.py:233: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
channels/http.py:139: in _get_post
    self._load_post_and_files()
.tox/py37-dj21/lib/python3.7/site-packages/django/http/request.py:310: in _load_post_and_files
    self._post, self._files = self.parse_file_upload(self.META, data)
.tox/py37-dj21/lib/python3.7/site-packages/django/http/request.py:269: in parse_file_upload
    return parser.parse()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <django.http.multipartparser.MultiPartParser object at 0x7fd0e1a38358>

    def parse(self):
         ...
    
                    # Avoid reading more than DATA_UPLOAD_MAX_MEMORY_SIZE.
                    if settings.DATA_UPLOAD_MAX_MEMORY_SIZE is not None:
                        read_size = settings.DATA_UPLOAD_MAX_MEMORY_SIZE - num_bytes_read
    
                    # This is a post field, we can just set it in the post
                    if transfer_encoding == 'base64':
                        raw_data = field_stream.read(size=read_size)
                        num_bytes_read += len(raw_data)
                        try:
                            data = base64.b64decode(raw_data)
                        except binascii.Error:
                            data = raw_data
                    else:
                        data = field_stream.read(size=read_size)
                        num_bytes_read += len(data)
    
                    # Add two here to make the check consistent with the
                    # x-www-form-urlencoded check that includes '&='.
                    num_bytes_read += len(field_name) + 2
                    if (settings.DATA_UPLOAD_MAX_MEMORY_SIZE is not None and
                            num_bytes_read > settings.DATA_UPLOAD_MAX_MEMORY_SIZE):
>                       raise RequestDataTooBig('Request body exceeded settings.DATA_UPLOAD_MAX_MEMORY_SIZE.')
E                       django.core.exceptions.RequestDataTooBig: Request body exceeded settings.DATA_UPLOAD_MAX_MEMORY_SIZE.

@EliotBerriot, both your tests still pass after completely reverting a1ecd5e, see for yourself in https://github.com/jpic/channels/tree/testrevert

My recommendation would be to keep your security tests, which we always need to have, but revert a1ecd5e entirely.

carltongibson · 2019-02-07T15:44:00Z

Hey @EliotBerriot and @jpic. Thanks for your efforts here. Super. We need to be 100% clear here so all for the good!

This line here contrasts markedly with Django's:

channels/channels/http.py, lines 131 to 132 in 8499a42:

    self._body = body
    assert isinstance(self._body, bytes), "Body is not bytes"

body is bytes. So have we already got it in memory, and do we need to bail on that to avoid potential memory exhaustion? (Current status: I need to look.)

jpic · 2019-02-07T15:57:51Z

I might not be able to provide further relevant clues without actually reading the WSGI and ASGI specs themselves. Meanwhile, there's something that I find interesting in the django code you have pointed out.


            # Limit the maximum request data size that will be handled in-memory.
            if (settings.DATA_UPLOAD_MAX_MEMORY_SIZE is not None and
                    int(self.META.get('CONTENT_LENGTH') or 0) > settings.DATA_UPLOAD_MAX_MEMORY_SIZE):
                raise RequestDataTooBig('Request body exceeded settings.DATA_UPLOAD_MAX_MEMORY_SIZE.')

Does that mean that this check is completely "declarative"? I mean, what happens if I add a long shellcode in the User-Agent header value along with a small Content-Length header? Will that really prevent the User-Agent header value from being read into memory at all? If so, how? Thanks in advance for bearing with me.
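To make the question concrete, the comparison in the quoted snippet can be reduced to the sketch below. declared_length_too_big is a hypothetical helper, not a Django function; it only inspects the CONTENT_LENGTH value the client declared and never looks at other headers or at the bytes actually sent.

```python
def declared_length_too_big(meta, max_size):
    """Mirror the quoted comparison: only the declared length is checked."""
    if max_size is None:
        return False  # the limit is disabled
    return int(meta.get("CONTENT_LENGTH") or 0) > max_size


small_claim = declared_length_too_big({"CONTENT_LENGTH": "10"}, max_size=100)
big_claim = declared_length_too_big({"CONTENT_LENGTH": "500"}, max_size=100)
missing = declared_length_too_big({}, max_size=100)  # absent header counts as 0
```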

carltongibson · 2019-02-07T16:11:05Z

No problem. 🙂 The read() call is key. In WSGI land we get given a file-like object, which we can then pull into memory. (How the server in front of us handled that is not our problem.)

So yes, we have to look at what we're getting here, and work out where best to place our limits.

carltongibson · 2019-02-07T16:13:14Z

@jpic You knew this would be fun right! 😀

jpic · 2019-02-07T16:17:29Z

Maybe we should open a new issue about this, because Eliot's patch itself works as intended:

new tests seem good to have,
patch in runtime code does fix a regression,

Whether or not we want to reconsider a1ecd5e & connected code should not block Eliot's PR, because it contains a bugfix that looks critical?

After all, the patch fixes a regression and at the same time adds extra tests to prove that it's still fine, and secures channels in case Django changes the contract.

jpic · 2019-02-07T16:32:14Z

@carltongibson you bet 😂

About a1ecd5e, I can understand that it's trying to reproduce the Django code you have pointed out, which legitimates it after all. But it's not clear to me how boundary content is excluded from the check in Django, since Content-Length should include the sum of boundary data length.

Is it because they use self._body or just self instead of self.body in the case of multipart/form-data, which completely bypasses the check in the self.body property ?

It seems the code from Django in the body property (pointed out by @carltongibson above) is not applicable in the case of multipart/form-data requests. In that case, it will actually avoid calling self.body, call self.parse_file_upload which in turn calls MultiPartParser.parse, which does the check that Eliot's tests prove still working / covering channels code itself.
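A simplified model of that accounting, based only on the parser excerpt in the traceback above (field value length, plus field name length, plus 2). The (field_name, data, is_file) tuple shape and the count_field_bytes name are assumptions for illustration, and the exception is a local stand-in; this is not Django's MultiPartParser.

```python
class RequestDataTooBig(Exception):
    """Local stand-in for django.core.exceptions.RequestDataTooBig."""


def count_field_bytes(parts, max_size):
    """Sum the bytes of non-file fields; parts holds (name, data, is_file)."""
    num_bytes_read = 0
    for field_name, data, is_file in parts:
        if is_file:
            continue  # file content is not counted against the limit
        # Same arithmetic as the excerpt: value + name + 2 for '&='.
        num_bytes_read += len(data) + len(field_name) + 2
        if max_size is not None and num_bytes_read > max_size:
            raise RequestDataTooBig("Request body exceeded the limit.")
    return num_bytes_read


parts = [
    ("title", b"hello", False),
    ("audio", b"\x00" * 5_000_000, True),  # large upload, ignored by the count
]
counted = count_field_bytes(parts, max_size=2621440)
```

With the default limit, the 5 MB file does not trip the check, while a limit of 1 (as in the test above) still rejects the small "title" field.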

So, it turns out I understand the justification for the parent change a1ecd5e as well as Eliot's change as-is now.

Thanks y'all for sharing some of your insight ;)

carltongibson · 2019-02-12T14:15:36Z

So, just an update here.

I am looking into this. In particular, how we're loading the request body prior to instantiating the AsgiRequest:

channels/channels/http.py, lines 208 to 214 in a660d81:

    # See if the message has body, and if it's the end, launch into
    # handling (and a synchronous subthread)
    if "body" in message:
        body += message["body"]
    if not message.get("more_body", False):
        await self.handle(body)
        return

I want to look into whether we can wrap that somehow in a file-like, which would then allow us to leverage Django's established patterns here. (I don't know at all if that'll work yet but...) It would be nice to lazily pull the request body data into memory.
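One way to picture the file-like idea: write each chunk into a file object as it arrives instead of concatenating bytes. This is only a sketch under assumptions; read_body_into and make_receive are hypothetical names, and the messages simply reuse the "body" and "more_body" keys from the snippet above.

```python
import asyncio
import io


async def read_body_into(receive, body_file):
    """Write each received chunk to body_file, then rewind it for reading."""
    while True:
        message = await receive()
        if "body" in message:
            body_file.write(message["body"])
        if not message.get("more_body", False):
            break
    body_file.seek(0)
    return body_file


def make_receive(messages):
    """Hypothetical test double that replays a fixed list of messages."""
    iterator = iter(messages)

    async def receive():
        return next(iterator)

    return receive


messages = [
    {"type": "http.request", "body": b"abc", "more_body": True},
    {"type": "http.request", "body": b"def", "more_body": False},
]
body = asyncio.run(read_body_into(make_receive(messages), io.BytesIO())).read()
```

Here io.BytesIO still keeps everything in memory; the point is only that the consumer sees a file-like object, so the backing store can be swapped without changing the reading code.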

I'd rather measure twice, cut once here. I'm not convinced that there's anything critical at stake, in the sense that we need to rush a half-thought patch out: adjusting DATA_UPLOAD_MAX_MEMORY_SIZE is a reasonable approach in the meantime, and I'm presuming there's always a reverse proxy in play (nginx etc) that can (also) impose sensible limits.

jpic · 2019-02-12T18:48:21Z

@carltongibson security became a non issue for me when:

Django also bypasses this kind of security for this kind of request
we're always supposed to have at least nginx, if not a WAF, in front of django/channels
every time I make a site that's serious about uploads, it turns out I have to change the default nginx settings for that purpose

So at the end of the day the pull request has no impact on actual security, and still fixes a regression that's pretty critical when supporting file uploads.

anx-ckreuzberger · 2019-03-08T06:50:58Z

Hi,

I've also hit this issue with Django 1.11, Django Rest Framework 3.9, Python 3.5 and Channels 2.1.7 in production (as in: DEBUG = False and using daphne with nginx) - I believe this issue doesn't come up when you're using a different run method for python (e.g., mod passenger).

I've downgraded to 2.1.6 for now, but I would love to see the check for DATA_UPLOAD_MAX_MEMORY_SIZE implemented properly, as in: https://docs.djangoproject.com/en/2.1/ref/settings/#data-upload-max-memory-size

"The check is done when accessing request.body or request.POST and is calculated against the total request size excluding any file upload data."

I have my own File Upload Handlers which handle (large) file uploads - see https://docs.djangoproject.com/en/2.1/topics/http/file-uploads/#upload-handlers .

carltongibson · 2019-03-08T10:16:22Z

Hi @anx-ckreuzberger. #1251 is in progress and will resolve this. Just in need of a small window to finish it off. I'll then roll a new version.

carltongibson · 2019-03-09T09:19:42Z

Hi both. In the meantime have you considered running a WSGI server to handle the Django HTTP requests, leaving ASGI just for sockets etc? This would get you Django's battle-hardened handling for e.g. file uploads without leaving you blocked here.

jpic · 2019-03-12T19:30:28Z

I wouldn't worry about it, it's not hard to temporarily deploy a fork with the patches we want anyway ;)

However, I'm wondering if #1251 is ready enough to ask that @anx-ckreuzberger and @EliotBerriot try it out with their projects and report results?

carltongibson · 2019-03-13T08:54:42Z

Hey @jpic. Not quite. I need to rip-out the body property and adjust POST and FILES slightly. Once those are done, yes it should be about right. (I think it would work™ now but...)

Then I need to tidy-up and rename stuff and then we'll head to a release.

anx-ckreuzberger · 2019-03-18T06:54:24Z

"In the meantime have you considered running a WSGI server to handle to Django HTTP requests, leaving ASGI just for sockets etc?"

That is absolutely acceptable, but not my (our?) desired solution (think about a docker container or systemd service running daphne, now you need two docker containers or systemd services, one for daphne and one for gunicorn/uwsgi/...).

Downgrading to 2.1.6 is "a better fix" for me for now ;)

Long term, I also believe that ASGI is better than WSGI, but that's just my opinion.

carltongibson · 2019-05-08T12:03:10Z

Hi folks: if you can give #1251 a run against your app and report any issues that would be super.

agateblue · 2019-05-08T16:51:35Z

Thank you @carltongibson, I'll try that in the upcoming days and let you know if I see anything unusual :)

agateblue · 2019-05-09T08:39:48Z

@carltongibson I've tested your branch locally on my project, and everything seems to be working so far:

I don't see any regression with core features (e.g our API answers as usual)
Our file upload issue is fixed (I can successfully upload files bigger than 2.5 MiB)
Our test suite runs without issue (I'm not sure if it's relevant because we don't have E2E tests, so we may not be calling Channels logic at all)

Let me know if you need me to check anything else in particular!

carltongibson · 2019-05-11T20:06:37Z

Thanks for taking the time to give it a run @EliotBerriot!

pythonBerg · 2019-08-10T18:58:23Z

Hello. I am experiencing this same issue in 2.1.7 and see all the discussion on this thread and #1251. Can someone help me understand the bottom line? Is there a branch or code patch out there? Are there just a few lines I can edit myself? Thanks all...

carltongibson · 2019-08-11T06:20:21Z

Hi @pythonBerg. This is just waiting for a bit of bandwidth over the summer to finish off.

The PR in #1251 works. You can install that and give it a run and report back there. That would be super.

As it stands I’m not ready to just push it out to folks as maybe there’s hidden issues. It’s delicate. I want to test it first.

So, in its place I want to add a SpooledTemporaryFile version, that is much more clearly safe, and then offer the experimental version as an option for the brave to try. Once we’ve had a chance to see it in action, it can become the default. (In theory it’s the way to go...)
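A rough illustration of the SpooledTemporaryFile idea, using only the standard library. The 2 KiB threshold and the chunk sizes are arbitrary assumptions, and this is not the code that ended up in Channels.

```python
import tempfile

chunks = [b"a" * 1024, b"b" * 1024, b"c" * 1024]

# Keep up to 2 KiB in memory; once more than that has been written, the
# data is moved to a temporary file on disk transparently.
with tempfile.SpooledTemporaryFile(max_size=2048, mode="w+b") as body_file:
    for chunk in chunks:
        body_file.write(chunk)
    body_file.seek(0)  # rewind so the body can be read like any file
    first_kib = body_file.read(1024)
    rest = body_file.read()
```

The reader only sees a file-like object, so small bodies stay in memory while large uploads end up on disk without the calling code changing.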

The hold up is just person-power. A set of releases here for Channels, Daphne and Channels redis is my goal for the summer. Any input from fellow humans here or elsewhere is greatly appreciated.

carltongibson · 2019-09-18T06:58:29Z

Fixed in #1352. Will be out later today.

agateblue mentioned this issue Feb 6, 2019

Installation of Mono docker on a Synology DS416play michaelmob/docker-funkwhale#15

Closed

carltongibson added the blocked/user-response label Feb 6, 2019

carltongibson removed the blocked/user-response label Feb 6, 2019

agateblue pushed a commit to agateblue/channels that referenced this issue Feb 7, 2019

Test case for django#1240

60d93ed

agateblue pushed a commit to agateblue/channels that referenced this issue Feb 7, 2019

Fix django#1240: ignore files when checking request size for multipar…

e93a5de

…t requests

agateblue pushed a commit to agateblue/channels that referenced this issue Feb 7, 2019

Fix django#1240: ignore files when checking request size for multipar…

3e448c2

…t requests

agateblue mentioned this issue Feb 7, 2019

Enforced data size in channels 2.1.7 prevent large file uploads #1242

Closed

2 tasks

jpic mentioned this issue Feb 7, 2019

Add a size limit for request bodies #1170

Closed

carltongibson mentioned this issue Feb 25, 2019

Rework HTTP body handling. #1251

Closed

carltongibson mentioned this issue Mar 29, 2019

Ability for AsgiHandler and AsgiRequest to process the request body in chunks. #1269

Closed

hozblok mentioned this issue Sep 16, 2019

Handle the HTTP request body as a spooled temp file. #1352

Merged

carltongibson closed this as completed Sep 18, 2019

ritlew added a commit to ritlew/django-hls-video that referenced this issue Dec 27, 2019

Update channels

2784781

Update channels as a bug was preventing development that got fixed in 2.3.0 (django/channels#1240)

ddabble added a commit to MAKENTNU/web that referenced this issue Oct 24, 2020

Updated packages in requirements.txt

6ae7c67

The version constraint on `channels` is due to a bug that was fixed in 2.3.0 (see django/channels#1240).

mykola-mokhnach mentioned this issue May 24, 2021

Customising the maximum size of request payload aio-libs/aiohttp#5704

Closed
