-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check the content type before returning the json value. #2177
Conversation
88f254c
to
a4abca2
Compare
Codecov Report
@@ Coverage Diff @@
## master #2177 +/- ##
==========================================
- Coverage 97.2% 97.15% -0.05%
==========================================
Files 39 39
Lines 7945 7956 +11
Branches 1378 1382 +4
==========================================
+ Hits 7723 7730 +7
- Misses 98 100 +2
- Partials 124 126 +2
Continue to review full report at Codecov.
|
a4abca2
to
bd5a95f
Compare
aiohttp/web_request.py
Outdated
You can also handle it manually by passing the | ||
`wrong_format` parameter. | ||
""" | ||
if if_check and self.content_type != 'application/json': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would I use, say, jsonapi if it will never match this hardcoded content-type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable. I will add it.
aiohttp/web_request.py
Outdated
@@ -510,8 +510,19 @@ def text(self): | |||
return bytes_body.decode(encoding) | |||
|
|||
@asyncio.coroutine | |||
def json(self, *, loads=json.loads): | |||
"""Return BODY as JSON.""" | |||
def json(self, *, loads=json.loads, wrong_format=None, if_check=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think signature is quite overloaded. I suggest just to allow to pass what exact content-type should be considered as valid. No match - throw an exception, nothing more. For BC, default value could be None until some release when we change it to 'application/json'. Thoughts?
aiohttp/web_request.py
Outdated
wrong_format_flag = False | ||
if isinstance(allowed_types, str): | ||
wrong_format_flag = allowed_types != self.content_type | ||
elif isinstance(allowed_types, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if allowed_types be a set or tuple or frosenset or whatever container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So will iterable make sense? I mean how about hasattr(alowed_types, '__iter__')
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be do like it was don't in ClientResponse.json()
? It's quite abstract from what kind of type was passed and the behaviour would be consistent between client and server.
aiohttp/web_request.py
Outdated
if if_check and self.content_type != 'application/json': | ||
if wrong_format is None: | ||
wrong_format_flag = False | ||
if isinstance(allowed_types, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I receive application/json
in legacy form when it will have a charset option defined explicitly like application/json;charset=utf-8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just checking the type, subtype and suffix? And should json be decoded according to the charset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest just supporting a container (list
, tuple
, whatever) only and raising explicit TypeError on single string.
By default it should be None
for backward compatibility. That's sad but we cannot change it.
Application might have a parameter for enabling checking json types in the application by default though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just checking the type, subtype and suffix?
That's correct, but will make things too much complicated.
And should json be decoded according to the charset?
In modern RFC's it's mandatory should be UTF-8. However, there are lot of legacy around which forces charset via params. I don't thing we should care about non-UTF8 JSON, but if client asks about it explicitly, there is no reason to deny him, imho.
I suggest just supporting a container (list, tuple, whatever) only and raising explicit TypeError on single string.
It's good idea, but it conflicts with how client's json()
acts. I don't think we should have different behaviour of a similar things - that would annoy people all around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think we can use either Container
or str
, and None
is for wildcard (or should I use a '*'
as in add_route
, since allowed_types=None
looks weird for wildcard); otherwise a TypeError
is raised
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like global switches. You have argument and it's default. None
disables the check. For awhile it will be default to preserve BC. After some release it becomes application/json
in defaults and may be we could prohibit None
usage, but that's another story.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I suggest a function(mapping a string to boolean) be passed as an argument to filter the Content-Type
and str
, None
are also accepted and nothing more is supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend to avoid callback style. This is the road to make aiohttp acts weird depending on what functions user passed around. There is pretty simple rules to find out if MIME type is json-like. In the rest cases user could override that behavior by explicit string value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that means the user won't want to behavior weirdly? Then I just need to check whether the type
is str
or tuple
or list
or set
and compare them directly, or that it is something that means json-like or wildcard. Also I need to derive the charset from Content-type
and pass it to loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why he should? .json()
provides stable interface and guarantee about what it does. User may customize default expected content-type via string argument. Why there is need to do more?
If user need to do more complex content type check, it should make it at middleware level, not in .json()
call.
aiohttp/web_request.py
Outdated
raise HTTPNotAcceptable() | ||
elif callable(wrong_format): | ||
yield from wrong_format() | ||
elif callable(err_handler): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You expect callable, but below err_handler is a coroutine actually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can only use module inspect
to decide whether an object is a coroutine. Should I decide whether it is a coroutine or not or even something else(says a string) and make calling correspondingly. I think more details should be given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to not pass exc_handler
at all. No argument - no problems (:
aiohttp/web_request.py
Outdated
elif callable(wrong_format): | ||
yield from wrong_format() | ||
elif callable(err_handler): | ||
yield from err_handler() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why to pass here err_handler after all? You can wrap req.json()
call with try/except and do the rest on user code side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to have reduced the use of try/except so one could get rid of the fact that if you want to handle the error yourself, you need the try/except block i.e. one doesn't need to raise the error, catch it and handle it since catch an error is costly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to worry about try/except usage. If you do it, you explicitly sign about that you're aware about that things here could go wrong. Some custom argument that does the same makes it not so oblivious.
As about speed, you shouldn't worry about catch performance. I bet, your app business logic will be million times slower than that.
aiohttp/web_request.py
Outdated
then the request is rejected. | ||
You can also handle it manually by passing the | ||
`wrong_format` parameter. | ||
You can also handle it manually by passing the `err_handler` parameter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is need to note what will happens if content-type won't match.
aiohttp/web_request.py
Outdated
""" | ||
wrong_format_flag = False | ||
if isinstance(allowed_types, str): | ||
wrong_format_flag = allowed_types != self.content_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The self.content_type
may be application/json; charset=UTF-8
. Am I right ? So, use startswith
or handle that in a different way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. So how do you think of these codes:
content_type = self.content_type
(mime_type, subtype, suffix, parameters) = parse_mimetype(content_type)
content_type = mime_type + '/' + subtype
if suffix != '':
content_type += '+' + suffix
charset = parameters.get('charset', None)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems OK, I want to review this part in context of whole path.
aiohttp/web_request.py
Outdated
wrong_format_flag = False | ||
if isinstance(allowed_types, str): | ||
wrong_format_flag = allowed_types != self.content_type | ||
elif isinstance(allowed_types, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about tuple and set ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe all Containers should be supported, says
if isinstance(allowed_types, str):
wrong_format_flag = allowed_types != self.content_type
elif isinstance(allowed_types, collections.Container):
wrong_format_flag = self.content_type not in allowed_types
elif allowed_types is None: # do not check the content_type
wrong_format_flag = False
else: # for other type
raise TypeError("Unsupported container of `Content-Type`")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1
for any containers, though. Even list
.
aiohttp/web_request.py
Outdated
@@ -510,8 +510,26 @@ def text(self): | |||
return bytes_body.decode(encoding) | |||
|
|||
@asyncio.coroutine | |||
def json(self, *, loads=json.loads): | |||
"""Return BODY as JSON.""" | |||
def json(self, *, loads=json.loads, allowed_types=None, err_handler=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for allowed_types='application/json'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be breaking change. See previous discussions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we directly check if the mime type starts with 'application/'
and ends with '+json'
i.e. flask's way to deal with it? Hence no need to consider about how should allowed_type behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we stay with string only argument as @socketpair suggested, that's would be friendly default behaviour with almost no cost.
aiohttp/web_request.py
Outdated
if err_handler is None: | ||
raise HTTPNotAcceptable() | ||
elif callable(err_handler): | ||
yield from err_handler() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please do not check if it is callable. Instead, check return value of call. if it is awaitable -- await it. (see signal handlers in aiohttp)
- In current code, with
err_handler=42
-- error will be just ignored, which is a bug. - errors must not be handled as callbacks or so. There are exceptions for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed above, there's no need to give callbacks. Instead, only the 406 error is raised.
3b16898
to
5ae894d
Compare
aiohttp/web_request.py
Outdated
charset = parameters.get('charset', None) | ||
|
||
if isinstance(allowed_types, str): | ||
if allowed_types == 'json': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useless. Users must supply full contenttype like xxxx/yyyyy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means 'json-like'. View my code after that. I will then check whether the Content-Type
is 'application/json' or starts with 'application' and ends with '+json'. If it is a string and is not 'json', I will compare it with Content-Type
directly. This may be convenient i.e. one won't need to always specify the argument for different json format
aiohttp/web_request.py
Outdated
raise HTTPNotAcceptable() | ||
elif allowed_types != content_type: | ||
raise HTTPNotAcceptable() | ||
elif isinstance(allowed_types, set): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-2 for containers. Why they are intended for ? Which user-story (case) do you suggest ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we discussed what if one sends 'application/json; charset=utf-8' and 'application/json' is hard-coded and now I strip the charset. My concern is that will there be something else the frontend will send fail to pass the examination, says the handler need to handle both application/json
and application/javascript
or some uncontrollable i.e. for compatibility of an old frontend framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep code of aiohttp simple and clear. Simple thing should be done in a simple way, and difficult ones should be possible.
If an specific application needs to handle such very obscure situation, it may control content-type by it's own before decoding of json. And also, I think, situation you have provided is just an imagination, not the real one. If no, please describe deeper.
aiohttp/web_request.py
Outdated
body = yield from self.text() | ||
return loads(body) | ||
return loads(body, encoding=charset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other arguments have the same meaning as in load(), except encoding which is ignored and deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Charset should go as additional argument of the function. And should be passed to .text() if defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the frontend sends a request specifying charset? And I think a json object should be parsed with loads, with encoding supported by python itself. Should .text()
decode something like "\\u4e16\\u754c\\u4f60\\u597d"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PLEASE look at the aiohttp sources before asking and before changing the code. .text()
will analyze conten-type header (regarding charset) before converting bytes to text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON-decoding is two-step process. First, it needs to convert from bytes to characters (symbols). Thi is done by UTF-8 (typically) decoder in .text()
. Second-step -- is to decode slash-escapes according to JSON-format. this is done using loads()
. So, encoding you are talking about is for first step only. As you can see, letter u
after every slash signifies that every item is UNICODE symbol (as opposed to some byte value). And so encoding is not applicable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I did a search and found it so I removed these codes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please remove containers
- Make Support of either None or full content type
- Remove encoding from loads
aiohttp/web_request.py
Outdated
""" Return BODY as JSON. If the `Content-Type` is not in allowed_types, | ||
then the request is rejected. | ||
""" | ||
content_type = self.content_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not required to parse self.content_type
if allowed_type
is None
aiohttp/web_request.py
Outdated
if suffix != '': | ||
content_type += '+' + suffix | ||
|
||
if isinstance(allowed_type, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much simplier:
if allowed_type is not None:
parse content type
if parsed_ct != allowed_type:
raise ...
without any else
aiohttp/web_request.py
Outdated
def json(self, *, loads=json.loads): | ||
"""Return BODY as JSON.""" | ||
def json(self, *, loads=json.loads, allowed_type=None): | ||
""" Return BODY as JSON. If the `Content-Type` is not in allowed_types, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is not in
is wrong now.
allowed_type, then the request is rejected. | ||
""" | ||
|
||
if allowed_type is None: # do not check the content_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for replacing this chain with only one condition like:
if allowed_type is not None:
...
content_type += '+' + suffix | ||
|
||
if allowed_type != content_type: | ||
raise HTTPNotAcceptable() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, what do you guys think now? Should the message be passed as an argument? I think the error should be raised directly since that's exactly what I want for a JSON API in most cases and I think we've reached the agreement that for special requirements one should write a middleware or catch the exception as we discussed before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be wrong with just
def json(self, *, loads=json.loads, allowed_type='application/json'):
if allowed_type and self.content_type != allow_type:
raise HTTPNotAcceptable(text='invalid content-type: "{}" not "{}"'.format(self.content_type, allowed_type))
...
?
If users really want more sophisticated behaviour than this they can add their own logic, there's nothing very complicated here that couldn't be replicated in a custom coroutine.
also is from wikipedia:
That doesn't sound quite right. |
Fixed by another PR #3889 |
What do these changes do?
Check the content type before returning the json value.
Are there changes in behavior for the user?
When getting json requests one could directly
await request.json()
.If one'd like to ignore the
Content-Type
, he/she should specifyif_check
manually i.e.await request.json(if_check=False)
.Related issue number
referencing #2174
Checklist
CONTRIBUTORS.txt
changes
folder2174.bugfix
for example (588.bug)issue_id
change it to the pr id after creating the pr.feature
: Signifying a new feature..bugfix
: Signifying a bug fix..doc
: Signifying a documentation improvement..removal
: Signifying a deprecation or removal of public API..misc
: A ticket has been closed, but it is not of interest to users.