Review Accept.best_match and MIMEAccept._value_matches #20

mitsuhiko · 2011-01-07T01:00:13Z

There was a bug in there and the fix was the switching of two arguments. However it should be ensured that this was the only bug and that it now works as expected. Unfortunately it's slightly underdocumented there.

avalente · 2011-09-25T08:47:18Z

The Accept machinery is actually a little bit bugged:

The parser fails on media types like "text/html;level=2;q=0.4" (taken from the rfc rfc2616). I've written a simple working implementation (probably slow), the diff vs 0.7.1 follows
best_match is based only on the "quality" parameter, not the media type specificity, so sometimes it gives uncorrect results (see the following tests)

diff --git a/werkzeug/http.py b/werkzeug/http.py
index 32e0307..22bae4b 100644
--- a/werkzeug/http.py
+++ b/werkzeug/http.py
@@ -268,13 +268,25 @@ def parse_accept_header(value, cls=None):
         return cls(None)

     result = []
-    for match in _accept_re.finditer(value):
-        quality = match.group(2)
-        if not quality:
-            quality = 1
+    for item in value.split(","):
+        quality = 1.0
+        if ';' in item:
+            components = item.split(';')
+            mimetype = components[0].strip()
+            params = [x.strip().split("=") for x in components[1:]]
+            media_params = True
+            for key, val in params:
+                if key.strip() == "q":
+                    # end of media-range parameters
+                    media_params = False
+                    quality = max(min(float(val.strip()), 1), 0)
+
+                # accept-extension parameters are currently ignored
+                if media_params:
+                    mimetype += ";%s=%s" % (key, val)
         else:
-            quality = max(min(float(quality), 1), 0)
-        result.append((match.group(1), quality))
+            mimetype, params = item.strip(), []
+        result.append((mimetype, quality))
     return cls(result)

import werkzeug
from werkzeug.wrappers import BaseRequest, AcceptMixin
from werkzeug.test import create_environ

assert werkzeug.__version__.startswith("0.7"), werkzeug.__version__

class Request(BaseRequest, AcceptMixin):
    pass

def test1():
    # OK
    req = Request(create_environ(headers=[('accept', 'audio/*; q=0.2, audio/basic')]))
    assert req.accept_mimetypes.values() == ["audio/basic", "audio/*"]
    assert req.accept_mimetypes.best_match(['audio/xxx', 'audio/basic']) == "audio/basic"
    assert req.accept_mimetypes.best_match(['audio/basic', 'audio/xxx']) == "audio/basic"

def test2():
    req = Request(create_environ(headers=[('accept', 'audio/*, audio/basic')]))
    assert req.accept_mimetypes.values() == ["audio/basic", "audio/*"]  # OK

    assert req.accept_mimetypes.best_match(['audio/basic', 'audio/xxx']) == "audio/basic" # OK
    # audio/basic is more specific
    assert req.accept_mimetypes.best_match(['audio/xxx', 'audio/basic']) == "audio/basic" # FAIL

def test3():
    accept = "text/plain; q=0.5, text/html,text/x-dvi; q=0.8, text/x-c"
    req = Request(create_environ(headers=[('accept', accept)]))
    assert req.accept_mimetypes.best_match(['text/html', 'text/x-c']) == 'text/html' # OK

    # I think it should always choose the best match, which in this case is text/html that appears
    # first in the accept list. Probably it's just a personal taste.
    assert req.accept_mimetypes.best_match(['text/x-c', 'text/html']) == 'text/html' # FAIL

def test4():
    accept = "text/*, text/html, text/html;level=1, */*"
    req = Request(create_environ(headers=[('accept', accept)]))
    values = req.accept_mimetypes.values()
    assert values == ["text/html;level=1", "text/html", "text/*", "*/*"], values
    # text/html is more specific
    assert req.accept_mimetypes.best_match(['text/plain', 'text/html']) == 'text/html'

def test5():
    # very rare corner cases
    accept = "text/*;q=0.3, text/html;q=0.7, text/html;level=1,text/html;level=2;q=0.4, */*;q=0.5"
    req = Request(create_environ(headers=[('accept', accept)]))
    values = list(req.accept_mimetypes)

    assert req.accept_mimetypes.best_match(['image/jpeg', 'text/plain']) == "image/jpeg" # OK
    assert req.accept_mimetypes.best_match(['text/html', 'text/plain']) == "text/html" # OK
    # *.* has a quality of 0.5, while text/* 0.3, so image/jpeg should win
    assert req.accept_mimetypes.best_match(['text/plain', 'image/jpeg']) == "image/jpeg" # FAIL

anlutro · 2015-11-28T16:44:23Z

I'm not sure if this should be a separate issue or not, so I'll just post it here for now.

It seems to me that with a request that has the header Accept: */* (which is the default for curl, for example), best_match should return the default instead of the first element in the list.

davidism · 2019-09-07T18:37:34Z

#1507 and #1574 have improved how LanguageAccept and MIMEAccept matching works. Please open a new issue if there's something specific that's not working.

untitaker added the standards label Oct 25, 2014

davidism mentioned this issue Apr 25, 2019

Provide a fallback for LanguageAccept under Safari #1507

Merged

davidism added the save-for-sprint label May 30, 2019

davidism added this to the 1.0.0 milestone Jul 16, 2019

davidism closed this as completed Sep 7, 2019

github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review Accept.best_match and MIMEAccept._value_matches #20

Review Accept.best_match and MIMEAccept._value_matches #20

mitsuhiko commented Jan 7, 2011

avalente commented Sep 25, 2011

anlutro commented Nov 28, 2015

davidism commented Sep 7, 2019

Review Accept.best_match and MIMEAccept._value_matches #20

Review Accept.best_match and MIMEAccept._value_matches #20

Comments

mitsuhiko commented Jan 7, 2011

avalente commented Sep 25, 2011

anlutro commented Nov 28, 2015

davidism commented Sep 7, 2019