query strings are not checked for invalid code points #11

Closed
jiahao opened this issue Jun 30, 2014 · 2 comments
Comments

@jiahao
Contributor

jiahao commented Jun 30, 2014

julia> get("http://nominatim.openstreetmap.org/search?format=json&q=México D.F.")
assertion failed: c < 0x80
while loading In[63], in expression starting on line 1
 in is_url_char at /Users/jiahao/.julia/v0.3/URIParser/src/parser.jl:1
 in parse_url at /Users/jiahao/.julia/v0.3/URIParser/src/parser.jl:267
 in get at /Users/jiahao/.julia/v0.3/Requests/src/Requests.jl:575

julia> response = get("http://nominatim.openstreetmap.org/search",
    query={"format"=>"json", "q"=>"México D.F."})
Response(400 Bad Request, 17 Headers, 393 Bytes in Body)

julia> response.data
"<html><body><h1>Bad Request</h1><p>Nominatim has encountered an error with your request.</p><p><b>Details:</b> Illegal query string (not an UTF-8 string): M跩co D.F.</p><p>If you feel this error is incorrect feel free to report the bug in the <a href=\"http://trac.openstreetmap.org\">OSM bug database</a>. Please include the error message above and the URL you used.</p>\n</body></html>\n\r\n0\r\n\r\n"

Possibly related to #9?
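
The `c < 0x80` assertion in the first trace suggests the URL parser accepts only ASCII input, so an unescaped `é` trips it before any request is sent. As a minimal sketch, assuming current Julia (1.x) rather than the 0.3 release used above, a caller could locate the offending character before handing the URL to the parser. `first_non_ascii` is a hypothetical helper for illustration and is not part of URIParser or Requests.

```julia
# Hypothetical helper (not part of URIParser.jl or Requests.jl):
# return (index, char) for the first non-ASCII character, or `nothing`.
function first_non_ascii(url::AbstractString)
    for (i, c) in pairs(url)
        isascii(c) || return (i, c)
    end
    return nothing
end

url = "http://nominatim.openstreetmap.org/search?format=json&q=México D.F."
hit = first_non_ascii(url)
if hit !== nothing
    i, c = hit
    println("non-ASCII character ", repr(c), " at string index ", i)
end
```

A check like this only reports the problem; the character would still need to be percent-encoded before the URL is valid.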

@jiahao
Contributor Author

jiahao commented Jul 1, 2014

The Requests.get() method does actually attempt to escape non-ASCII characters with percent encoding. In fact, by the time Requests.open_stream() gets called, the HTTP stream data looks like this:

(stream,render(req)) => (TcpSocket(open, 0 bytes waiting),"GET /search?format=json&q=M%e9xico%20D.F. HTTP/1.1\r\nHost: nominatim.openstreetmaps.org\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Requests.jl/0.0.0\r\n\r\n")

It looks like somewhere in the bowels of TcpSocket the escaped sequence %e9xi is somehow being mangled into '跩' == 0x8de9 (which, most ironically, means "to swagger").
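
One detail of the rendered request is worth checking alongside this: `%e9` is the code point of `é` (U+00E9) written as a single byte, whereas the UTF-8 encoding of `é` is the two bytes 0xC3 0xA9, and the server's complaint is specifically that the query is not a UTF-8 string. The sketch below, assuming Julia 1.x, percent-encodes the UTF-8 bytes of a string and shows that the byte sequence implied by `M%e9xi` is not valid UTF-8. `percent_encode_utf8` is a hypothetical name and is not the escaping routine Requests actually uses; the sketch also does not establish where the '跩' comes from.

```julia
# Hypothetical sketch (not Requests.jl's escaping code): percent-encode
# every byte of the UTF-8 representation except the unreserved ASCII
# characters A-Z, a-z, 0-9, '-', '.', '_' and '~'.
function percent_encode_utf8(s::AbstractString)
    io = IOBuffer()
    for b in codeunits(String(s))
        c = Char(b)
        if b < 0x80 && (isletter(c) || isdigit(c) || c in "-._~")
            write(io, b)  # unreserved ASCII byte, copied through unchanged
        else
            print(io, '%', uppercase(string(b, base = 16, pad = 2)))
        end
    end
    return String(take!(io))
end

println(percent_encode_utf8("México D.F."))  # M%C3%A9xico%20D.F.

# Bytes implied by "M%e9xi": 0xE9 followed by ASCII 'x' is not valid UTF-8.
latin1_bytes = UInt8[0x4d, 0xe9, 0x78, 0x69]
println(isvalid(String(latin1_bytes)))       # false
```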

@malmaud
Contributor

malmaud commented Oct 4, 2015

This was fixed at some point.

@malmaud malmaud closed this as completed Oct 4, 2015