Adding dates as strings to url introduces extra characters #128

Open
UGuntupalli opened this issue Jul 2, 2020 · 4 comments

@UGuntupalli

Hello,
Thanks for the great work that went into putting this package together. I just came across it and have started using it. I ran into an issue and was hoping you could quickly clarify whether I am missing something or whether it is a bug.
I am using Python 3.7.5 and furl 2.1.0 and getting this weird behavior. Can you kindly help?

   from furl import furl

   # Query parameters to append to the URL
   my_param_arg = dict()
   my_param_arg["start_date"] = '20200301T08:00-0000'

   my_url = furl("http://www.test.com/")
   my_url.add(my_param_arg).url

Expected Behavior:
'http://www.test.com/?start_date=20200301T08%3A00-0000&start_date=20200301T08

Actual Behavior:
'http://www.test.com/?start_date=20200301T08%3A00-0000&start_date=20200301T08%3A00-0000'

Can you please explain why the extra characters (%3A00-0000) are being added to the URL? I would appreciate it if you could offer a workaround as well.

@gruns
Owner

gruns commented Jul 2, 2020

Hey Uday!

I can't reproduce this issue with Python v3.7 and furl v2.1.0:

>>> from furl import furl
>>> d = dict()
>>> d["start_date"] = '20200301T08:00-0000'
>>> d
{'start_date': '20200301T08:00-0000'}

>>> f = furl("http://www.test.com/")
>>> f.add(d).url
'http://www.test.com/?start_date=20200301T08%3A00-0000'

%3A appears in the final URL because : needs to be URL encoded. %3A is the URL encoding of :, and the 00-0000 in %3A00-0000 is the tail portion of '20200301T08:00-0000' after the :.
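To see the encoding in isolation, here is a minimal sketch that uses Python's standard library rather than furl itself. The variable names are illustrative only; the point is that percent-encoding the date string and decoding it again gives back the original value.

```python
from urllib.parse import quote, unquote

date = "20200301T08:00-0000"

# safe="" asks quote() to percent-encode every reserved character,
# including ":", which becomes "%3A".
encoded = quote(date, safe="")
print(encoded)  # 20200301T08%3A00-0000

# Decoding restores the original string, so no characters were added.
print(unquote(encoded) == date)  # True
```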

@UGuntupalli
Author

@gruns,
Just for my understanding, why should it be encoded? Additionally, is there a way to bypass that behavior? The reason I ask is that if I try the exact same thing using the requests package, it builds the URL as expected without encoding the : in the date.

@gruns
Owner

gruns commented Jul 3, 2020

In short, : can be optionally encoded in the URL Query. From RFC 3986's grammar:

query       = *( pchar / "/" / "?" )
pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"

furl encodes : in the query. requests doesn't. For full details, see https://en.wikipedia.org/wiki/Percent-encoding and https://tools.ietf.org/html/rfc3986.
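As a rough illustration of why the encoding is optional, the sketch below parses both forms of the URL with the standard library. The helper name query_params is hypothetical. A parser that follows the RFC should decode both to the same parameter value, although a particular server may not behave this way.

```python
from urllib.parse import parse_qs, urlsplit

encoded_url = "http://www.test.com/?start_date=20200301T08%3A00-0000"
literal_url = "http://www.test.com/?start_date=20200301T08:00-0000"

def query_params(url):
    """Return the decoded query parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)

# Both spellings decode to the same parameter value.
print(query_params(encoded_url))
print(query_params(literal_url))
print(query_params(encoded_url) == query_params(literal_url))
```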

Is it important for your use case that : remain unencoded in the URL? If so, why?

@UGuntupalli
Author

@gruns ,
Yes, I am trying to build a scraper for the CAISO API (http://www.caiso.com/Documents/OASIS-InterfaceSpecification_v5_1_7Clean_Independent2019Release.pdf). The API URLs do not expect ":" to be encoded, hence my ask. I found furl to be very clean and wanted to use it for my application. So, if I understand correctly, I can't do this with furl. Is that fair?
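For reference, one possible way to keep ":" unencoded is sketched below with Python's standard library. It does not use furl and is not a statement about what furl supports. The base URL and parameter are the test values from this thread, not real CAISO endpoints.

```python
from urllib.parse import urlencode

base_url = "http://www.test.com/"
params = {"start_date": "20200301T08:00-0000"}

# safe=":" tells urlencode() to leave ":" as-is instead of emitting "%3A".
query = urlencode(params, safe=":")
url = base_url + "?" + query
print(url)  # http://www.test.com/?start_date=20200301T08:00-0000
```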
