Uri::toString does not encode special characters #22

sakiss · 2020-03-10T19:05:29Z

Steps to reproduce the issue

In any layout file add the following code:

<?php
$uri = Uri::getInstance('index.php?option=com_content&var=#//:?&@\')');?>
<a href="<?php echo $uri;?>">some link</a>

Then check the link in the browser.

Expected result

The link contains special/reserved characters, which are supposed to be percent-encoded.
Otherwise the URL is not functional.

Actual result

The reserved characters are not encoded.
Even if the characters are passed in already encoded, they are decoded when the URL is converted to a string.

mbabker · 2020-03-10T19:18:03Z

This sounds to me like a case where you would need to handle any required encoding after the URI class creates the string; I don't think this is something that you can address in a generally reusable abstraction class. Using the example within the HTML context, you're right that it would be best if special characters were encoded. But considering the URI class is also used in non-HTML contexts, assuming that the resulting URL should be encoded for one specific scenario isn't the best of ideas. That assumption could very well break the Framework's HTTP package, for example, which would lead to breakages in other systems.
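A minimal sketch of handling the HTML context at output time, as suggested above. The `$url` value is a hypothetical stand-in for whatever string the URI class returns, and `htmlspecialchars()` only covers HTML attribute escaping; it does not percent-encode anything.

```php
<?php
// Hypothetical stand-in for the string produced by the URI class.
$url = 'index.php?option=com_content&var=a&b="c"';

// Escape for the HTML attribute context only when printing, so the
// URI object itself stays usable in non-HTML contexts (HTTP clients, etc.).
$href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

echo '<a href="' . $href . '">some link</a>', PHP_EOL;
```

The escaping happens in the layout, not in the URI class, so other consumers of the same object still receive the unescaped string.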

So any change proposal needs to tread very carefully.

mbabker · 2020-03-10T19:19:09Z

#21 has a past attempt at dealing with encoding.

sakiss · 2020-03-10T19:31:49Z

Thanks for the feedback!

When the query contains reserved characters, shouldn't those be encoded regardless of the usage?
https://tools.ietf.org/html/rfc3986#section-2.2

mbabker · 2020-03-10T19:54:00Z

To be honest, I don't know the RFCs well enough in that regard to say what the absolute "right" behavior is. The way the RFC reads is that encoding is only mandatory if it doesn't conflict with a character's other use in that segment of a URI, so & only requires encoding if you're trying to pass that character inside the value of something in a query string, but doesn't seem to require encoding in the path or fragment segments (and I'll say I'm either completely over-simplifying this or am completely wrong here, but like I said, I'm clueless on what the "right" thing is).
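To illustrate the reading above, here is a small sketch that encodes reserved characters only where they appear inside a query value and leaves the `&` and `=` delimiters alone. It uses plain PHP functions rather than the URI class, and the parameter names are taken from the example in the report.

```php
<?php
// The problematic value from the report, as raw (unencoded) data.
$value = "#//:?&@')";

// rawurlencode() follows RFC 3986: everything except letters, digits
// and "-_.~" is percent-encoded.
$encoded = rawurlencode($value);

// http_build_query() encodes each key and value but keeps the "&" and "="
// that separate the pairs, so the query structure stays intact.
$query = http_build_query(
    ['option' => 'com_content', 'var' => $value],
    '',
    '&',
    PHP_QUERY_RFC3986
);

$url = 'index.php?' . $query;

echo $encoded, PHP_EOL; // %23%2F%2F%3A%3F%26%40%27%29
echo $url, PHP_EOL;     // index.php?option=com_content&var=%23%2F%2F%3A%3F%26%40%27%29
```

This only works when the values are still available separately; once they have been concatenated into one string, the delimiters and the data can no longer be told apart.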

Maybe bounce things off of how Laminas or Guzzle implement their PSR-7 URI classes, and see if there's something in those implementations that can set a guideline on how to change things here.

sakiss · 2020-03-11T12:43:25Z

> The way the RFC reads is that encoding is only mandatory if it doesn't conflict with a character's other use in that segment of a URI, so & only requires encoding if you're trying to pass that character inside the value of something in a query string.

This sounds reasonable and seems to be backed by RFC 3986.

> but doesn't seem to require encoding in the path or fragment segments (and I'll say I'm either completely over-simplifying this or am completely wrong here, but like I said, I'm clueless on what the "right" thing is)

I think that the limitations apply to every component (except the host, the port and the user), and this seems to be what happens in the Guzzle PSR-7 class and in Laminas as well.

Though I am not sure whether the encoding should apply to every non-ASCII character or just to the reserved ones, given that the W3C encourages the use of UTF-8 characters (IRIs) and most web browsers (see example) encode non-reserved characters automatically (I am not sure about other clients). Sites like Wikipedia have used IRIs for years.
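One possible approach, sketched here under assumptions: percent-encode only the bytes that are not allowed in a query component, which covers non-ASCII characters and spaces, while leaving reserved delimiters and existing `%XX` sequences untouched. The function name `encodeQueryComponent` is hypothetical, and the pattern is similar in spirit to what PSR-7 implementations do rather than a copy of any of them.

```php
<?php
/**
 * Hypothetical helper: encode only what is not valid in a query string.
 * Unreserved characters, sub-delims, ":", "@", "/", "?" and valid
 * percent-escapes are kept; everything else is percent-encoded.
 */
function encodeQueryComponent(string $query): string
{
    $pattern = '/(?:[^a-zA-Z0-9_\-\.~!\$&\'\(\)\*\+,;=%:@\/\?]++|%(?![A-Fa-f0-9]{2}))/';

    return preg_replace_callback(
        $pattern,
        static function (array $match): string {
            return rawurlencode($match[0]);
        },
        $query
    );
}

// "\xC3\xA9" is the UTF-8 byte sequence for "é".
echo encodeQueryComponent("q=caf\xC3\xA9&x=a b"), PHP_EOL; // q=caf%C3%A9&x=a%20b
echo encodeQueryComponent('a=%41&b=100%'), PHP_EOL;        // a=%41&b=100%25
```

Because the delimiters are left alone, this cannot fix a value that itself contains an unencoded `&` or `#`; those still have to be encoded before the string is assembled.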

sakiss · 2020-03-12T11:28:40Z

@mbabker If we set some specs and this has a chance of being merged in J4, I can contribute.

sakiss mentioned this issue Mar 10, 2020

Uri::toString does not encode special characters joomla/joomla-cms#28252

Closed

sakiss mentioned this issue Oct 29, 2020

[4.0] protected $_router instead of private joomla/joomla-cms#31251

Closed
