Encode invalid URLs (unicode) #138

Closed
makew0rld opened this issue Dec 7, 2020 · 6 comments
Labels: bug, enhancement


@makew0rld
Owner

If the user provides a URL like gemini://example.com/蛸 or gemini://example.com/test?蛸, then Amfora should detect that the URL is invalid, and encode it for them. The new encoded URL should be used everywhere, including in the bottom bar.
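
A minimal sketch of the encoding step, assuming a byte-level approach. `encodeInvalid` is a hypothetical helper, not Amfora's actual code. It only handles non-ASCII bytes, spaces, and control characters, and it treats the whole string uniformly, so a non-ASCII hostname (which would need IDNA/punycode rather than percent-encoding) is out of scope here.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeInvalid is a hypothetical helper for illustration only.
// It percent-encodes bytes that cannot appear literally in a URL:
// control characters and space (<= 0x20), DEL, and non-ASCII bytes (>= 0x7F).
// Everything else, including existing %XX escapes, passes through unchanged.
func encodeInvalid(rawURL string) string {
	var b strings.Builder
	for i := 0; i < len(rawURL); i++ {
		c := rawURL[i]
		if c <= 0x20 || c >= 0x7F {
			fmt.Fprintf(&b, "%%%02X", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	// 蛸 is U+86F8, which is E8 9B B8 in UTF-8.
	fmt.Println(encodeInvalid("gemini://example.com/蛸"))      // gemini://example.com/%E8%9B%B8
	fmt.Println(encodeInvalid("gemini://example.com/test?蛸")) // gemini://example.com/test?%E8%9B%B8
}
```

Because the result is plain ASCII, the same string can be shown in the bottom bar and sent to the server.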

@makew0rld makew0rld added the enhancement New feature or request label Dec 7, 2020
@makew0rld makew0rld added this to the v1.8.0 milestone Dec 7, 2020
@makew0rld makew0rld added the bug Something isn't working label Dec 7, 2020
@makew0rld
Owner Author

Currently, Amfora will actually send invalid URLs like the ones mentioned above to the server. This is invalid/buggy behaviour.

@makew0rld makew0rld modified the milestones: v1.8.0, v1.7.0 Dec 7, 2020
@makew0rld
Owner Author

Amfora will also accept URLs like gemini://gemini.circumlunar.space/%64%6f%63%73/%66%61%71%2e%67%6d%69. That's perfectly valid, but it would be nice if it were decoded before being shown to the user and sent to the server, since every escaped character is plain ASCII. Is there an easy way to detect and do this?
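
One possible way to do this, sketched under the assumption that only RFC 3986 "unreserved" characters (letters, digits, `-`, `.`, `_`, `~`) are safe to decode; escapes for delimiters such as `%2F` are left alone because decoding them would change the URL's meaning. `decodeUnreserved` and its helpers are hypothetical names, not part of Amfora.

```go
package main

import (
	"fmt"
	"strings"
)

// isUnreserved reports whether c is an RFC 3986 unreserved character.
func isUnreserved(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	}
	return c == '-' || c == '.' || c == '_' || c == '~'
}

// unhex converts one hex digit to its value.
func unhex(c byte) (byte, bool) {
	switch {
	case c >= '0' && c <= '9':
		return c - '0', true
	case c >= 'a' && c <= 'f':
		return c - 'a' + 10, true
	case c >= 'A' && c <= 'F':
		return c - 'A' + 10, true
	}
	return 0, false
}

// decodeUnreserved (hypothetical) replaces %XX escapes with the literal
// character only when that character is unreserved. Other escapes and
// malformed sequences are copied through untouched.
func decodeUnreserved(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '%' && i+2 < len(s) {
			hi, ok1 := unhex(s[i+1])
			lo, ok2 := unhex(s[i+2])
			if c := hi<<4 | lo; ok1 && ok2 && isUnreserved(c) {
				b.WriteByte(c)
				i += 2 // skip the two hex digits
				continue
			}
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	fmt.Println(decodeUnreserved("gemini://gemini.circumlunar.space/%64%6f%63%73/%66%61%71%2e%67%6d%69"))
	// gemini://gemini.circumlunar.space/docs/faq.gmi
}
```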

@makew0rld
Owner Author

makew0rld commented Dec 8, 2020

Another thing to consider is that currently, when a space is typed in the bottom bar, Amfora interprets the input as a search query instead of a URL. Once invalid URLs are converted automatically, Amfora will need extra logic to tell a search apart from an unencoded URL that contains a space.

Since spaces in URLs are relatively rare, maybe input containing a space should only be interpreted as a URL in these cases:

  • When it starts with gemini:// or //
  • When it contains a forward slash and a period, where the period has no spaces anywhere before it and no space directly after it
    • Valid: example.com/path with spaces/, example.com/path?query string, example.com/ path/
    • Invalid: foo bar, and/or, test. testing
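
The rules above could be sketched roughly like this. `looksLikeURL` is a hypothetical name and this is only one reading of the proposal, not the logic that eventually shipped in Amfora. It is meant to be called on input that contains a space.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeURL (hypothetical) decides whether input containing a space
// should be treated as a URL rather than a search query.
func looksLikeURL(input string) bool {
	// Rule 1: an explicit scheme or scheme-relative prefix.
	if strings.HasPrefix(input, "gemini://") || strings.HasPrefix(input, "//") {
		return true
	}
	// Rule 2 needs a forward slash somewhere.
	if !strings.Contains(input, "/") {
		return false
	}
	// Only periods before the first space have no spaces before them.
	head := input
	if sp := strings.IndexByte(input, ' '); sp >= 0 {
		head = input[:sp]
	}
	for i := 0; i < len(head); i++ {
		if head[i] != '.' {
			continue
		}
		// The period must not be directly followed by a space.
		if i+1 >= len(input) || input[i+1] != ' ' {
			return true
		}
	}
	return false
}

func main() {
	inputs := []string{
		"example.com/path with spaces/",
		"example.com/path?query string",
		"example.com/ path/",
		"foo bar",
		"and/or",
		"test. testing",
	}
	for _, in := range inputs {
		fmt.Printf("%q -> URL? %v\n", in, looksLikeURL(in))
	}
}
```

The first three inputs are classified as URLs and the last three as searches, matching the valid and invalid examples in the list.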

@makew0rld
Owner Author

makew0rld commented Dec 16, 2020

NFC normalization should also happen, before anything else. See this comment for details on implementing that.

@makew0rld
Owner Author

New search logic still needs to be implemented.

@makew0rld makew0rld reopened this Dec 20, 2020
@makew0rld
Owner Author

That new logic was added in a0ae0ca.

ThomasAdam added a commit to ThomasAdam/amfora that referenced this issue Feb 6, 2021
When amfora is given a string to search, if that string contains a
valid protocol (gemini://), and that string contains trailing
whitespace, then the string is treated as a search term.

Although perhaps slightly more uncommon, if the input string was as a
result of copy/paste then it's possible the string could contain
trailing spaces, which is not what was intended, but rather should be
removed so that it's treated either as a valid gemini:// link or a
search term.

Some efforts around this appeared in makew0rld#138
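
A rough illustration of the problem this commit message describes, using a deliberately simplified rule ("any whitespace means search") as a stand-in for Amfora's real detection. Both `isSearch` and `classify` are hypothetical names, not taken from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// isSearch is a simplified stand-in: any whitespace marks a search query.
func isSearch(input string) bool {
	return strings.ContainsAny(input, " \t\n")
}

// classify (hypothetical) trims surrounding whitespace first, so a pasted
// URL with a trailing space is no longer mistaken for a search term.
func classify(input string) string {
	input = strings.TrimSpace(input)
	if isSearch(input) {
		return "search"
	}
	return "url"
}

func main() {
	pasted := "gemini://example.com/ " // trailing space from copy/paste

	fmt.Println(isSearch(pasted)) // true: untrimmed input looks like a search
	fmt.Println(classify(pasted)) // url
}
```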
ThomasAdam added a commit to ThomasAdam/amfora that referenced this issue Feb 8, 2021
makew0rld pushed a commit that referenced this issue Feb 8, 2021