Incorrect transform unicode url #45

Closed
tynopet opened this issue Jun 27, 2017 · 4 comments

@tynopet

tynopet commented Jun 27, 2017

Hello, I am trying to create a link that contains Unicode characters:
https://myabstractwiki.ru/index.php/Заглавная_страница.
When I copy this link from the browser's address bar and paste it into a Textile link, it is transformed to
https://myabstractwiki.ru/index.php/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0.
But the link in the rendered HTML is incorrect:
https://myabstractwiki.ru/index.php/%C3%90%C2%97%C3%90%C2%B0%C3%90%C2%B3%C3%90%C2%BB%C3%90%C2%B0%C3%90%C2%B2%C3%90%C2%BD%C3%90%C2%B0%C3%91%C2%8F_%C3%91%C2%81%C3%91%C2%82%C3%91%C2%80%C3%90%C2%B0%C3%90%C2%BD%C3%90%C2%B8%C3%91%C2%86%C3%90%C2%B0.
I suspect that this is caused by the path being UTF-8 encoded twice (see lines 945-948):

path = '/'.join(  # could be encoded slashes!
    quote(unquote(pce).encode('utf8'), b'')
    for pce in parsed.path.split('/')
)
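
The suspected double encoding can be reproduced outside of textile. The sketch below uses Python 3's `urllib.parse` (the report in this thread ran Python 2.7) and assumes that decoding the escaped bytes as Latin-1 mimics what Python 2's `unquote` did when given a unicode string. The variable names are illustrative only and do not come from the library.

```python
from urllib.parse import quote, unquote

encoded = "%D0%97%D0%B0"  # percent-encoded UTF-8 bytes for the first two letters, "За"

# Assumption: Latin-1 decoding turns every escaped byte into one code point,
# which is how the wrong intermediate string is thought to arise.
mis_decoded = unquote(encoded, encoding="latin-1")

# Encoding those code points as UTF-8 again doubles every byte.
double_encoded = quote(mis_decoded.encode("utf8"), safe="")
print(double_encoded)  # -> %C3%90%C2%97%C3%90%C2%B0

# Decoding the escapes as UTF-8 (the Python 3 default) round-trips cleanly.
round_trip = quote(unquote(encoded).encode("utf8"), safe="")
print(round_trip)  # -> %D0%97%D0%B0
```

The first printed value matches the start of the broken path reported above, which supports the double-encoding theory without proving where in the library it happens.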

How can this problem be fixed?
Thanks.

@ikirudennis
Member

Before I start digging into this, I'd like to confirm what the test is. Does the following generate the bug for you?

"test":https://myabstractwiki.ru/index.php/Заглавная_страница

txstyle.org turns that into <p><a href="https://myabstractwiki.ru/index.php/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0">test</a></p>. And so far that's what I'm getting with the current version of textile. Are you using the latest version?

@tynopet
Author

tynopet commented Jun 28, 2017

Yes, I am using the latest version of textile. The parser on txstyle.org works correctly, but the Python library does not. For example, here is the REPL output:

dmitry@dmitry:~$ python
Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import textile
>>> str = '"test":https://myabstractwiki.ru/index.php/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0'
>>> print textile.textile(str)
	<p><a href="https://myabstractwiki.ru/index.php/%C3%90%C2%97%C3%90%C2%B0%C3%90%C2%B3%C3%90%C2%BB%C3%90%C2%B0%C3%90%C2%B2%C3%90%C2%BD%C3%90%C2%B0%C3%91%C2%8F_%C3%91%C2%81%C3%91%C2%82%C3%91%C2%80%C3%90%C2%B0%C3%90%C2%BD%C3%90%C2%B8%C3%91%C2%86%C3%90%C2%B0">test</a></p>

As you can see, the value of the href attribute is different.

I suspect that txstyle.org uses the PHP parser, which works correctly.
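
The behaviour expected here, where a raw Unicode path and an already percent-encoded path end up as the same href, can be sketched as follows. This is a minimal Python 3 illustration, not the library's actual code or the fix that was later committed; `normalize_url_path` is a hypothetical helper name.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def normalize_url_path(url):
    """Hypothetical helper: percent-encode the path of `url` exactly once."""
    parts = urlsplit(url)
    path = "/".join(
        # unquote() decodes any existing escapes as UTF-8 text, and quote()
        # encodes that text back to UTF-8 escapes. safe="" means a slash that
        # was escaped inside a segment (%2F) stays escaped.
        quote(unquote(segment), safe="")
        for segment in parts.path.split("/")
    )
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


raw = "https://myabstractwiki.ru/index.php/Заглавная_страница"
once = normalize_url_path(raw)
twice = normalize_url_path(once)
print(once == twice)  # -> True
```

Because the helper decodes before it encodes, running it on its own output changes nothing, which is the property the double-encoded href violates.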

ikirudennis added a commit that referenced this issue Jul 4, 2017
@ikirudennis
Member

This is fixed for now. I don't know when I'll push out a new update, but if you need an urgent fix for this, you can install a version with the fix:

pip install git+https://github.com/textile/python-textile.git@82b15458faa1efa7d2f8fce16347ad01299199c1#egg=textile

@tynopet
Author

tynopet commented Jul 4, 2017

Thank you very much!
