Not encoding UTF-8 correctly #66
Right, I will check and try to fix this. |
Not only encoded utf-8 is returned. Garbled code is returned using Reproduce:
Expected:
Actual:
and also with
Result:
while:
Returns encoded utf-8:
|
But it is fine using Code:
Result:
but Code:
Result:
I really like this library. It is powerful and provides lots of methods for handling different kinds of DOM. Unfortunately, because of this encoding issue I cannot use it: all my HTML documents and HTML fragments are full of Chinese text. I hope it can be fixed soon. Thanks for your contribution. |
Give me one or two weekends please :-) I will try to add the issues as unit tests to make them reproducible. |
I just pushed FluentDOM/FluentDOM@370e98f. This is not the final fix, but it should improve the behavior. |
I want to try your new push, but I cannot use the
when composer update:
I can install the new push without the |
Anyway, I have downloaded the latest clone and replaced the whole folder manually. I can test now. |
Hi, Any updates? |
I added some RegEx to the HTML loader to fetch the encoding from the meta tags - default is UTF-8 now. Additionally I did a lot of rework on the load/save process from HTML - testing it with Chinese characters. The changes are pushed to the 6.1 branch and I added the version to the CSS Selector package, so a composer install allowing the dev versions should work now. If you could test it out and send me (small) examples that do not work as expected, I would appreciate it. |
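For readers following along, here is a minimal sketch of what "fetch the encoding from the meta tags, default UTF-8" could look like. This is not FluentDOM's actual loader code: the function name `detectHtmlCharset` and the pattern are assumptions made for illustration only.

```php
<?php
// Hypothetical helper (not part of FluentDOM): look for a charset declared
// in a <meta> tag and fall back to a default when none is found.
function detectHtmlCharset(string $html, string $default = 'UTF-8'): string
{
    // Covers both <meta charset="..."> and
    // <meta http-equiv="Content-Type" content="text/html; charset=...">
    $pattern = '~<meta\s[^>]*?charset\s*=\s*["\']?\s*([a-z0-9_:.-]+)~i';
    if (preg_match($pattern, $html, $match)) {
        return strtoupper($match[1]);
    }
    return $default;
}

echo detectHtmlCharset('<meta charset="gb2312"><p>text</p>'), "\n";
echo detectHtmlCharset('<p>no meta tag here</p>'), "\n";
```

A regular expression like this only inspects the raw markup, so it can be fooled by a `<meta>` tag inside a comment or script. It is meant as a rough first guess before parsing, not a full implementation of HTML encoding sniffing.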
@ThomasWeinert Hi, could you give me a sample of
Result:
and, I also tried.
Result:
and, I also tried.
Result:
|
Okay, I can use it now with the following
|
Code to reproduce
Expected Result:
Actual Result:
|
Code to reproduce:
Expected Result:
Actual Result:
|
any news on this? |
I just pushed a fix for html-fragments: FluentDOM/FluentDOM@9b0daec |
@ThomasWeinert Thanks for your updates. Code to reproduce
Expected Result:
Actual Result:
|
That uses a different method of the HTML loader ( |
Thanks. It works GREAT!
Code to reproduce:
$doc = FluentDOM('<div></div>', 'html-fragment');
$doc->html(FluentDOM('hihi', 'html-fragment'));
echo $doc;
Expected Result:
Actual Result:
|
I will move the formatting problem to a new issue |
I am a Chinese developer building Chinese-language websites.
Code to reproduce:
Result:
Expected Result:
This is a known issue with PHP's DOMDocument. Here is the reference:
http://stackoverflow.com/questions/8218230/php-domdocument-loadhtml-not-encoding-utf-8-correctly
We should get the UTF-8 result instead of the HTML-ENTITIES result. It makes no sense for the final HTML to be full of entity-encoded characters, which also makes the output much larger.
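To illustrate the underlying DOMDocument behavior independently of FluentDOM, here is a minimal sketch of one workaround discussed in the linked Stack Overflow question: prefixing the markup with an XML encoding hint so that `DOMDocument::loadHTML()` reads the bytes as UTF-8. The helper name `loadUtf8Html` and the sample text are assumptions for illustration. This is not necessarily how FluentDOM handles it.

```php
<?php
// Hypothetical helper: load an HTML string as UTF-8 with plain DOMDocument.
function loadUtf8Html(string $html): DOMDocument
{
    $document = new DOMDocument();
    // Collect libxml warnings instead of emitting them for loose markup.
    $previous = libxml_use_internal_errors(true);
    // The encoding hint tells the HTML parser to treat the input as UTF-8.
    $document->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $document;
}

$document = loadUtf8Html('<div>你好</div>');
$div = $document->getElementsByTagName('div')->item(0);

// Text read back from the DOM should match the original characters.
echo $div->textContent, "\n";
// Serializing a single node rather than the whole document.
echo $document->saveHTML($div), "\n";
```

Passing a node to `saveHTML()` serializes just that element, which is a common way to get a fragment back without the doctype and wrapper elements that `loadHTML()` adds.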