Parsing MediaInfo fails on Chinese chars in XML #92
Comments
@Fossil01 Thanks for reporting this issue. An old pull request consider removing
@mhor Nope, the same thing happens if I remove those 3 lines.
Thanks for your quick answer, so it's definitely related to the XML string returned by mediainfo.
I'll have a crack at it after Christmas. Cheers.
@Fossil01 Did you test the fix I made (PR #93)?
Completely forgot about this. It seems to work now.
Looks like I am still having this issue.
Maybe we can use a function like this to strip out invalid chars:
Aha. It looks like aca1198 never made it into the master branch and thus never into a release. When I add these lines it seems to fix the issue too:
$xmlString = preg_replace(
    '/[\x00-\x08\x0B\x0C\x0E-\x1F]|\xED[\xA0-\xBF].|\xEF\xBF[\xBE\xBF]/',
    "\xEF\xBF\xBD",
    $xmlString
);
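For anyone wanting to try the same workaround outside the vendor dir, here is a minimal sketch that wraps the regex quoted above in a function. The name `sanitizeXmlString` is hypothetical and not part of the library. As far as I can tell, the pattern only covers control bytes that XML 1.0 forbids, UTF-8-encoded surrogates, and U+FFFE/U+FFFF, replacing each with U+FFFD; it is not a general UTF-8 validator.

```php
<?php
// Hypothetical helper; only the regex itself comes from the thread above.
function sanitizeXmlString(string $xmlString): string
{
    // No /u modifier on purpose: the pattern is matched byte by byte.
    $clean = preg_replace(
        '/[\x00-\x08\x0B\x0C\x0E-\x1F]|\xED[\xA0-\xBF].|\xEF\xBF[\xBE\xBF]/',
        "\xEF\xBF\xBD", // UTF-8 bytes of U+FFFD (replacement character)
        $xmlString
    );

    // preg_replace() returns null on error; fall back to the untouched input.
    return $clean ?? $xmlString;
}

// Example input with a backspace byte (0x08), which XML 1.0 does not allow.
$raw = "<Copyright>bad\x08byte</Copyright>";
$clean = sanitizeXmlString($raw);
```

Tab, line feed and carriage return are left alone, since those are the only control characters XML 1.0 permits.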
XML it currently fails on:
<?xml version="1.0" encoding="UTF-8"?>
<Mediainfo version="20.03">
<File>
<track type="General">
<Count>331</Count>
<Count_of_stream_of_this_kind>1</Count_of_stream_of_this_kind>
<Kind_of_stream>General</Kind_of_stream>
<Kind_of_stream>General</Kind_of_stream>
<Stream_identifier>0</Stream_identifier>
<Complete_name>/mnt/ramdisk/1/15f4594a-c211-4acc-9f58-cae2b09c8151/160095_[� Kuro Ookami ] Pet Life [ISO DVD-RIP 1920x1080 x264 10bits AC-3] [69A9399D].mkv</Complete_name>
<Folder_name>/mnt/ramdisk/1/15f4594a-c211-4acc-9f58-cae2b09c8151</Folder_name>
<File_name_extension>160095_[� Kuro Ookami ] Pet Life [ISO DVD-RIP 1920x1080 x264 10bits AC-3] [69A9399D].mkv</File_name_extension>
<File_name>160095_[� Kuro Ookami ] Pet Life [ISO DVD-RIP 1920x1080 x264 10bits AC-3] [69A9399D]</File_name>
<File_extension>mkv</File_extension>
<File_size>1048394</File_size>
<File_size>1 024 KiB</File_size>
<File_size>1 024 KiB</File_size>
<File_size>1 024 KiB</File_size>
<File_size>1 024 KiB</File_size>
<File_size>1 023.8 KiB</File_size>
<Stream_size>1048394</Stream_size>
<Stream_size>1 024 KiB (100%)</Stream_size>
<Stream_size>1 024 KiB</Stream_size>
<Stream_size>1 024 KiB</Stream_size>
<Stream_size>1 024 KiB</Stream_size>
<Stream_size>1 023.8 KiB</Stream_size>
<Stream_size>1 024 KiB (100%)</Stream_size>
<Proportion_of_this_stream>1.00000</Proportion_of_this_stream>
<File_last_modification_date>UTC 2021-10-13 08:47:31</File_last_modification_date>
<File_last_modification_date__local_>2021-10-13 10:47:31</File_last_modification_date__local_>
</track>
</File>
</Mediainfo>
I'll have a look this week, thanks. In the meantime I manually edited the file in the vendor dir and added the preg_replace I pasted here earlier as an ugly temp fix :-)
Closed for now, due to inactivity.
In the following XML between the tags there are some Chinese chars. SimpleXML doesn't seem to like those and crashes the process.
<Copyright>�꤀ 刀漀渀 䠀愀爀爀椀猀</Copyright>
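To see what the parser is actually objecting to, something like the following sketch may help. It assumes the failure surfaces as a libxml parse error, and it uses a made-up input with a backspace byte rather than the exact MediaInfo output. `libxml_use_internal_errors()`, `libxml_get_errors()` and `simplexml_load_string()` are standard PHP functions; the variable names are only for illustration.

```php
<?php
// Collect libxml errors in a buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical input: 0x08 is not a legal character in XML 1.0.
$xmlString = "<Copyright>bad\x08byte</Copyright>";

$xml = simplexml_load_string($xmlString);
$errors = libxml_get_errors();
libxml_clear_errors();

if ($xml === false) {
    foreach ($errors as $error) {
        // Each entry is a LibXMLError with a message, line and column.
        fwrite(STDERR, sprintf(
            "line %d, column %d: %s\n",
            $error->line,
            $error->column,
            trim($error->message)
        ));
    }
}
```

Printing the line and column makes it easier to tell which tag in the MediaInfo output carries the offending bytes.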