You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
String record fields could hold unicode surogate escaped values. E.g. filenames from a filesystem that are not proper utf-8.
Python will surrogate escape these filenames to have them still be a proper Python string type while not losing any information on what the actual filename is.
We also want to preserve this information, so flow.record should deal with surrogate sequences in string fields. It currently converts Python strings to utf-8 encoded byte strings and v.v. when creating the binary record format (or other output formats), and it will throw a UnicodeDecodeError when encountering such strings or byte strings.
The text was updated successfully, but these errors were encountered:
String record fields could hold unicode surogate escaped values. E.g. filenames from a filesystem that are not proper utf-8.
Python will surrogate escape these filenames to have them still be a proper Python
string
type while not losing any information on what the actual filename is.We also want to preserve this information, so flow.record should deal with surrogate sequences in string fields. It currently converts Python strings to utf-8 encoded byte strings and v.v. when creating the binary record format (or other output formats), and it will throw a
UnicodeDecodeError
when encountering such strings or byte strings.The text was updated successfully, but these errors were encountered: