At the bottom of every read from memory is the conversion from bytes to string with .decode('ascii'). This fails extremely loudly when there's a byte >0x7f. Strings like this can occur naturally in e.g. ELF files found on Android systems. To reproduce, just copy libc.so.6 and replace the libc.so.6 soname text with libc\xffso.6 or whatever.
Two possible solutions:
return bytes instead of string
replace s.decode('ascii') with ''.join(chr(c) for c in s)
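For illustration, here is a minimal sketch of the failure and of the second option, assuming Python 3 (where iterating over bytes yields integers). The helper names are hypothetical and are not part of pyelftools; the sample bytes are the modified soname from the report.

```python
# Hypothetical soname bytes containing a non-ASCII byte, as in the report.
raw = b"libc\xffso.6"


def decode_ascii(s):
    """Strict ASCII decoding, the behaviour described in the report."""
    return s.decode('ascii')


def decode_per_byte(s):
    """Option 2: map each byte to the code point with the same value.

    Assumes Python 3, where iterating over bytes yields ints.
    """
    return ''.join(chr(c) for c in s)


# Strict ASCII decoding raises on the 0xff byte.
try:
    decode_ascii(raw)
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# The per-byte join never raises; every byte becomes one character.
per_byte = decode_per_byte(raw)

# On Python 3 this gives the same result as a Latin-1 decode.
same_as_latin1 = per_byte == raw.decode('latin-1')
```

Since the per-byte join matches `s.decode('latin-1')` on Python 3, the latter may be a simpler way to spell the second option. Under Python 2 the generator would need adjusting, because iterating over a `str` yields one-character strings rather than integers.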
Using bytes makes sense to me, but there are a couple of gotchas to consider - one is Python 2 vs. 3 compatibility (pyelftools supports both from the same codebase), another is readelf compatibility (how does readelf show these when printed out).
Here's a better, non-artificial testcase: clang will accept valid utf-8 files as input, and will accept unicode characters as part of symbols, encoding the symbol names in the elf as utf-8. Here's the source file, the compiled file, and a pyelftools script that will crash while trying to read the symbols. utf_elf.zip
readelf itself will not crash but it will be extremely unhappy about the situation. The version of it on one machine printed out <CE> (only half-correct, the full utf-8 is CE 94 iirc), another version printed �, and another seems like it printed a line feed but not a carriage return. That might be due to terminal issues, though.
Probably the best thing to do is to just utf-8 decode, since it won't break anything that wasn't already broken and there's no better standard for how to interpret a stream of bytes without an encoding...
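A rough sketch of what UTF-8 decoding could look like, again assuming Python 3. The function name and the choice of error handler are assumptions for illustration, not what pyelftools does. A strict UTF-8 decode would still raise on a byte such as 0xff, so the sketch passes an `errors` handler: `'replace'` substitutes U+FFFD, while `'surrogateescape'` keeps the original byte recoverable.

```python
def decode_name(s, errors='replace'):
    """Hypothetical helper: decode ELF string bytes as UTF-8.

    The errors handler decides what happens to bytes that are not
    valid UTF-8; the default here is an assumption, not a recommendation.
    """
    return s.decode('utf-8', errors)


# Plain ASCII names decode exactly as they did with .decode('ascii').
plain = decode_name(b"libc.so.6")

# The CE 94 sequence mentioned in the thread is valid UTF-8 for U+0394.
symbol = decode_name(b"\xce\x94")

# An invalid byte becomes U+FFFD with errors='replace'.
replaced = decode_name(b"libc\xffso.6")

# With 'surrogateescape' the invalid byte survives a round trip.
escaped = decode_name(b"libc\xffso.6", errors='surrogateescape')
round_trip = escaped.encode('utf-8', 'surrogateescape')
```

With `'replace'` the result is always printable but lossy; with `'surrogateescape'` the original bytes can be recovered, at the cost of strings that cannot be encoded with a strict encoder.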