-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve shellbags plugin #470
Improve shellbags plugin #470
Conversation
A target system can contain non utf-8 characters in the shellbag volume_name. This would break the shellbags plugin and is now fixed.
@@ -617,7 +617,7 @@ def __init__(self, buf): | |||
if self.type == 0x2E: | |||
self.identifier = uuid.UUID(bytes_le=buf[4:20].tobytes()) | |||
else: | |||
self.volume_name = self.fh.read(20).rstrip(b"\x00").decode() | |||
self.volume_name = self.fh.read(20).rstrip(b"\x00").decode(errors="backslashreplace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a surrogateescape
not be preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly surrogateescape
does not work with ASCII incompatible encodings. Why would you prefer surrogates in this instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use surrogateescape
everywhere else and it's also the default error mode for filesystems. I'm not aware of any limitations with surrogateescape
, do you have an example of that?
I'm definitely not an expert on decoding error modes, so I'm mostly going by with what I know works and what we use elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using surrogateescape
here would not repair the shellbags plugin. You would still get the following error UnicodeDecodeError: 'utf-8' codec can't encode characters in position x-y: surrogates not allowed
when encoding the record for the record stream. (https://stackoverflow.com/a/31898719)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL, thanks for the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you compare the two in the dissect.target
codebase I think it would be incorrect to state that surrogateescape
is the de facto 'default'.
https://github.com/search?q=repo%3Afox-it%2Fdissect.target+surrogateescape&type=code
https://github.com/search?q=repo%3Afox-it%2Fdissect.target%20%22backslashreplace%22&type=code
I agree it would be nice if this gets a proper fix in flow.record
eventually. Our current situation is that the shellbags
plugin is broken as no encoding error handler is used when dealing with foreign volume names.
Using surrogateescape
would still break the shellbags
plugin as it is currently incompatible with flow.record
. Not using any error handling keeps this plugin broken for us.
In my opinion using errors=backslashreplace
is better than using no error handler at all. Please consider merging this as a temporary solution while we wait for a better fix in flow.record
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get an update on this, considering it's been 7 months.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a PR in flow.record (fox-it/flow.record#144) to properly deal with surrogateescapes for all/most output formats, as this is information we really do want to preserve. A version of flow record with this functionality should land within some reasonable time in dissect.target.
We can merge this PR beforehand, but only with errors="surrogateescape"
. Already flow.record can deal with these when using a json based output, so those can be used in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not had the opportunity yet to test fox-it/flow.record#144 in combination with errors="surrogateescape"
or errors="surrogatepass"
but that seems like a reasonable fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented in f073796. Also added more tests for this issue and added typing to the plugin.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #470 +/- ##
==========================================
+ Coverage 76.50% 76.66% +0.15%
==========================================
Files 314 314
Lines 26945 26948 +3
==========================================
+ Hits 20615 20659 +44
+ Misses 6330 6289 -41
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
A target system can contain non-utf-8 characters in a shellbag volume
volume_name
. This would break theshellbags
plugin and is now fixed by escaping usingbackslashreplace
.