Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracting language specific strings from golang #779 #792

Merged
merged 138 commits into from
Jul 11, 2023

Conversation

Arker123
Copy link
Collaborator

@Arker123 Arker123 commented Jun 9, 2023

This pull request (PR) aims to enhance the codebase by implementing a feature to extract language-specific strings from Go (Golang) binaries.

TODO:-

  • Extract strings from the .text segment.
  • Add comprehensive tests for various versions of Go.

floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/get_go_strings.py Outdated Show resolved Hide resolved
@williballenthin
Copy link
Collaborator

also, strongly recommend creating some tests as you're developing this code, to assert that the changes you're making help improve the features declared in the tests.

floss/get_go_strings.py Outdated Show resolved Hide resolved
floss/main.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
@Arker123 Arker123 changed the title Extracting language specific strings from golang Extracting language specific strings from golang #779 Jun 14, 2023
floss/language/go/extract.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
@mr-tz
Copy link
Collaborator

mr-tz commented Jun 15, 2023

Current results look promising, nice progress!

floss/language/go/extract.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
@Arker123
Copy link
Collaborator Author

Arker123 commented Jul 7, 2023

By the way, I'd love to hear your thoughts about suggestion Arker123#3 (comment). Let me know what you think.

@Arker123 Arker123 requested review from williballenthin and mr-tz July 7, 2023 19:30
and the length is reasonable; however, we don't validate the encoded string data.
"""
if psize == 32:
format = "I"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick curiosity: 'I' format works, 'L' doesn't. Why? Thanks!

Copy link
Collaborator

@williballenthin williballenthin Jul 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is for an array (not struct) the formats are declared here: https://docs.python.org/3/library/array.html

I is for 2 byte words and L is for 4 bytes (and Q for 8 bytes). Lowercase variants are for signed versus uppercase for unsigned. here we want 4 byte unsigned since it doesn't make sense for addresses/sizes to be negative (signed).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it seems like we want "L" but I used "I" which seems like a mistake. i should have tested this.

when you say it doesn't work, what are you observing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: 5260cb6 you did use the "L" format initially. However, during my testing and debugging process, I observed that the "L" format was not working as expected. While making some adjustments to the code, I surprisingly discovered that the "I" format magically worked instead. I apologize for not communicating this clearly earlier.

It's still unclear to me why the "I" format ended up working while the "L" format did not.

floss/language/go/coverage.py Outdated Show resolved Hide resolved
floss/language/go/coverage.py Outdated Show resolved Hide resolved
floss/language/go/coverage.py Outdated Show resolved Hide resolved
floss/main.py Outdated Show resolved Hide resolved
floss/main.py Outdated Show resolved Hide resolved
floss/main.py Outdated Show resolved Hide resolved
floss/render/default.py Outdated Show resolved Hide resolved
floss/language/go/extract.py Outdated Show resolved Hide resolved
@Arker123
Copy link
Collaborator Author

Now handle language specific strings separately:-

floss/results.py Outdated Show resolved Hide resolved
@mr-tz
Copy link
Collaborator

mr-tz commented Jul 10, 2023

Doesn't this also print strings twice potentially?

I'd suggest to roll back 722e82a and consider using the proposed changes in Arker123#6.

tests/test_language_extract_go.py Outdated Show resolved Hide resolved
tests/test_language_extract_go.py Outdated Show resolved Hide resolved
tests/test_language_extract_go.py Outdated Show resolved Hide resolved
tests/test_language_extract_go.py Outdated Show resolved Hide resolved
tests/test_language_extract_go.py Outdated Show resolved Hide resolved
tests/test_language_extract_go.py Outdated Show resolved Hide resolved
@Arker123 Arker123 force-pushed the go-strings branch 2 times, most recently from d2a800b to 3463353 Compare July 10, 2023 17:17
@mr-tz mr-tz merged commit 386d86f into mandiant:master Jul 11, 2023
@mr-tz
Copy link
Collaborator

mr-tz commented Jul 11, 2023

Nice work, @Arker123 and @williballenthin!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants