Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type propagation to DataVariables #4400

Open
plafosse opened this issue Jun 8, 2023 · 3 comments
Open

Type propagation to DataVariables #4400

plafosse opened this issue Jun 8, 2023 · 3 comments
Labels
Component: Core Issue needs changes to the core Core: Type Propagation Effort: High Issue should take > 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Type: Enhancement Issue is a small enhancement to existing functionality

Comments

@plafosse
Copy link
Member

plafosse commented Jun 8, 2023

Binary Ninja could do a better job inferring DataVariable types. The primary reason we don't propagate types from analysis to DataVariables is that the DataVariable could be shared amongst multiple functions. Analysis would have to take into account all function which reference the DataVariable in order to make correct assumptions about a DataVariables type. We'd essentially have to wait until all analysis has settled and then go through references to each of these as a second pass.

@plafosse plafosse added Type: Enhancement Issue is a small enhancement to existing functionality Component: Core Issue needs changes to the core Effort: High Issue should take > 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Core: Type Propagation labels Jun 8, 2023
@ccarpenter04
Copy link

The increased accuracy would easily make it worthwhile to be included as a pre-included workflow at the very least.

@0xdevalias
Copy link

0xdevalias commented Jun 17, 2024

Some recent discussion on Slack brought up how analysis.limits.minStringLength sets a minimum for string detection (which seems to default to 4), which explained why a %s in printf wasn't detected; which we can see another example of in a comment on one of the closed duplicate issues:

It seems %s is not considered as a string.
image

Originally posted by @reversing-dev in #4759 (comment)

I asked how 'smart' the current detection methods were, and whether the patterns of usage could be used to 'safely' detect smaller strings:

Eg. if a ‘known string function’ (eg printf) was accessing a really short string (like the %s) example; i’m guessing this would be a ‘hard limit’ and prevent it being detected; but it would seem that if it was a little ‘smarter’ in it’s detection and able to take into context the ‘known string function’ usage, then it could give more ‘weight’ to ‘safely’ identify that the ‘short stringlike bytes’ may actually be legitimately a string.

Which eventually led back to this issue. Just wanted to add a bit more context/keywords to make this easier to find in future.


Also crosslinking to this tangentially related issue that was linked from one of the dupes to keep the references in-tact:

@xusheng6
Copy link
Member

I believe type propagation to data variables is NOT the best solution for detecting small strings -- It only works for those strings used by printf, etc, which we know the type of the argument. For other functions we will still miss it. Also many strings are a target of a pointer data variable (commonly seen in Rust binaries), and we would not see a code xref to it at all

Here is my proposal for solving the problem -- when we do the string search, when we see a short string, we keep it and mark it as a short string, instead of discarding it. Then, later on, during analysis, when we see a code or data xref to the string, then we promote it to a true string. Otherwise it will remain to be a short string, and excluded from the stirngs view and bv.strings. I am likely going to carry this change out along with #5548, some at a certain point post stable 4.1 release

Still, the issue itself is worth doing -- we definitely want to propagate the type info into data variables, but it is just not motivating the small string issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Core Issue needs changes to the core Core: Type Propagation Effort: High Issue should take > 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Type: Enhancement Issue is a small enhancement to existing functionality
Projects
None yet
Development

No branches or pull requests

4 participants