tweaks the print plugin #991

Merged 2 commits on Sep 12, 2019

ivg (Member) commented Sep 11, 2019

The print plugin is probably the oldest plugin in BAP, and it was
really crying out for a little bit of attention and love. Not that I
was able to give it everything it wants, but here are some tweaks that
will hopefully make our lives easier (and nicer). Here is the list of
new features:

  1. -dcfg - will now print the graphs of all subroutines as subgraphs
    enclosed in partitions, as well as the interprocedural
    edges. This also lets xdot and other tools that didn't
    understand several digraphs in a row work with bap directly, e.g.,
    bap ./exe -dcfg | xdot -

  2. -dgraph is a new output format which is basically the same
    as -dcfg but without the IR terms (i.e., only basic blocks, no
    def terms). It is also partitioned and has interprocedural edges.

  3. tid numbers are properly escaped now (it looks like dot or
    xdot somehow interprets them even when they are delimited with quotes)

  4. new mechanism for filtering the output: instead of the old
    --print-symbol and --print-section (which are still supported) we
    now have a new --print-matching filter that accepts the
    <property>:<regex> format. For now, the <property> field can be
    one of section, segment, symbol, or name. Here symbol denotes
    the name of the symbol taken from the file symbol table or the
    accompanying debugging information, e.g., DWARF, not the symbol
    that we reconstructed during disassembly, while name has the
    same meaning that symbol had before, i.e., the name of a
    subroutine. The regular expression uses PCRE syntax with partial
    matches, e.g., --print-matching=section:text will match both
    .text and texting. Use \b to specify word boundaries, e.g.,
    --print-matching='symbol:\bmain\b' will print only main (and not
    __libc_start_main). Do not forget to delimit the regex with
    single quotes, to prevent your shell from ringing the bell.

  5. fixes a bug in the ADT representation of Tids.
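
The partial-match semantics described in item 4 can be illustrated with a small Python sketch. This is not the plugin's code: `parse_filter` and `matches` are hypothetical helpers, splitting at the first colon is an assumption, and Python's `re` module stands in for PCRE (the two agree on the simple patterns used here).

```python
import re

# Property names listed in item 4 of the description.
PROPERTIES = {"section", "segment", "symbol", "name"}


def parse_filter(spec):
    """Split a '<property>:<regex>' string (assumed: at the first colon)."""
    prop, sep, pattern = spec.partition(":")
    if not sep or prop not in PROPERTIES:
        raise ValueError("expected <property>:<regex>, got %r" % spec)
    return prop, re.compile(pattern)


def matches(spec, value):
    """Partial match: the regex may match anywhere inside the value."""
    _prop, regex = parse_filter(spec)
    return regex.search(value) is not None


if __name__ == "__main__":
    for spec, value in [
        ("section:text", ".text"),
        ("section:text", "texting"),
        (r"symbol:\bmain\b", "main"),
        (r"symbol:\bmain\b", "__libc_start_main"),
    ]:
        print(spec, value, matches(spec, value))
```

The last case fails to match because `_` counts as a word character, so there is no word boundary between `start_` and `main`.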

A Bug in Graphviz

A side note on graphviz dot (and xdot). Due to a 20-year-old
bug [1] (supposedly finally fixed in version 2.40), dot is often unable
to render even moderately complex graphs. The bug is triggered randomly
and depends on many variables (like fonts, subroutine names, etc). It
manifests itself as an error message (not visible when xdot is used),
after which dot (and correspondingly xdot) just hangs and stalls
forever:

Error: trouble in init_rank
	%0 1
	%0 1
	%0 1
        <lots of tid-looking identifiers>

I was using version 2.40 (via docker alpine) and was able to render
some complex graphs, but I believe I'm still hitting this issue on
some graphs even in version 2.40.
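
Since the hang gives no feedback under xdot, one possible workaround is to run `dot` directly with a time limit and capture its stderr. The sketch below is only an illustration of that idea: `render_with_timeout` is a hypothetical helper, the 30 second default is arbitrary, and it assumes a `dot` binary on the PATH that reads the graph from standard input.

```python
import subprocess


def render_with_timeout(dot_source, timeout_s=30.0, cmd=("dot", "-Tsvg")):
    """Feed DOT text to a renderer and give up after timeout_s seconds.

    Returns None if the renderer did not finish in time; otherwise
    returns the CompletedProcess, whose stdout holds the rendering and
    whose stderr holds messages such as "trouble in init_rank".
    """
    try:
        return subprocess.run(
            list(cmd),
            input=dot_source,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None
```

For example, the output of `bap ./exe -dcfg` could be saved to a file, read into a string, and passed to this function; a `None` result then signals that the hang was hit.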

[1]: ellson/MOTHBALLED-graphviz#1213

Since the release of BAP 2.0, the bap-python integration has been
broken: Tid.name now returns a parseable representation of the
knowledge object and, as a result, Python was receiving something like
`Tid(0x%deadbeef)` as its input.
ivg (Member, Author) commented Sep 12, 2019

I've noticed that BAP 2.x breaks the Python integration due to incorrect printing of tids. Since I'm here, I decided to fix it in the same PR. @gitoleg, please don't squash these two commits; use "Rebase and Merge" instead, so that we can track the changes separately after the merge (in case we want to revert one or the other).

@gitoleg gitoleg merged commit 69fde6f into BinaryAnalysisPlatform:master Sep 12, 2019
@ivg ivg deleted the tweaks-print-plugin branch June 10, 2020 12:35