-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimizes reconstructor, symtab, and brancher performance #855
Conversation
This PR rectifies reconstructor, symtab and brancher with a respect to performance, without adding new behaviour or breaking of existed one. Rewrote it and reduced a number of iterations over nodes/edges of cfg There was a bit inefficient implementation of `add_symbol` function, so every addition of a symbol led to a filter of the whole table, although there are enough info to reduce such calls. Just a small fix that check if an instruction has jumps at all before subsequent call of `fold_consts` that could be heavy for some of instructions
else init | ||
|
||
let find_calls cfg roots = | ||
let roots = List.fold ~init:Addr.Set.empty ~f:Set.add roots in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is Addr.Set.of_list
for this.
let addr = Block.addr blk in | ||
let term = Block.terminator blk in | ||
let init = | ||
if Set.mem roots addr || Seq.is_empty (Cfg.Node.inputs blk cfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a block address is already in the set of roots then there is no need to add to the set of roots.
also, to count the indegree of a block use the specialized function Cfg.Node.degree ~dir:In
, thus this function
could be defined as simple as
let call_destinations cfg blk =
if not Insn.(is call) (Block.terminator blk) then []
else Seq.to_list_rev @@ Seq.filter_map (Cfg.Node.outputs blk cfg) ~f:(fun e ->
Option.some_if (Cfg.Edge.label e <> `Fall) (Cfg.Edge.dst e))
let roots_of_blk cfg blk = Addr.Set.of_list @@ List.concat [
if Cfg.Node.degree ~dir:`In = 0 then Block.addr blk else [];
call_destinations cfg blk
]
|
||
let find_calls cfg roots = | ||
let roots = List.fold ~init:Addr.Set.empty ~f:Set.add roots in | ||
Graphlib.depth_first_search (module Cfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need DFS for that as you do not need any specific ordering. Just iterate over all nodes. Also, you don't need roots, to check that root is root and add it to roots. This function is simply:
let find_calls cfg roots = Cfg.nodes cfg |> Seq.fold ~init:roots ~f:(fun roots n ->
Set.union roots (roots_of_node n))
~start_tree:(fun n t -> | ||
if Block.equal n entry | ||
then Cfg.Node.insert n t | ||
else return t)) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the main difference here is that instead of checking each destination for is it a call or not, we just iterate over all calls and ask for incoming edges of a call. That makes sense, and this is indeed faster, provided that majority of calls are reachable from the whole program CFG.
lib/bap_disasm/bap_disasm_symtab.ml
Outdated
let add_symbol t (name,entry,cfg) : t = | ||
let data = name,entry,cfg in | ||
let t = remove t data in | ||
let t = filter t data in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please justify why we can use filter
instead of remove
here? It's not quite obvious to me.
lib/bap_disasm/bap_disasm_symtab.ml
Outdated
} | ||
|
||
let filter t ((name,entry,_ ) as fn) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so from the IRL discussion, the remove function is inefficient since it filters the memory mapping even if the symbol is not in the symtab, the fact that could be checked very fast with the Set membership operation.
However, this is the problem of the remove
function, that should be fixed on its side. So, please remove the filter function and add the corresponding check to the remove
function (in fact a single test for inclusion in addrs
would be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple of bad naming choices, but otherwise good to go.
also, you might consider using a hash table instead of set for is_call
if it gives a significant boost.
let w = Block.addr (Cfg.Edge.dst e) in | ||
Hashtbl.set starts ~key:w ~data:(name w))); | ||
starts | ||
let callees_of_block cfg roots blk = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bad choice of name, since you're adding the block itself to the resulting set it is no longer a callees_of_block. Suggestion, use entries_of_block
since you're effectively collecting the entry nodes.
Also, instead of doing lots of set unions you can pass the set along and use Set.add
. It is more efficient.
let update_callees cfg roots callees blk = | ||
Set.union callees (callees_of_block cfg roots blk) | ||
|
||
let find_callees cfg roots = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to collect_entries
let callees = find_callees cfg roots in | ||
let is_call e = Set.mem callees (Cfg.Edge.dst e) in | ||
let rec traverse fng node = | ||
let fng = Cfg.Node.insert node fng in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, don't use abbreviations that are not common, suggestions,
graph
or g
or prog
for the whole program graph
cfg
for the CFG.
Also, not edg
, either edge
or e
.
Symtab.add_symbol syms (name,entry,cfg)) | ||
let callees = find_callees cfg roots in | ||
let is_call e = Set.mem callees (Cfg.Edge.dst e) in | ||
let rec traverse fng node = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not traverse
as this function builds a new graph. The better name reachable
or add_reachable
or just add
.
if is_call edg then fng | ||
else | ||
let dst = Cfg.Edge.dst edg in | ||
let visited = Cfg.Node.mem dst fng in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the following is easier to read
if is_call edge then cfg
else
let cfg' = Cfg.Edge.insert edge cfg in
if Cfg.Node.mem (Cfg.Edge.dst edge) cfg then cfg'
else add cfg' node
so, what about time - the new reconstructor on my machine with our favorite |
relative timing would be much more interesting. 25 seconds out of 3 hours is much different from 25 seconds out of 30 seconds. |
This PR rectifies reconstructor, symtab and brancher with a respect to performance, without addition of a new behavior or breaking of an existing one.
Changes:
reconstructor
Rewrote it and reduced the number of iterations over nodes/edges of cfg.
The main point is how we build the
filtered
cfg. Previously, we did like the following: take all nodes, add them ony-by-one to an empty cfg. So it was an iteration over all nodes of cfg. Then iterate over all edges, and add each of them if their's destination block is not root.Also, we did one more iteration over all nodes in
find_calls
function.So what I'm proposing here is just to iterate once over the whole cfg to find all roots. and then build the
filtered
cfg by iterating over the roots and removing their input edges.symtab
There was a bit inefficient implementation of
add_symbol
function, so every addition of a symbol called filter on the whole table, although there were enough info to reduce such calls.Concerning details. Every time we add symbol to the symtab we also remember it's name and address of an entry block. Also we filter
memmap
from previously added entries,that have same name and entry block. But, as we remembered the address of the
entry block and the name of the symbol, we can check, if
memmap
needed to befiltered without event touching it!
bracnher
Just a small fix that checks if an instruction has jumps at all before the subsequent call of
fold_consts
that could be heavy for some of the instructions