Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(snapshot): code split #198

Merged
merged 24 commits into from
Dec 11, 2023
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
8c18e89
refactor: extract logger functions
iankressin Nov 23, 2023
b49da54
refactor: extract shortned address to function
iankressin Nov 23, 2023
8ca1b0a
chore: merge with origin branch
iankressin Nov 25, 2023
46a04bd
chore: remove duplicated function declaration
iankressin Nov 25, 2023
63ad6e6
chore: move get_shortnet_target and set_logger_env to common
iankressin Nov 28, 2023
40be8fd
refactor: extract function selector resolvers to resolve.rs
iankressin Dec 1, 2023
0dc9274
refactor: move get_contract_bytecode to a new bytecode module
iankressin Dec 1, 2023
f9b9cfa
refactor: move get_selectors to selectors module
iankressin Dec 1, 2023
07a985d
refactor: make bytecode disambled external from get_resolved_selectors
iankressin Dec 1, 2023
95b4ef2
refactor: remove logger from params of resolve_signature
iankressin Dec 1, 2023
273086d
refactor: rework snapshot mod variable and args declaration order
iankressin Dec 4, 2023
539f362
merge: origin main
iankressin Dec 4, 2023
f023cd6
refactor: remove logger from params list and initialize internally
iankressin Dec 5, 2023
2739e82
fix: remove extrar logger param from get_contract_bytecode call
iankressin Dec 5, 2023
9e810be
chore: change rpc from llama to ankr
iankressin Dec 8, 2023
4f731f3
fix(tests): remove unnecessary `0x` prefix constraint
Jon-Becker Dec 9, 2023
f336a3a
refactor: rename resolve_custom_event_signatures to resolve_event_sig…
iankressin Dec 9, 2023
60857e3
refactor: rename get_contract_bytecode to get_bytecode_from_target
iankressin Dec 9, 2023
c5c6eff
fix: update comments from get_bytecode_from_target to be module agnostic
iankressin Dec 9, 2023
ea577df
refactor: switch from logger::debug_max to debug_max
iankressin Dec 9, 2023
e756416
fix: remove out of context log
iankressin Dec 9, 2023
d4b4812
refactor: remove get_logger_and_trace function
iankressin Dec 9, 2023
0633c2e
style: code format
iankressin Dec 9, 2023
ecb59fe
refactor: change target param from string reference to string slice
iankressin Dec 9, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
84 changes: 84 additions & 0 deletions common/src/ether/bytecode.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
use super::rpc::get_code;
use crate::{
constants::{ADDRESS_REGEX, BYTECODE_REGEX},
utils::io::logging::Logger,
};
use std::fs;

pub async fn get_contract_bytecode(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use this function in other modules as well, such as decompile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do this in this PR or in a separate one? I can def do on this one, it's just a matter of how you prefer reviewing the code.

the same question stands for get_resolved_selectors and get_shortned_target

target: &str,
rpc_url: &str,
) -> Result<String, Box<dyn std::error::Error>> {
let (logger, _) = Logger::new("");

if ADDRESS_REGEX.is_match(target)? {
// We are snapshotting a contract address, so we need to fetch the bytecode from the RPC
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update comments to be module agnostic

// provider.
get_code(target, rpc_url).await
} else if BYTECODE_REGEX.is_match(target)? {
logger.debug_max("using provided bytecode for snapshotting.");
Ok(target.replacen("0x", "", 1))
} else {
logger.debug_max("using provided file for snapshotting.");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure if debug_max! macro exists in this PR context, but worth noting that i should update these calls later.

see #215 or #216 for details

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's available. I had to merge main in this PR due to some conflicts and already updated some places to use the new macro, but ended up missing this one


// We are snapshotting a file, so we need to read the bytecode from the file.
match fs::read_to_string(target) {
Ok(contents) => {
let _contents = contents.replace('\n', "");
if BYTECODE_REGEX.is_match(&_contents)? && _contents.len() % 2 == 0 {
Ok(_contents.replacen("0x", "", 1))
} else {
logger.error(&format!("file '{}' doesn't contain valid bytecode.", &target));
std::process::exit(1)
}
}
Err(_) => {
logger.error(&format!("failed to open file '{}' .", &target));
std::process::exit(1)
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::fs;

#[tokio::test]
async fn test_get_bytecode_when_target_is_address() {
let bytecode = get_contract_bytecode(
"0x9f00c43700bc0000Ff91bE00841F8e04c0495000",
"https://rpc.ankr.com/eth",
)
.await
.unwrap();

assert!(BYTECODE_REGEX.is_match(&bytecode).unwrap());
}

#[tokio::test]
async fn test_get_bytecode_when_target_is_bytecode() {
let bytecode = get_contract_bytecode(
"0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001",
"https://rpc.ankr.com/eth",
)
.await
.unwrap();

assert!(BYTECODE_REGEX.is_match(&bytecode).unwrap());
}

#[tokio::test]
async fn test_get_bytecode_when_target_is_file_path() {
let file_path = "./mock-file.txt";
let mock_bytecode = "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001";

fs::write(file_path, mock_bytecode).unwrap();

let bytecode = get_contract_bytecode(file_path, "https://rpc.ankr.com/eth").await.unwrap();

assert!(BYTECODE_REGEX.is_match(&bytecode).unwrap());

fs::remove_file(file_path).unwrap();
}
}
1 change: 1 addition & 0 deletions common/src/ether/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
pub mod bytecode;
pub mod compiler;
pub mod evm;
pub mod lexers;
Expand Down
2 changes: 1 addition & 1 deletion common/src/ether/rpc.rs
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,7 @@ pub async fn get_code(
None,
);

Ok(bytecode_as_bytes.to_string())
Ok(bytecode_as_bytes.to_string().replacen("0x", "", 1))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added this, which normalizes the get_code response and never includes the 0x prefix.

previously, we stored the value in cache without the prefix, and returned it with the prefix. this just keeps everything consistent.

})
.await
.map_err(|_| Box::from("failed to fetch bytecode"))
Expand Down
44 changes: 43 additions & 1 deletion common/src/ether/selectors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,51 @@ use tokio::task;

use crate::utils::{io::logging::Logger, strings::decode_hex};

use super::{evm::core::vm::VM, signatures::ResolveSelector};
use super::{
evm::core::vm::VM,
signatures::{ResolveSelector, ResolvedFunction},
};
use crate::debug_max;

// Find all function selectors and all the data associated to this function, represented by
// [`ResolvedFunction`]
pub async fn get_resolved_selectors(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh nice! we should use this function in other modules (such as decompile)

disassembled_bytecode: &str,
skip_resolving: &bool,
evm: &VM,
shortened_target: &str,
) -> Result<
(HashMap<String, u128>, HashMap<String, Vec<ResolvedFunction>>),
Box<dyn std::error::Error>,
> {
let selectors = find_function_selectors(evm, disassembled_bytecode);

let mut resolved_selectors = HashMap::new();
if !skip_resolving {
resolved_selectors =
resolve_selectors::<ResolvedFunction>(selectors.keys().cloned().collect()).await;

// if resolved selectors are empty, we can't perform symbolic execution
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update comment, we dont do symbolic execution here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, if resolved_selectors is empty, nothing fails. heimdall can guess things!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this log can be removed, right?

if resolved_selectors.is_empty() {
debug_max!(&format!(
"failed to resolve any function selectors from '{shortened_target}' .",
));
}

debug_max!(&format!(
"resolved {} possible functions from {} detected selectors.",
resolved_selectors.len(),
selectors.len()
));
} else {
debug_max!(&format!("found {} possible function selectors.", selectors.len()));
}

debug_max!(&format!("performing symbolic execution on '{shortened_target}' ."));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line, since we aren't actually doing symbolic execution here


Ok((selectors, resolved_selectors))
}

/// find all function selectors in the given EVM assembly.
pub fn find_function_selectors(evm: &VM, assembly: &str) -> HashMap<String, u128> {
let mut function_selectors = HashMap::new();
Expand Down
48 changes: 48 additions & 0 deletions common/src/utils/io/logging.rs
Original file line number Diff line number Diff line change
Expand Up @@ -604,11 +604,48 @@ impl Logger {
}
}

/// Set `RUST_LOG` variable to env if does not exist
///
/// ```
/// use heimdall_common::utils::io::logging::set_logger_env;
///
/// let verbosity = clap_verbosity_flag::Verbosity::new(-1, 0);
/// set_logger_env(&verbosity);
/// ```
pub fn set_logger_env(verbosity: &clap_verbosity_flag::Verbosity) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D nice!

let env_not_set = std::env::var("RUST_LOG").is_err();

if env_not_set {
let log_level = match verbosity.log_level() {
Some(level) => level.as_str(),
None => "SILENT",
};

std::env::set_var("RUST_LOG", log_level);
}
}

/// Returns a new instance of `Logger` and `TraceFactory` given a log level
///
/// ```
/// use heimdall_common::utils::io::logging::get_logger_and_trace;
///
/// let verbosity = clap_verbosity_flag::Verbosity::new(-1, 0);
/// get_logger_and_trace(&verbosity);
/// ```
pub fn get_logger_and_trace(verbosity: &clap_verbosity_flag::Verbosity) -> (Logger, TraceFactory) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now you can actually just use Logger::new, it'll return (Logger, TraceFactory)

I do have #126 open which will update this functionality eventually (all logger methods being moved to macros, TraceFactory separated from Logger.

Logger::new(match verbosity.log_level() {
Some(level) => level.as_str(),
None => "SILENT",
})
}

#[cfg(test)]
mod tests {
use std::time::Instant;

use super::*;
use std::env;

#[test]
fn test_raw_trace() {
Expand Down Expand Up @@ -887,4 +924,15 @@ mod tests {
let (_logger, _) = Logger::new("MAX");
debug_max!("log");
}

#[test]
fn test_set_logger_env_default() {
env::remove_var("RUST_LOG");

let verbosity = clap_verbosity_flag::Verbosity::new(-1, 0);

set_logger_env(&verbosity);

assert_eq!(env::var("RUST_LOG").unwrap(), "SILENT");
}
}
36 changes: 36 additions & 0 deletions common/src/utils/strings.rs
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,26 @@
TokenType::Function
}

/// Returns a collapsed version of a string if this string is greater than 66 characters in length.
/// The collapsed string consists of the first 66 characters, followed by an ellipsis ("..."), and
/// then the last 16 characters of the original string. ```
/// use heimdall_common::utils::strings::get_shortned_target;
///
/// let long_target = "0".repeat(80);
/// let shortened_target = get_shortned_target(&long_target);
/// ```
pub fn get_shortned_target(target: &String) -> String {

Check warning on line 363 in common/src/utils/strings.rs

View workflow job for this annotation

GitHub Actions / clippy

[clippy] reported by reviewdog 🐶 <pre><code>warning: writing `&String` instead of `&str` involves a new object where a slice will do --> common/src/utils/strings.rs:363:36 | 363 | pub fn get_shortned_target(target: &String) -> String { | ^^^^^^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg = note: `#[warn(clippy::ptr_arg)]` on by default help: change this to | 363 ~ pub fn get_shortned_target(target: &str) -> String { 364 ~ let mut shortened_target = target.to_owned(); | </code></pre> Raw Output: common/src/utils/strings.rs:363:36:w: <pre><code>warning: writing `&String` instead of `&str` involves a new object where a slice will do --> common/src/utils/strings.rs:363:36 | 363 | pub fn get_shortned_target(target: &String) -> String { | ^^^^^^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg = note: `#[warn(clippy::ptr_arg)]` on by default help: change this to | 363 ~ pub fn get_shortned_target(target: &str) -> String { 364 ~ let mut shortened_target = target.to_owned(); | </code></pre> __END__
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! let's make sure to use this function throughout the codebase

let mut shortened_target = target.clone();

if shortened_target.len() > 66 {
shortened_target = shortened_target.chars().take(66).collect::<String>() +
"..." +
&shortened_target.chars().skip(shortened_target.len() - 16).collect::<String>();
}

shortened_target
}

#[cfg(test)]
mod tests {
use ethers::types::{I256, U256};
Expand Down Expand Up @@ -691,4 +711,20 @@
assert_eq!(classification, TokenType::Function);
}
}

#[test]
fn test_shorten_long_target() {
let long_target = "0".repeat(80);
let shortened_target = get_shortned_target(&long_target);

assert_eq!(shortened_target.len(), 85);
}

#[test]
fn test_shorten_short_target() {
let short_target = "0".repeat(66);
let shortened_target = get_shortned_target(&short_target);

assert_eq!(shortened_target.len(), 66);
}
}
Loading
Loading