Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[filetests]: attempt to compile run-command -> wat -> zkasm #232

Merged
merged 11 commits into from
Mar 13, 2024
26 changes: 18 additions & 8 deletions cranelift/filetests/src/test_run_zkasm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ use cranelift_reader::{parse_run_command, RunCommand, TestCommand, TestFile};
use std::borrow::Cow;

use crate::zkasm_codegen;
use crate::zkasm_runner::{self, ExecutionStatus};

struct TestRunZkasm;

Expand Down Expand Up @@ -47,6 +48,7 @@ impl SubTest for TestRunZkasm {
) -> anyhow::Result<()> {
let mut zkasm_functions: Vec<Vec<String>> = Vec::new();
let mut invocations: Vec<Vec<String>> = Vec::new();
let mut invoke_names: Vec<String> = Vec::new();
for (func, details) in &testfile.functions {
zkasm_functions.push(zkasm_codegen::compile_clif_function(func));
for comment in details.comments.iter() {
Expand All @@ -56,22 +58,30 @@ impl SubTest for TestRunZkasm {
todo!()
}
RunCommand::Run(invoke, compare, expected) => {
invoke_names.push(zkasm_codegen::invoke_name(&invoke));
invocations
.push(zkasm_codegen::compile_invocation(invoke, compare, expected));
}
}
}
}
}
let zkasm_program = zkasm_codegen::build_test_zkasm(zkasm_functions, invocations);
let main = zkasm_codegen::build_main(invoke_names);
let zkasm_program = zkasm_codegen::build_test_zkasm(zkasm_functions, invocations, main);
println!("{}", zkasm_program);
// TODO: instead of printing run program, using something like this:
// match zkasm_runner::run_zkasm(&zkasm_program) {
// // TODO: Probably here is a good place to collect info generated by assert-hostfunction
// // and somehow show it
// Ok(_) => Ok(()),
// Err(e) => Err(e),
// }
let execution_result = zkasm_runner::run_zkasm(&zkasm_program).unwrap();
match execution_result.error {
Some(err) => panic!("Zkasm runtime error: {}", err),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why panic here, when the function returns a Result<?> already?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand error field is used to store zkasm runtime error. And run_zkasm will successfully return something even if runtime error occurred during zkasm execution. Here I want to panic if this runtime error happened

None => (),
};
// TODO: Maybe it is better to create "Unknown error" as error in case when
// status is RuntimeError and current error is None, and remove this filed?
match execution_result.status {
ExecutionStatus::Success => (),
ExecutionStatus::RuntimeError => panic!("Unknown zkasm runtime error"),
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This match seems superfluous, but if you want to keep it, consider:

debug_assert!(matches!(execution_result.status, ExecutionStatus::Success));

// TODO: Probably here is a good place to collect info generated by assert-hostfunction
// and somehow show it
Ok(())
}

Expand Down
164 changes: 128 additions & 36 deletions cranelift/filetests/src/zkasm_codegen.rs
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,8 @@ fn fix_relocs(
if let FinalizedRelocTarget::ExternalName(ExternalName::User(name)) = reloc.target {
let name = &params.user_named_funcs()[name];
if name.index == 0 {
// Codegen line after migrating to new assert:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Codegen line after migrating to new assert:
// TODO(#1234): Codegen line after migrating to new assert:

Would be great to have a reference to something explaining what the plan for the new assert is and reference it here.

// b" $${assert_eq(A, B, label)}".to_vec()
b" B :ASSERT".to_vec()
} else {
format!(" zkPC + 2 => RR\n :JMP(function_{})", name.index)
Expand Down Expand Up @@ -281,68 +283,158 @@ pub fn compile_clif_function(func: &Function) -> Vec<String> {
// TODO: I believe it can be done more beautiful way
let mut funcname = func.name.to_string();
funcname.remove(0);
funcname.push(':');
let mut res = vec![funcname];
let mut res = vec![format!("{}:", funcname)];
res.append(&mut lines);
res.into_iter()
.map(|s| s.replace("label", &format!("label_{}", funcname)))
.collect()
Comment on lines +300 to +302
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to avoid same label names in different functions? (As per TODO of compile_clif_function.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, here I replace functions names generated by generate_zkasm same for all compiled run commands. Also, it helps to make name of compiled clif function same in all it's calls in run command

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to mention that in a comment. Otherwise it is hard to follow why labels are renamed (in that particular way).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me like we should be doing this within codegen outside of tests if possible. After all we'll eventually want to use the zkasm binaries externally too, not just for tests. But I think it is fine to keep this logic here for the time being.

We have a mildly relevant #157 but it would be great to track this as its own issue.

}

/// Builds main for test program
pub fn build_main(invoke_names: Vec<String>) -> Vec<String> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module zkasm_codegen is used outside of tests (e.g. benchmarking) and will likely be moved out of the filetests crate, see #237. Therefore it would be good for the name (build_main) to indicate that the generate main should be used in tests only.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can go even further and separate all function related to testing only in other file, zkasm_testgen or something like that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #245 about it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn build_main(invoke_names: Vec<String>) -> Vec<String> {
pub fn build_test_main(invoke_names: Vec<String>) -> Vec<String> {

seems like a good enough compromise for now that lets us move forward without spinning our wheels on this.

let mut res = vec![
"main:".to_string(),
" SP - 1 => SP".to_string(),
" RR :MSTORE(SP)".to_string(),
];
for name in invoke_names {
res.push(" zkPC + 2 => RR".to_string());
res.push(format!(" :JMP({})", name));
}
res.push(" $ => RR :MLOAD(SP)".to_string());
res.push(" SP + 1 => SP".to_string());
res.push(" :JMP(RR)".to_string());
res
}

// TODO: this function should be much rewrited,
// now it works for one very basic case:
// Simple progam which don't contain globals or some other speciefic preamble\postamble
// Program don't need helper functions (for example 2-exp.zkasm)
// How to fix it? Use generate_preamble and provide correct inputs for it.
/// Builds zkASM used in filetests.
pub fn build_test_zkasm(functions: Vec<Vec<String>>, invocations: Vec<Vec<String>>) -> String {
/// Generate invoke name in format <function_name>_<arg>_..._<arg>
pub fn invoke_name(invoke: &Invocation) -> String {
let mut res = invoke.func.clone();
for arg in &invoke.args {
res.push_str(&format!("_{}", arg));
}
res
}

/// Assembles all parts of zkasm test program together
pub fn build_test_zkasm(
functions: Vec<Vec<String>>,
invocations: Vec<Vec<String>>,
main: Vec<String>,
) -> String {
// TODO: use generate_preamble to get preamble
let preamble = "\
start:
0xffff => SP
zkPC + 2 => RR
:JMP(main)
:JMP(finalizeExecution)";
let mut main = vec![
"main:".to_string(),
" SP - 1 => SP".to_string(),
" RR :MSTORE(SP)".to_string(),
];
for invocation in invocations {
main.extend(invocation);
}
main.push(" SP - 1 => SP".to_string());
main.push(" :JMP(RR)".to_string());
let mut postamble = generate_postamble();
let mut program = vec![preamble.to_string()];
program.append(&mut main);
program.extend(main);
for inv in invocations {
program.extend(inv);
}
for foo in functions {
program.extend(foo);
}
program.append(&mut postamble);
program.join("\n")
}

fn runcommand_to_wasm(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: runcommand_to_wat would be more appropriate.

invoke: Invocation,
_compare: Comparison,
expected: Vec<DataValue>,
) -> String {
// TODO: support different amounts of outputs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW we don't need to do this for near use-cases, multi-value is an extension over the wasm-core-1 spec and nearcore runtimes don't enable it. I guess an exception to this might be if 0 outputs are expected...

let res_bitness = match expected[0] {
DataValue::I32(_) => "i32",
DataValue::I64(_) => "i64",
_ => unimplemented!(),
};
let func_name = invoke.func;
let expected_result = expected[0].clone();
let mut arg_types = String::new();
let mut args_pushing = String::new();
for arg in &invoke.args {
let arg_type = match arg {
DataValue::I32(_) => "i32",
DataValue::I64(_) => "i64",
_ => unimplemented!(),
};
arg_types.push_str(arg_type);
arg_types.push_str(" ");

args_pushing.push_str(&format!("{arg_type}.const {arg}\n "));
}
if arg_types.len() > 0 {
arg_types.pop();
}
// TODO: remove line with 8 whitespaces in the end of args_pushing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would approach this in a way that makes this code read nicely and if we want to see nicely formatted wat output there're tools (wabt, wasm-tools) that will parse a messy one and output a nicely formatted version for you.

let wat_code = format!(
r#"(module
(import "env" "assert_eq" (func $assert_eq (param {res_bitness} {res_bitness})))
(import "env" "{func_name}" (func ${func_name} (param {arg_types}) (result {res_bitness})))
(func $main
{args_pushing}
call ${func_name}
{res_bitness}.const {expected_result}
call $assert_eq
)
(start $main)
)"#,
args_pushing = args_pushing,
arg_types = arg_types,
res_bitness = res_bitness,
func_name = func_name,
expected_result = expected_result,
);
wat_code.to_string()
}

/// Compiles a invocation.
pub fn compile_invocation(
invoke: Invocation,
compare: Comparison,
expected: Vec<DataValue>,
) -> Vec<String> {
// Here I assume that each "function" in zkasm gets it's arguments from first N registers
// and put result in A.
// TODO: should be more robust way to do it, we need somehow define inputs and outputs
let mut res: Vec<String> = Default::default();
let registers = vec!["A", "B", "C", "D", "E"];
// TODO: don't do this clones
let cmp = if compare == Comparison::Equals {
Comparison::Equals
} else {
Comparison::NotEquals
};
let inv = Invocation {
func: invoke.func.clone(),
args: invoke.args.clone(),
};
let wat = runcommand_to_wasm(inv, cmp, expected.clone());
let wasm_module = wat::parse_str(wat).unwrap();

let settings = ZkasmSettings::default();

let args = invoke.args;
// TODO: we should not use generate_zkasm itself, but a bit changed version?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding filetest specific fields to ZkasmSettings might help to achieve the desired behavior.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope so, but think better do it in separate PR

let generated: Vec<String> = generate_zkasm(&settings, &wasm_module)
.split("\n")
.map(|s| s.to_string())
.collect();
let new_label = invoke_name(&invoke);
let funcname = invoke.func;

// TODO: here we should pay attention to type of DataValue (I64 or I32)
for (idx, arg) in args.iter().enumerate() {
res.push(format!(" {} => {}", arg, registers[idx]))
}
res.push(format!(" :JMP({})", funcname));
// TODO: handle functions with multiple outputs
res.push(format!(" {} => B", expected[0]));
// TODO: replace with call to host function
res.push(format!(" CALL AWESOME ASSERT ({})", compare));
res
// TODO: will this always function_2?
let start_index = generated.iter().position(|r| r == "function_2:").unwrap();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generated_function is the function that should be invoked? Relying on the generated name of that function seems brittle.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also think it shouldn't work by this name. However, I think in next PR with switching to generate_preamble with proper arguments, there will be some more reliable way to do it

let end_index = generated.iter().rposition(|r| r == " :JMP(RR)").unwrap();
let mut generated_function = generated[start_index..=end_index].to_vec();

generated_function[0] = format!("{}:", new_label);
// TODO: will this always function_1?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is function_1 expected to be?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When compiling such .wat snippets, result was look like (at least for programs I was testing):

;; preamble
function_2:
  ;; put values to the registers
  ;; JMP(function_1)
  ;; assert_helper
  ;; JMP(RR)

From one side, while .wat code have only have 3 functions, and all of them always appear in same order, it is quite natural that the functions always named same way. From the other side, I hope to find more reliable way to do it once finishing preamble generation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinks it's quite hard to follow what happens in compile_invocation, even more since it relies on implementation details of codegen. The explanations from your comments would be suited for comments in the code. Ideally the comments answer these questions:

  • What is the invocation compiled to? To me it's not clear from context or return type.
  • How is the generated zkASM expected to look like?
  • What are function_1 and function_2 expected to be?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function generally seems like its trying to tip toe around our own code, which is a clear indication of an abstraction that's working against us. For instance, we appear to be parsing the output in an attempt to determine function boundaries, but nothing stops e.g. JMP(RR) to be used for something else than returning to a caller... LLVM for example especially likes a pattern similar to

    push rr
    rr <= addrof(done)
    call some_scope
some_scope:
    do_whatever_with_non-trivial control flow...
    jmp(rr)...

done:
    pop(rr)

within a single function in certain scenarios (e.g. to implement jumptables) and there is no reason why clift couldn't learn these tricks either.

IIRC cranelift already works on a per-function basis anyway so it might very well be worthwhile to revisit whether generate_zkasm makes sense if it interferes with use-cases as straightforward as this one. In fact from what I can see working around it is more code than replicating what generate_zkasm does in the first place... So perhaps splitting it up slightly would be a simpler approach?

let generated_replaced: Vec<String> = generated_function
.iter()
.map(|s| {
s.replace("label", &new_label)
.replace("function_1", &funcname)
})
.collect();
generated_replaced
}
6 changes: 3 additions & 3 deletions cranelift/filetests/src/zkasm_runner.rs
Original file line number Diff line number Diff line change
Expand Up @@ -28,9 +28,9 @@ pub struct ExecutionResult {
/// Path to the main zkAsm file that was executed.
path: String,
/// Status of the execution.
status: ExecutionStatus,
pub status: ExecutionStatus,
/// Error message in case the execution failed.
error: Option<String>,
pub error: Option<String>,
/// Profiling information about this execution.
/// Only populated for the successful executions.
counters: Option<Counters>,
Expand Down Expand Up @@ -76,7 +76,7 @@ pub fn run_zkasm_path(input_path: &Path) -> anyhow::Result<Vec<ExecutionResult>>
let mut output_file = NamedTempFile::new()?;
let common_args = [
"--prefix",
"../../tests/zkasm",
"../tests/zkasm",
"test",
input_path.to_str().unwrap(),
output_file.path().to_str().unwrap(),
Expand Down
Loading