Add design proposal for ProgramInstruction procedural macro #10763

CriesofCarrots · 2020-06-24T01:50:39Z

Problem

It's difficult to parse on-chain instructions and get useful information about the expected accounts, as that information is buried in the Instruction enum. We could make it easier by using a procedural macro to generate a second enum that exposes account information. This would also have the benefit of enforcing consistency across program Instruction enum docs.

Summary of Changes

Add design proposal for ProgramInstruction procedural macro

garious · 2020-06-24T02:01:21Z

The new syntax looks much nicer, thanks. Seems like there should be an option to generate all the instruction functions. Looks like a reasonable alg would be to snake case the instruction name, then one parameter per account, followed by one per instruction parameter. Coincidentally, those account names will allow us to generate readable parameter names.
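A rough sketch of the naming algorithm suggested above: snake-case the variant name, then emit one parameter per account followed by one per instruction field. `InstructionSpec` and `constructor_signature` are hypothetical names for illustration only; a real procedural macro would work on parsed tokens rather than strings.

```rust
/// Convert a CamelCase instruction name to snake_case.
fn to_snake_case(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_uppercase() {
            // Separate words with an underscore, except at the start.
            if i > 0 {
                out.push('_');
            }
            out.extend(ch.to_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Hypothetical description of one instruction variant, as a macro might parse it.
struct InstructionSpec<'a> {
    name: &'a str,
    /// Account names taken from the annotations.
    accounts: &'a [&'a str],
    /// (name, type) of each instruction field.
    params: &'a [(&'a str, &'a str)],
}

/// Render the constructor signature: one `&Pubkey` per account,
/// then one argument per instruction parameter.
fn constructor_signature(spec: &InstructionSpec) -> String {
    let mut args: Vec<String> = spec
        .accounts
        .iter()
        .map(|a| format!("{}: &Pubkey", a))
        .collect();
    args.extend(spec.params.iter().map(|(n, t)| format!("{}: {}", n, t)));
    format!(
        "fn {}({}) -> Instruction",
        to_snake_case(spec.name),
        args.join(", ")
    )
}
```

Because the account names come straight from the annotations, the generated parameter names stay readable without any extra input from the program author.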

garious · 2020-06-24T02:12:04Z

Also, consider generating the enum from an annotated trait definition instead of generating functions from an annotated enum definition. Starting with a trait would better reflect that each instruction effectively defines a function on some set of accounts. Also, individual parameter docs would map cleanly to those account descriptions.

CriesofCarrots · 2020-06-24T02:48:35Z

Also, consider generating the enum from an annotated trait definition

@garious Do you mind sharing some stub code of what you're thinking here?

garious · 2020-06-24T13:19:20Z

Ideally, something like this:

#[program]
pub trait Test {
    /// Consumes a stored nonce, replacing it with a successor
    fn advance_nonce_account(
        /// Nonce account
        #[signer] #[writable]
        nonce_account: &KeyedAccount,

        /// RecentBlockhashes sysvar
        recent_blockhashes_sysvar: &KeyedAccount,
    );
}

This would generate enum TestInstruction, the instruction::advance_nonce_account(&Pubkey, &Pubkey) -> Instruction constructor, a trait Test with Test::advance_nonce_account, and an implementation of process_instruction that deserializes Instruction::data and calls one of those methods. Once the user implements the trait, they're all done - all instruction construction and deserialization is automated.

Anyway, that's probably a lot more than you signed up for, so please just consider that to be the long-term goal, not the first step. If you only add the enum attribute at this time, imagine how that code would be generated from an annotated trait definition. Feels like the annotated enum might be a reasonable target for the annotated trait, such that the annotated enum is responsible for generating the instruction constructors. If so, it'd make sense to move forward with the annotated enum.

CriesofCarrots · 2020-06-24T15:06:13Z

Thanks, @garious! One question: what would the user implement trait Test on?

To me, it makes sense to switch to whatever input type we want to land on now, and add output types incrementally.

That is to say, if we prefer the trait attribute long-term, I think it would work to implement that attribute now to output the regular and verbose enums. Instruction constructors could happen now as well, or as a second step. Automatically generating serializers and/or process_instruction is a bit thornier, since we aren't using the same serialization between native programs and SPL. So I imagine that might be a longer-term goal.

garious · 2020-06-24T16:52:59Z

The trait would be for an empty struct.
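As a minimal sketch of what that could look like from the user's side: `KeyedAccount` here is a stand-in with a single field, `TestProgram` is a hypothetical name, and the `Result<(), String>` return type is an assumption added so the example has something to check. None of it is generated code.

```rust
/// Stand-in for the SDK's account wrapper; only the field this sketch needs.
pub struct KeyedAccount {
    pub is_signer: bool,
}

/// The trait as the `#[program]` attribute might leave it after expansion
/// (the return type is an assumption for this sketch).
pub trait Test {
    fn advance_nonce_account(
        nonce_account: &KeyedAccount,
        recent_blockhashes_sysvar: &KeyedAccount,
    ) -> Result<(), String>;
}

/// The empty struct exists only to carry the implementation.
pub struct TestProgram;

impl Test for TestProgram {
    fn advance_nonce_account(
        nonce_account: &KeyedAccount,
        _recent_blockhashes_sysvar: &KeyedAccount,
    ) -> Result<(), String> {
        // Mirrors the #[signer] annotation on the first account.
        if !nonce_account.is_signer {
            return Err("nonce account must sign".to_string());
        }
        Ok(())
    }
}
```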

I don't think we need to take on the trait approach just yet. Adding attributes to the enum would be an immediate win, because then we could generate most of the functions in today's "instruction" modules. I think I'd prioritize that deduplication before docs generation. Sounds like docs generation is then the next step after that. A trait to generate the enum is probably a ways down the road.

garious · 2020-06-24T17:05:21Z

By the way, instead of a trait, adding attributes to a module of functions might be more natural. Then there's no trait to implement. Parity's ink! is a neat example of that.

CriesofCarrots · 2020-06-24T17:43:59Z

By the way, instead of a trait, adding attributes to a module of functions might be more natural. Then there's no trait to implement. Parity's ink! is a neat example of that.

Oh, nice. That seems much cleaner to me than having to implement the trait and declare a bunch of empty methods for no reason.

It seems to me that it would save a bit of work to settle on the input format now. That way we'll only need to write one parser, instead of writing a parser for the tagged enum now and another parser down the road. And it's not any more difficult to parse, say, a module of functions than an enum.
Various deduplication can still happen incrementally by using the parsed data to generate new output and adding it to the stream.

But if you feel strongly about pushing that off, I'll concede in the interest of getting partner work unblocked.

Do you have any thoughts about the considerations in my proposal?

jackcmay · 2020-06-24T18:37:21Z

+1 on consideration #1 depending on what that means for backward compatibility

For consideration #2 how are multi-sig like programs handled? A new instruction for each value of M?

CriesofCarrots · 2020-06-24T19:22:24Z

+1 on consideration #1 depending on what that means for backward compatibility

I believe this change should be backward compatible, since SystemInstruction::WithdrawNonceAccount { lamports: u64 } serializes the same as SystemInstruction::WithdrawNonceAccount(u64), etc.
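To illustrate why the two forms would share a wire format, here is a hand-rolled encoder that mimics bincode's default layout as commonly understood (a little-endian u32 variant index followed by the fields in declaration order, with field names never written). It does not call bincode itself, and the two enums are hypothetical one-variant stand-ins rather than the real SystemInstruction.

```rust
/// Tuple-style variant, as instructions are declared today.
pub enum TupleStyle {
    Withdraw(u64),
}

/// Named-field variant, as the proposal would declare it.
pub enum NamedStyle {
    Withdraw { lamports: u64 },
}

/// Assumed layout: u32 little-endian variant index, then the u64 payload.
fn encode(variant_index: u32, lamports: u64) -> Vec<u8> {
    let mut bytes = variant_index.to_le_bytes().to_vec();
    bytes.extend_from_slice(&lamports.to_le_bytes());
    bytes
}

pub fn encode_tuple(ix: &TupleStyle) -> Vec<u8> {
    match ix {
        TupleStyle::Withdraw(lamports) => encode(0, *lamports),
    }
}

pub fn encode_named(ix: &NamedStyle) -> Vec<u8> {
    match ix {
        // The field name only exists in source; nothing about it is encoded.
        NamedStyle::Withdraw { lamports } => encode(0, *lamports),
    }
}
```

Under that layout assumption, only the variant's position and the field order matter, so renaming or naming a field leaves the bytes unchanged while reordering variants would not.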

For consideration #2 how are multi-sig like programs handled? A new instruction for each value of M?

Great question. I sure would love some brilliant suggestions in this area.
If we went the mod attribute route, lists of accounts could perhaps be supported with an ident: Vec<&KeyedAccount> parameter.
For the enum attribute route, maybe something like:

#[accounts(
    signers(is_signer = true, desc = "Funding account", allow_multiple = true),
)]

In either case, mixing a multiple account list with other account declarations would get complicated (around indexing for docs and rpc decoding).

jackcmay · 2020-06-24T21:49:51Z

@CriesofCarrots Referring back to Greg's comments here: #10783

I agree with Greg that tying the comments, the enum, and the instruction constructor together via a generator ensures that they are in sync and stay that way and we should probably approach this effort via that lens.

For the most part, optional accounts should only be used for simple situations and more complex optional account logic should be broken into multiple instructions.

A couple of cases:

* Optional additional accounts can be marked as optional in the docs and the constructor can take an Option, where if Some push an AccountMeta for it
* Be nice for multiple optional accounts that fall under the same category (multiple signers for example) to be handled as a list and documented together. Doing so would avoid ugly docs and function signatures that have explicit entries for each M.
* More complex instructions that require logic to figure out order/representation could either result in the use of Option and leave the logic to the implementation of the instruction, or should probably be made into a separate instruction. Token's NewToken is a good example of that, it's a single instruction now but could easily be made into two different instructions, and doing so might actually bring more clarity.
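The first two cases could be sketched roughly as follows. `Pubkey` and `AccountMeta` are simplified stand-ins for the SDK types, and `with_optional` and `with_signers` are hypothetical constructor names; the signer and writable flags are chosen only for illustration.

```rust
/// Simplified stand-in for the SDK's Pubkey.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pubkey(pub [u8; 32]);

/// Simplified stand-in for the SDK's AccountMeta.
#[derive(Debug, PartialEq)]
pub struct AccountMeta {
    pub pubkey: Pubkey,
    pub is_signer: bool,
    pub is_writable: bool,
}

/// Case 1: one optional trailing account; push an AccountMeta only if `Some`.
pub fn with_optional(owner: &Pubkey, delegate: Option<&Pubkey>) -> Vec<AccountMeta> {
    let mut accounts = vec![AccountMeta {
        pubkey: *owner,
        is_signer: true,
        is_writable: true,
    }];
    if let Some(delegate) = delegate {
        accounts.push(AccountMeta {
            pubkey: *delegate,
            is_signer: false,
            is_writable: false,
        });
    }
    accounts
}

/// Case 2: a list of same-category accounts (here, M signers) passed as a slice
/// instead of one explicit parameter per signer.
pub fn with_signers(account: &Pubkey, signers: &[&Pubkey]) -> Vec<AccountMeta> {
    let mut accounts = vec![AccountMeta {
        pubkey: *account,
        is_signer: false,
        is_writable: true,
    }];
    accounts.extend(signers.iter().map(|s| AccountMeta {
        pubkey: **s,
        is_signer: true,
        is_writable: false,
    }));
    accounts
}
```

Keeping the optional or repeated accounts at the end of the list means the fixed accounts keep stable indexes, which matters for docs and RPC decoding.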

CriesofCarrots · 2020-06-24T23:20:14Z

* Optional additional accounts can be marked as optional in the docs and the constructor can take an `Option`, where if `Some` push an `AccountMeta` for it

Agreed. However, I think we can only support one named optional account per instruction, unless we include some extra data for process_instruction to use to determine which optional account is present, when something between all and none are provided.
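A small sketch of the ambiguity: with two optional accounts, supplying only the first or only the second produces the same account list, so the program cannot tell them apart by position. The presence bitmask shown is one assumed form the "extra data" could take, not something the proposal specifies; `Pubkey` is a stand-in type.

```rust
/// Stand-in for the SDK's Pubkey.
pub type Pubkey = [u8; 32];

/// Build the account list for an instruction with two optional accounts.
pub fn accounts(base: Pubkey, opt_a: Option<Pubkey>, opt_b: Option<Pubkey>) -> Vec<Pubkey> {
    let mut list = vec![base];
    // Extending with an Option appends zero or one element.
    list.extend(opt_a);
    list.extend(opt_b);
    list
}

/// Hypothetical extra byte for the instruction data:
/// bit 0 set if `opt_a` is present, bit 1 set if `opt_b` is present.
pub fn presence_flags(opt_a: &Option<Pubkey>, opt_b: &Option<Pubkey>) -> u8 {
    (opt_a.is_some() as u8) | ((opt_b.is_some() as u8) << 1)
}
```

Calling `accounts(base, Some(x), None)` and `accounts(base, None, Some(x))` yields identical vectors, while `presence_flags` returns different bytes for the two cases, which is the kind of information process_instruction would need.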

* Be nice for multiple optional accounts that fall under the same category (multiple signers for example) to be handled as a list and documented together.  Doing so would avoid ugly docs and function signatures that have explicit entries for each `M`.

Yep, this will also work, but similar to the optional account, I think we can only support one multiple-account collection and no named optional accounts, unless we include extra data for process_instruction as described above.

* More complex instructions that require logic to figure out order/representation could either result in the use of `Option` and leave the logic to the implementation of the instruction, or should probably be made into a separate instruction.  Token's `NewToken` is a good example of that, it's a single instruction now but could easily be made into two different instructions, and doing so might actually bring more clarity.

+1, I do think different instructions would add clarity in the NewToken case.

I've updated the doc with a possible example of optional/multiple handling. How is this looking?

CriesofCarrots · 2020-06-24T23:21:24Z