Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recipe for a runtime without FRAME #144

Open
wants to merge 249 commits into
base: master
Choose a base branch
from

Conversation

JoshOrndorff
Copy link
Owner

@JoshOrndorff JoshOrndorff commented Feb 8, 2020

This PR addresses #131 by demonstrating a runtime that does not use Frame. The runtime is as minimal as reasonable and has a single boolean storage value. Transaction can either set, clear, or toggle this storage value. Transaction are not signed.

  • Runtime compiles
  • Outer nodes compile with frameless runtime
  • Writeup
  • Block production works
  • Transactions can be submitted (example of custom transaction encoding)
  • Tests to ensure state transitions correctly. (Maybe completed in a followup)

@JoshOrndorff
Copy link
Owner Author

JoshOrndorff commented Jun 10, 2020

I've pushed a few more commits trying to solve this, and adding debugging output.

@bkchr I've added the transaction handling logic to apply_extrinsic as you suggested. While I'm sure you're correct that missing that logic was a problem, it was not the only problem, or even the worst problem. So far I've only been testing with "empty" blocks (blocks with no extrinsics). But for some reason the state root seems to not match.

@gnunicorn I've added lots of debugging output regarding the state root. I'm not sure what else to do. It seems the error, InvalidStateRoot is coming from here https://github.com/paritytech/substrate/blob/34695a85650b58bcd7d7e2a677cafc2921251d68/client/service/src/client/client.rs#L888 Are you familiar with this code?

Here is the output of creating an empty block using manual seal.

./target/release/manual-seal --dev

2020-06-10 10:00:23 Running in --dev mode, RPC CORS has been disabled.
2020-06-10 10:00:23 Manual- and Instant-Seal Node
2020-06-10 10:00:23 ✌️  version 2.0.0-dev-f0a93da-x86_64-linux-gnu
2020-06-10 10:00:23 ❤️  by Seun LanLege:Joshy Orndorff, 2019-2020
2020-06-10 10:00:23 📋 Chain specification: Development
2020-06-10 10:00:23 🏷  Node name: legal-screw-0234
2020-06-10 10:00:23 👤 Role: AUTHORITY
2020-06-10 10:00:23 💾 Database: RocksDb at /home/joshy/.local/share/manual-seal/chains/dev/db
2020-06-10 10:00:23 ⛓  Native runtime: frameless-runtime-1 (frameless-runtime-1.tx1.au1)
2020-06-10 10:00:23 🔨 Initializing Genesis block/state (state: 0xd7af…61c0, header-hash: 0xe445…6ba9)
Entering initialize_block with Header { parent_hash: 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9, number: 1, state_root: 0x0000000000000000000000000000000000000000000000000000000000000000, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
2020-06-10 10:00:23 📦 Highest known block at #0
2020-06-10 10:00:23 Using default protocol ID "sup" because none is configured in the chain specs
2020-06-10 10:00:23 🏷  Local node identity is: 12D3KooWGDUQyGJ9NxcrZzNDDsVeSjqdFTysLmkgAUXZvW2V6gsF (legacy representation: QmdTWSfVdzCurUkMuLsz9XFybMzuYg3ReLBC4XAnLvBgtr)
2020-06-10 10:00:23 〽️ Prometheus server started at 127.0.0.1:9615
2020-06-10 10:00:28 💤 Idle (0 peers), best: #0 (0xe445…6ba9), finalized #0 (0xe445…6ba9), ⬇ 0 ⬆ 0
// Node sits idle here until I send the command to create a block
2020-06-10 10:00:33 💤 Idle (0 peers), best: #0 (0xe445…6ba9), finalized #0 (0xe445…6ba9), ⬇ 0 ⬆ 0
 2020-06-10 10:00:38 💤 Idle (0 peers), best: #0 (0xe445…6ba9), finalized #0 (0xe445…6ba9), ⬇ 0 ⬆ 0
2020-06-10 10:00:42 🙌 Starting consensus session on top of parent 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9
Entering initialize_block with Header { parent_hash: 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9, number: 1, state_root: 0x0000000000000000000000000000000000000000000000000000000000000000, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
Entering initialize_block with Header { parent_hash: 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9, number: 1, state_root: 0x0000000000000000000000000000000000000000000000000000000000000000, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
Entering finalize_block
State root is: 0x06b55a4949862712e20fab4710a7ea4cacb4e0148b7541ed1668d1de013cdd59
2020-06-10 10:00:42 🎁 Prepared block for proposing at 1 [hash: 0x6cf1b9299dc66147886849675ca525117f2dda05f0d5a221b2d746efeacce592; parent_hash: 0xe445…6ba9; extrinsics (0): []]
Entering execute_block with Block { header: Header { parent_hash: 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9, number: 1, state_root: 0x06b55a4949862712e20fab4710a7ea4cacb4e0148b7541ed1668d1de013cdd59, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }, extrinsics: [] }
Entering initialize_block with Header { parent_hash: 0xe445e538dcbba13c082ae63cbfe03f60340fecf1db5036f5fb21dc6e06fa6ba9, number: 1, state_root: 0x06b55a4949862712e20fab4710a7ea4cacb4e0148b7541ed1668d1de013cdd59, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
Entering finalize_block
State root is: 0x645c64aa51c9780606397f5260e42a5a37976251476282bfcd1d606b2e2d893f
2020-06-10 10:00:42 Block prepare storage changes error:
InvalidStateRoot
2020-06-10 10:00:43 💤 Idle (0 peers), best: #0 (0xe445…6ba9), finalized #0 (0xe445…6ba9), ⬇ 0 ⬆ 0

Some questions / mysteries:

  1. Why is initialize_block called so many times?
  2. What is expected order or phase of calling initialize_block, execute_block, finalize_block, apply_extrinsic?
  3. Why are all the sp-api::Core methods listed as "provided" (rustdocs)? Are they actually provided?
  4. Is storing the header into storage during initialize_block and reading it back out during finalize_block correct? If so, should I be clearing it from storage during finalize_block?
  5. Am I initializing genesis storage correctly? Check the block impl BuildStorage for GenesisConfig.

Verify,
}
};
//TODO this is a weird import. Is this correct?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That is normally done by the macro.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do you need it?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need it in order to call Self::finalize_block() from with execute_block.

runtimes/frameless-runtime/src/lib.rs Outdated Show resolved Hide resolved
@JoshOrndorff
Copy link
Owner Author

After the last two commits, the state root now matches. This is good.

But the node still crashes after one block. It seems the block is hashing differently during authoring than it is later on.

Logs from basic-pow node

2020-06-10 16:12:17 🎁 Prepared block for proposing at 1 [hash: 0xe32a7fd458e3d6ae3607ff077b7da2e3682147728c0a9b928e4fef8df62d2b5e; parent_hash: 0x4475…34fa; extrinsics (0): []]
Entering initialize_block with Header { parent_hash: 0x4475b9b321ef83f66f20f38f41bd3a0dd9d36df150129a88577535801dbf34fa, number: 1, state_root: 0x0000000000000000000000000000000000000000000000000000000000000000, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
2020-06-10 16:12:18 Skipping proposal `can_author_with` returned: Failed to get runtime version at `BlockId::Hash(0xd3c0e04610c0621fc3f31b65d361e7a89c870a18ff6db2c2071f8ec1483f5198)` and will disable authoring. Error: UnknownBlock("State already discarded for BlockId::Hash(0xd3c0e04610c0621fc3f31b65d361e7a89c870a18ff6db2c2071f8ec1483f5198)") Probably a node update is required!
2020-06-10 16:12:18 ✨ Imported #1 (0xd3c0…5198)

Logs from manual-seal node

2020-06-10 16:14:35 🎁 Prepared block for proposing at 1 [hash: 0xc310257801cbe2c025bc32bb98ef32f93a7600ca98c6db86e1a4710964128b19; parent_hash: 0xa7de…0212; extrinsics (0): []]
Entering execute_block with Block { header: Header { parent_hash: 0xa7de808060435caee2ca8f3a8efe3fcab8ce8c96762dfdfa75ac7734bd000212, number: 1, state_root: 0xf7cfc5a31e9ae01b59d48ffdb520e10d76ddf17971d8f2d2895b592bd63919b9, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }, extrinsics: [] }
Entering initialize_block with Header { parent_hash: 0xa7de808060435caee2ca8f3a8efe3fcab8ce8c96762dfdfa75ac7734bd000212, number: 1, state_root: 0xf7cfc5a31e9ae01b59d48ffdb520e10d76ddf17971d8f2d2895b592bd63919b9, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
Entering finalize_block
Returning header: Header { parent_hash: 0xa7de808060435caee2ca8f3a8efe3fcab8ce8c96762dfdfa75ac7734bd000212, number: 1, state_root: 0xf7cfc5a31e9ae01b59d48ffdb520e10d76ddf17971d8f2d2895b592bd63919b9, extrinsics_root: 0x0000000000000000000000000000000000000000000000000000000000000000, digest: Digest { logs: [] } }
2020-06-10 16:14:35 Block prepare storage changes error:
InvalidStateRoot

Comment on lines +272 to +284
Ok(ValidTransaction{
priority: 1u64,
requires: Vec::new(),
// This hach was necessary to make the node accept any transactions at all
// When I was setting provides to an empty vec, submiting a transaction failed
// the RPC responded {"code":-32603,"message":"Unknown error occurred","data":"Pool(NoTagsProvided)"}
// Adding this provides tag solved that. Solutions moving forward:
// 1. Require a nonce with each transaction
// 2. Try to relax the TxPool's requirement that every transaction provide some tag
provides: vec![vec![0]],
longevity: TransactionLongevity::max_value(),
propagate: true,
})
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@todr The intention of this simple runtime is that transactions can be applied in absolutely any order. There are no dependencies between transactions and no meaningful tags to associate with the transactions. But https://github.com/paritytech/substrate/blob/75113aae02ee3cc265aaaa13cdf5a79c7ff98105/client/transaction-pool/graph/src/pool.rs#L410-L412 enforces that each transaction provide at least one tag. Is there any harm in removing this requirement?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops... @tomusdrw

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, somehow missed the previous mention.

The dependencies are indicated via requires tag set, provides guarantees de-duplication of the transactions.
Imagine that we allow no provides tag. The same transaction can then be included in the pool and in the block as many times as you would like - so each time you receive the transaction over network from your peers it would be imported to the pool and added to the block, so you would end up with infinite number of copies of the same transaction.

I'd suggest adding provides: vec![hash(tx.encode())] here. This will de-duplicate transactions floating around the network, but note that it doesn't prevent replaying the same transaction in consecutive blocks.
The runtime must include some kind of replay protection and at some point return Stale from validate_transaction for transactions that have already been applied in the network.

I'd suggest adding at least apply_before_block parameter to each extrinsic, so that we can detect stale transactions. Note that it still does not prevent replaying the same transaction over and over again (in the same block or subsequent blocks) when it's valid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A scheme I had in mind to allow both sequential and parallel transactions some time ago looks like this.

  1. Each transaction has (random_salt, nonce).
  2. If you want transactions to be included in parallel you set nonce=0 and choose a random salt.
  3. If you want transactions to be included sequentially you use the same salt and increment the nonce.
  4. Runtime maintains storage entries for:
nonce_expirations: DoubleMap<BlockNumber, Salt, ()>,
next_nonce: Map<Salt, (Nonce, BlockNumber)>,

Salts are valid for fixed X blocks, and every time a salt is used we do something like this:

let expire_at = System::current_block_number() + X;
// Update next expected nonce and pruning block.
let (nonce, block) = next_nonce.get(salt).unwrap_or_else(|| (0, 0));
next_nonce.insert(salt, (nonce + 1, expire_at));
// remove previous expiration notice
nonce_expirations.remove((block, salt));
// insert new expiration
nonce_expirations.insert((expire_at, salt), ());

And then the runtime can use next_nonce to figure out validity of the transaction, and in on_initialize we take all nonce_expirations at that particular block and prune out all the salts (since we use DoubleMap we can get all entries efficiently).

@JoshOrndorff
Copy link
Owner Author

Thanks to Basti's last suggestion we now have block production working. I'm now trying to submit extrinsics. Our extrinsic type is an enum with three variants. According to the scale docs the variants should encode as 0x00, 0x01, and 0x02.

Submitting 0x00 via curl sort of works

$ curl http://localhost:9933 -H "Content-Type:application/json;charset=utf-8" -d   '{
       "jsonrpc":"2.0",
        "id":1,
        "method":"author_submitExtrinsic",
        "params": ["0x00"]
      }'
{"jsonrpc":"2.0","result":"0x03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314","id":1}

But there are two oddities. The first oddity is that submitting 0x01, or 0x02 do not work.

$ curl http://localhost:9933 -H "Content-Type:application/json;charset=utf-8" -d   '{
       "jsonrpc":"2.0",
        "id":1,
        "method":"author_submitExtrinsic",
        "params": ["0x01"]
      }'
{"jsonrpc":"2.0","error":{"code":1001,"message":"Extrinsic has invalid format: Error decoding field OpaqueExtrinsic.0"},"id":1}

The second oddity is that when I submit the apparently working 0x00 extrinsic a second time, the next block contains two copies of it. If I wait several blocks and submit it a third time, the next block has 3 copies of it. I guess maybe this is related to the various copies of the exxtrinsic being indistinguishable?

/// Opaque block header type.
pub type Header = generic::Header<BlockNumber, BlakeTwo256>;
/// Opaque block type.
pub type Block = generic::Block<Header, OpaqueExtrinsic>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoshOrndorff you are using here the OpaqueExtrinsic and that works a little bit different on the node side. AFAIK it has the length prepended, which you not do.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So OpaqueExtrinsic is just an alias for Vec<u8>. And scale docs for Vec<_> do indeed say the encoding is "prefixed with a compact encoding of the number of items, followed by each item's encoding concatenated in turn".

I will be encoding a single byte which means I should prepend with the compact encoding for the number one which is 0x04 according to the same scale docs (in addition to doing the encoding myself, I notices that the number 1 is an example given.

Trying to encode my transaction this way I get the following output

$ curl http://localhost:9933 -H "Content-Type:application/json;charset=utf-8" -d   '{
     "jsonrpc":"2.0",
      "id":1,
      "method":"author_submitExtrinsic",
      "params": ["0x0400"]
    }'
{"jsonrpc":"2.0","error":{"code":1002,"message":"Verification Error: Execution(ApiError(\"Could not convert parameter `tx` between node and runtime: No such variant in enum FramelessTransaction\"))","data":"RuntimeApi(\"Execution(ApiError(\\\"Could not convert parameter `tx` between node and runtime: No such variant in enum FramelessTransaction\\\"))\")"},"id":1}

Although the error message is long, I think this is actually a step in the right direction. It is now clearly trying to decode into FramelessTransaction which is exactly what I want. So the remaining mysteries are:

  1. Now that I know to prepend the length, how do I actually encode the FramelessTransaction variant?
  2. Why did submitting 0x00 ever even partially work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Length of 1 prepended and 1 byte added for your enum variant of the tx.

  2. It is an empty transaction which probably failed in the runtime than.

Copy link
Owner Author

@JoshOrndorff JoshOrndorff Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Length of 1 prepended and 1 byte added for your enum variant of the tx.

I believe that's what I've done. Length of 1 encodes as 0x04 (citation). My enum variant is 0x00.

That transaction fails with the error Verification Error: Execution(ApiError(\"Could not convert parameter tx between node and runtime: No such variant in enum FramelessTransaction\"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you had 0x0400 send to the node? (Not sure if that hex is in correct order)

Copy link
Owner Author

@JoshOrndorff JoshOrndorff Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, submitting 0x0400 is the example quoted above.

I've also tried submitting 0x0004 (which I think is backwards) and it behaves the same as 0x00. This makes sense because the length prefix is 0 so the next byte is ignored.

Base automatically changed from develop to master August 13, 2020 21:32
@JoshOrndorff
Copy link
Owner Author

I'm continuing this work in https://github.com/substrate-developer-hub/substrate-node-template/pull/347. It looks like it's actually processing transactions now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants